Re: [RFC PATCH 9/9] powerpc: rewrite local_t using soft_irq

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:22 +0530
Madhavan Srinivasan  wrote:

> https://lkml.org/lkml/2008/12/16/450
> 
> Modifications to Rusty's benchmark code:
>  - Executed only local_t test
> 
> Here are the values with the patch.
> 
> Time in ns per iteration
> 
> Local_t   Without Patch   With Patch
> 
> _inc  28  8
> _add  28  8
> _read 3   3
> _add_return   28  7
> 
> Tested the patch in a
>  - pSeries LPAR (with perf record)

Very nice. I'd like to see these patches get in. We can
probably use the feature in other places too.

Thanks,
Nick
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC PATCH 8/9] powerpc: Support to replay PMIs

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:21 +0530
Madhavan Srinivasan  wrote:

> Code to replay the Performance Monitoring Interrupts (PMI).
> In the masked_interrupt handler, for PMIs we reset MSR[EE]
> and return. This is due to the fact that PMIs are level triggered.
> In __check_irq_replay(), we enable MSR[EE], which will
> fire the interrupt for us.
> 
> The patch also adds a new arch_local_irq_disable_var() variant. The new
> variant takes an input value to write to paca->soft_enabled.
> This will be used in a following patch to implement the tri-state
> value for soft_enabled.

Same comment also applies about patches being standalone
transformations that work before and after. Some of these
can be squashed together I think.


> Signed-off-by: Madhavan Srinivasan 
> ---
>  arch/powerpc/include/asm/hw_irq.h | 14 ++++++++++++++
>  arch/powerpc/kernel/irq.c         |  9 ++++++++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index cc69dde6eb84..863179654452 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -81,6 +81,20 @@ static inline unsigned long arch_local_irq_disable(void)
> 	return flags;
>  }
>  
> +static inline unsigned long arch_local_irq_disable_var(int value)
> +{
> + unsigned long flags, zero;
> +
> + asm volatile(
> + "li %1,%3; lbz %0,%2(13); stb %1,%2(13)"
> +	: "=r" (flags), "=&r" (zero)
> +	: "i" (offsetof(struct paca_struct, soft_enabled)),
> +	  "i" (value)
> + : "memory");
> +
> + return flags;
> +}

arch_ function suggests it is arch implementation of a generic
kernel function or something. I think our soft interrupt levels
are just used in powerpc specific code.

The name could also be a little more descriptive.

I would have our internal function be something like

soft_irq_set_level(), and then the arch disable just sets to
the appropriate level as it does today.

The PMU disable level could be implemented in powerpc specific
header with local_irq_and_pmu_disable() or something like that.

Thanks,
Nick

Re: [RFC PATCH 7/9] powerpc: Add support to mask perf interrupts

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:20 +0530
Madhavan Srinivasan  wrote:

> To support masking of the PMI interrupts, a couple of new interrupt
> handler macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and
> MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the
> SOFTEN_TEST and implement the support in both host and guest kernels.
> 
> A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*", are
> added for use in the exception code to check for PMI interrupts.
> 
> The __SOFTEN_TEST macro is modified to support the PMI interrupt.
> The present __SOFTEN_TEST code loads soft_enabled from the paca and checks
> it to decide whether to call the masked_interrupt handler. To support both
> the current behaviour and PMI masking, these changes are added:
> 
> 1) The current LR register content is saved in R11
> 2) The "bge" branch operation is changed to "bgel"
> 3) R11 is restored to LR
> 
> Reason:
> 
> To retain PMI-as-NMI behaviour for a flag state of 1, we save the LR
> register value in R11 and branch to the "masked_interrupt" handler with
> LR updated. In the "masked_interrupt" handler, we check for the
> "SOFTEN_VALUE_*" value in R10 and branch back with "blr" if it is a PMI.
> 
> To mask PMI for a flag value > 1, masked_interrupt avoids the above
> check, continues to execute the masked_interrupt code, disables
> MSR[EE], and updates irq_happened with the PMI info.
> 
> Finally, the saving of R11 is moved before calling SOFTEN_TEST in the
> __EXCEPTION_PROLOG_1 macro to support saving of LR values in
> SOFTEN_TEST.
> 
> Signed-off-by: Madhavan Srinivasan 
> ---
>  arch/powerpc/include/asm/exception-64s.h | 22 ++++++++++++++++++++--
>  arch/powerpc/include/asm/hw_irq.h        |  1 +
>  arch/powerpc/kernel/exceptions-64s.S     | 27 ++++++++++++++++++++++++---
>  3 files changed, 45 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 44d3f539d8a5..c951b7ab5108 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> 	OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);	\
> 	SAVE_CTR(r10, area);					\
> 	mfcr	r9;						\
> -	extra(vec);						\
> 	std	r11,area+EX_R11(r13);				\
> +	extra(vec);						\
> 	std	r12,area+EX_R12(r13);				\
> 	GET_SCRATCH0(r10);					\
> 	std	r10,area+EX_R13(r13)
> @@ -403,12 +403,17 @@ label##_relon_hv:				\
>  #define SOFTEN_VALUE_0xe82	PACA_IRQ_DBELL
>  #define SOFTEN_VALUE_0xe60	PACA_IRQ_HMI
>  #define SOFTEN_VALUE_0xe62	PACA_IRQ_HMI
> +#define SOFTEN_VALUE_0xf01	PACA_IRQ_PMI
> +#define SOFTEN_VALUE_0xf00	PACA_IRQ_PMI

>  #define __SOFTEN_TEST(h, vec)				\
> 	lbz	r10,PACASOFTIRQEN(r13);			\
> 	cmpwi	r10,LAZY_INTERRUPT_DISABLED;		\
> 	li	r10,SOFTEN_VALUE_##vec;			\
> -	bge	masked_##h##interrupt

At which point, can't we pass in the interrupt level we want to mask
for to SOFTEN_TEST, and avoid all these extra code changes?


PMU masked interrupt will compare with SOFTEN_LEVEL_PMU, existing
interrupts will compare with SOFTEN_LEVEL_EE (or whatever suitable
names there are).


> +	mflr	r11;					\
> +	bgel	masked_##h##interrupt;			\
> +	mtlr	r11;

This might corrupt return prediction when masked_interrupt does not
return. I guess that's an uncommon case though. But I think we can
avoid this if we do the above, no?

Thanks,
Nick

Re: [RFC PATCH 6/9] powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:19 +0530
Madhavan Srinivasan  wrote:

> Foundation patch to support checking of new flag for
> "paca->soft_enabled". Modify the condition checking for the
> "soft_enabled" from "equal" to "greater than or equal to".

Rather than a "tri-state" and the mystery "2" state, can you
make a #define for that guy, and use levels.

0-> all enabled
1-> "linux" interrupts disabled
2-> PMU also disabled
etc.

Thanks,
Nick

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Russell Currey
On Tue, 2016-07-26 at 11:45 +1000, Michael Ellerman wrote:
> Quoting Russell Currey (2016-07-22 15:23:36)
> > 
> > On EEH events the kernel will print a dump of relevant registers.
> > If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
> > doesn't have EEH support, etc) this information isn't readily available.
> > 
> > Add a new debugfs handler to trigger a PHB register dump, so that this
> > information can be made available on demand.
> 
> This is a bit weird.
> 
> It's a debugfs file, but when you read from it you get nothing (I think,
> you have no read() defined).
> 
> When you write to it, regardless of what you write, the kernel spits
> some stuff out to dmesg and throws away whatever you wrote.
> 
> Ideally pnv_pci_dump_phb_diag_data() would write its output to a buffer,
> which we could then either send to dmesg, or give to debugfs. But that
> might be more work than we want to do for this.
> 
> If we just want a trigger file, then I think it'd be preferable to just
> use a simple attribute, with a set and no show, eg. something like:
> 
> static int foo_set(void *data, u64 val)
> {
> if (val != 1)
> return -EINVAL;
> 
> ...
> 
> return 0;
> }
> 
> DEFINE_SIMPLE_ATTRIBUTE(fops_foo, NULL, foo_set, "%llu\n");
> 
> That requires that you write "1" to the file to trigger the reg dump.

I don't think I can use this here.  The diag dump is triggered on the given PHB
(these are in /sys/kernel/debug/powerpc/PCI), and that PHB is retrieved from
the file handler.  It looks like I have no access to the file struct if using a
simple getter/setter.

> 
> 
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> > b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 891fc4a..ada2f3c 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
> > if (!phb->dbgfs)
> > pr_warning("%s: Error on creating debugfs on
> > PHB#%x\n",
> > __func__, hose->global_number);
> > +
> > +   debugfs_create_file("regdump", 0200, phb->dbgfs, hose,
> > +   _pci_debug_ops);
> > }
> 
> You shouldn't be trying to create the file if the directory create failed. So
> the check for (!phb->dbgfs) should probably print and then continue.

Good catch.

> 
> And a better name would be "dump-regs", because it indicates that the file
> does
> something, rather than is something.

That is indeed better.

> 
> cheers


Re: [RFC PATCH 5/9] powerpc: reverse the soft_enable logic

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:18 +0530
Madhavan Srinivasan  wrote:

> "paca->soft_enabled" is used as a flag to mask some of the interrupts.
> The currently supported flag values and their details:
> 
> soft_enabled  MSR[EE]
> 
> 0 0   Disabled (PMI and HMI not masked)
> 1 1   Enabled
> 
> "paca->soft_enabled" is initialized to 1 to mark the interrupts as
> enabled. arch_local_irq_disable() will toggle the value when
> interrupts need to be disabled. At this point, the interrupts are not
> actually disabled; instead, the interrupt vector has code to check for
> the flag and mask the interrupt when it occurs. By "mask it", we mean
> it updates paca->irq_happened and returns. arch_local_irq_restore() is
> called to re-enable interrupts, which checks and replays interrupts
> if any occurred.
> 
> Now, as mentioned, the current logic does not mask "performance
> monitoring interrupts", and PMIs are implemented as NMIs. But this
> patchset depends on local_irq_* for a successful local_* update.
> Meaning, mask all possible interrupts during the local_* update and
> replay them after the update.
> 
> So the idea here is to reverse the "paca->soft_enabled" logic. New
> values and details:
> 
> soft_enabled  MSR[EE]
> 
> 1 0   Disabled  (PMI and HMI not masked)
> 0 1   Enabled
> 
> The reason for this change is to create a foundation for a third flag
> value, "2", for "soft_enabled", to add support for masking PMIs. When
> arch_irq_disable_* is called with a value of "2", PMI interrupts are
> masked. But when called with a value of "1", PMIs are not masked.
> 
> With the new flag value for "soft_enabled", the states look like:
> 
> soft_enabled  MSR[EE]
> 
> 2 0   PMIs also disabled
> 1 0   Disabled  (PMI and HMI not masked)
> 0 1   Enabled
> 
> And the interrupt handler code has been modified to check for a
> "greater than or equal to 1" condition instead.

This bit of the patch seems to have been moved into other part
of the series. Ideally (unless there is a good reason), it is nice
to have each individual patch result in a working kernel before
and after.

Nice way to avoid adding more branches though.

Thanks,
Nick

[PATCH 2/2] powerpc/64: Do load of PACAKBASE in LOAD_HANDLER

2016-07-25 Thread Michael Ellerman
The LOAD_HANDLER macro requires that you have previously loaded "reg"
with PACAKBASE. Although that gives callers flexibility to get PACAKBASE
in some interesting way, none of the callers actually do that. So fold
the load of PACAKBASE into the macro, making it simpler for callers to
use correctly.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64s.h |  3 +--
 arch/powerpc/kernel/exceptions-64s.S     | 10 ----------
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 4ff3e2f16b5d..887867ac4bfa 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -52,7 +52,6 @@
 
 #ifdef CONFIG_RELOCATABLE
 #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)   \
-   ld  r12,PACAKBASE(r13); /* get high part of  */   \
mfspr   r11,SPRN_##h##SRR0; /* save SRR0 */ \
LOAD_HANDLER(r12,label);\
mtctr   r12;\
@@ -90,6 +89,7 @@
  * that kernelbase be 64K aligned.
  */
 #define LOAD_HANDLER(reg, label)   \
+   ld  reg,PACAKBASE(r13); /* get high part of  */   \
ori reg,reg,(label)-_stext; /* virt addr of handler ... */
 
 /* Exception register prefixes */
@@ -175,7 +175,6 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
__EXCEPTION_PROLOG_1(area, extra, vec)
 
 #define __EXCEPTION_PROLOG_PSERIES_1(label, h) \
-   ld  r12,PACAKBASE(r13); /* get high part of  */   \
ld  r10,PACAKMSR(r13);  /* get MSR value for kernel */  \
mfspr   r11,SPRN_##h##SRR0; /* save SRR0 */ \
LOAD_HANDLER(r12,label) \
diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 8bcc1b457115..af30f26c35d8 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -41,7 +41,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
\
 
 #define SYSCALL_PSERIES_2_RFID \
mfspr   r12,SPRN_SRR1 ; \
-   ld  r10,PACAKBASE(r13) ;\
LOAD_HANDLER(r10, system_call_entry) ;  \
mtspr   SPRN_SRR0,r10 ; \
ld  r10,PACAKMSR(r13) ; \
@@ -64,7 +63,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
\
 */
 #define SYSCALL_PSERIES_2_DIRECT   \
mflrr10 ;   \
-   ld  r12,PACAKBASE(r13) ;\
LOAD_HANDLER(r12, system_call_entry) ;  \
mtctr   r12 ;   \
mfspr   r12,SPRN_SRR1 ; \
@@ -219,7 +217,6 @@ data_access_slb_pSeries:
 * the kernel ends up being put.
 */
mfctr   r11
-   ld  r10,PACAKBASE(r13)
LOAD_HANDLER(r10, slb_miss_realmode)
mtctr   r10
bctr
@@ -240,7 +237,6 @@ instruction_access_slb_pSeries:
b   slb_miss_realmode
 #else
mfctr   r11
-   ld  r10,PACAKBASE(r13)
LOAD_HANDLER(r10, slb_miss_realmode)
mtctr   r10
bctr
@@ -486,7 +482,6 @@ BEGIN_FTR_SECTION
mfmsr   r11 /* get MSR value */
ori r11,r11,MSR_ME  /* turn on ME bit */
ori r11,r11,MSR_RI  /* turn on RI bit */
-   ld  r12,PACAKBASE(r13)  /* get high part of  */
LOAD_HANDLER(r12, machine_check_handle_early)
 1: mtspr   SPRN_SRR0,r12
mtspr   SPRN_SRR1,r11
@@ -499,7 +494,6 @@ BEGIN_FTR_SECTION
 */
addir1,r1,INT_FRAME_SIZE/* go back to previous stack frame */
ld  r11,PACAKMSR(r13)
-   ld  r12,PACAKBASE(r13)
LOAD_HANDLER(r12, unrecover_mce)
li  r10,MSR_ME
andcr11,r11,r10 /* Turn off MSR_ME */
@@ -802,7 +796,6 @@ data_access_slb_relon_pSeries:
 * the kernel ends up being put.
 */
mfctr   r11
-   ld  r10,PACAKBASE(r13)
LOAD_HANDLER(r10, slb_miss_realmode)
mtctr   r10
bctr
@@ -822,7 +815,6 @@ instruction_access_slb_relon_pSeries:
b   slb_miss_realmode
 #else
mfctr   r11
-   ld  r10,PACAKBASE(r13)
LOAD_HANDLER(r10, slb_miss_realmode)
mtctr   r10
bctr
@@ -1321,7 +1313,6 @@ machine_check_handle_early:
andi.   r11,r12,MSR_RI
bne 2f
 1: mfspr   r11,SPRN_SRR0
-   ld  r10,PACAKBASE(r13)

[PATCH 1/2] powerpc/64: Correct comment on LOAD_HANDLER()

2016-07-25 Thread Michael Ellerman
The comment for LOAD_HANDLER() was wrong. The part about kdump has not
been true since 1f6a93e4c35e ("powerpc: Make it possible to move the
interrupt handlers away from the kernel").

Describe how it currently works, and combine the two separate comments
into one.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/exception-64s.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 93ae809fe5ea..4ff3e2f16b5d 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -84,12 +84,12 @@
 
 /*
  * We're short on space and time in the exception prolog, so we can't
- * use the normal SET_REG_IMMEDIATE macro. Normally we just need the
- * low halfword of the address, but for Kdump we need the whole low
- * word.
+ * use the normal LOAD_REG_IMMEDIATE macro to load the address of label.
+ * Instead we get the base of the kernel from paca->kernelbase and or in the low
+ * part of label. This requires that the label be within 64KB of kernelbase, and
+ * that kernelbase be 64K aligned.
  */
 #define LOAD_HANDLER(reg, label)   \
-   /* Handlers must be within 64K of kbase, which must be 64k aligned */ \
ori reg,reg,(label)-_stext; /* virt addr of handler ... */
 
 /* Exception register prefixes */
-- 
2.7.4


Re: [RFC PATCH 1/9] Add #defs for paca->soft_enabled flags

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 20:22:14 +0530
Madhavan Srinivasan  wrote:

> Two #defines, LAZY_INTERRUPT_ENABLED and
> LAZY_INTERRUPT_DISABLED, are added to be used
> when updating paca->soft_enabled.

This is a very nice patchset, but can this not be a new name?
We use "soft enabled/disabled" everywhere for it. I think lazy
is an implementation detail anyway because some interrupts don't
cause a hard disable at all.

Thanks,
Nick

Re: [PATCH v3 02/11] mm: Hardened usercopy

2016-07-25 Thread Kees Cook
On Mon, Jul 25, 2016 at 7:03 PM, Michael Ellerman  wrote:
> Josh Poimboeuf  writes:
>
>> On Thu, Jul 21, 2016 at 11:34:25AM -0700, Kees Cook wrote:
>>> On Wed, Jul 20, 2016 at 11:52 PM, Michael Ellerman  
>>> wrote:
>>> > Kees Cook  writes:
>>> >
>>> >> diff --git a/mm/usercopy.c b/mm/usercopy.c
>>> >> new file mode 100644
>>> >> index ..e4bf4e7ccdf6
>>> >> --- /dev/null
>>> >> +++ b/mm/usercopy.c
>>> >> @@ -0,0 +1,234 @@
>>> > ...
>>> >> +
>>> >> +/*
>>> >> + * Checks if a given pointer and length is contained by the current
>>> >> + * stack frame (if possible).
>>> >> + *
>>> >> + *   0: not at all on the stack
>>> >> + *   1: fully within a valid stack frame
>>> >> + *   2: fully on the stack (when can't do frame-checking)
>>> >> + *   -1: error condition (invalid stack position or bad stack frame)
>>> >> + */
>>> >> +static noinline int check_stack_object(const void *obj, unsigned long len)
>>> >> +{
>>> >> + const void * const stack = task_stack_page(current);
>>> >> + const void * const stackend = stack + THREAD_SIZE;
>>> >
>>> > That allows access to the entire stack, including the struct thread_info,
>>> > is that what we want - it seems dangerous? Or did I miss a check
>>> > somewhere else?
>>>
>>> That seems like a nice improvement to make, yeah.
>>>
>>> > We have end_of_stack() which computes the end of the stack taking
>>> > thread_info into account (end being the opposite of your end above).
>>>
>>> Amusingly, the object_is_on_stack() check in sched.h doesn't take
>>> thread_info into account either. :P Regardless, I think using
>>> end_of_stack() may not be best. To tighten the check, I think we could
>>> add this after checking that the object is on the stack:
>>>
>>> #ifdef CONFIG_STACK_GROWSUP
>>> stackend -= sizeof(struct thread_info);
>>> #else
>>> stack += sizeof(struct thread_info);
>>> #endif
>>>
>>> e.g. then if the pointer was in the thread_info, the second test would
>>> fail, triggering the protection.
>>
>> FWIW, this won't work right on x86 after Andy's
>> CONFIG_THREAD_INFO_IN_TASK patches get merged.
>
> Yeah. I wonder if it's better for the arch helper to just take the obj and len,
> and work out its own bounds for the stack using current and whatever makes
> sense on that arch.
>
> It would avoid too much ifdefery in the generic code, and also avoid any
> confusion about whether stackend is the high or low address.
>
> eg. on powerpc we could do:
>
> int noinline arch_within_stack_frames(const void *obj, unsigned long len)
> {
> void *stack_low  = end_of_stack(current);
> void *stack_high = task_stack_page(current) + THREAD_SIZE;
>
>
> Whereas arches with STACK_GROWSUP=y could do roughly the reverse, and x86 can 
> do
> whatever it needs to depending on whether the thread_info is on or off stack.
>
> cheers

Yeah, I agree: this should be in the arch code. If the arch can
actually do frame checking, the thread_info (if it exists on the
stack) would already be excluded. But it'd be a nice tightening of the
check.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

[PATCH] powerpc: sgy_cts1000: Fix gpio_halt_cb()'s signature

2016-07-25 Thread Andrey Smirnov
Halt callback in struct machdep_calls is declared with the __noreturn
attribute, so omitting that attribute in gpio_halt_cb()'s signature
results in a compilation error.

Change the signature to address the problem, and change the code of
the function to avoid ever returning.

Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/platforms/85xx/sgy_cts1000.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/sgy_cts1000.c 
b/arch/powerpc/platforms/85xx/sgy_cts1000.c
index 79fd0df..21d6aaa 100644
--- a/arch/powerpc/platforms/85xx/sgy_cts1000.c
+++ b/arch/powerpc/platforms/85xx/sgy_cts1000.c
@@ -38,18 +38,18 @@ static void gpio_halt_wfn(struct work_struct *work)
 }
 static DECLARE_WORK(gpio_halt_wq, gpio_halt_wfn);
 
-static void gpio_halt_cb(void)
+static void __noreturn gpio_halt_cb(void)
 {
enum of_gpio_flags flags;
int trigger, gpio;
 
if (!halt_node)
-   return;
+   panic("No reset GPIO information was provided in DT\n");
 
gpio = of_get_gpio_flags(halt_node, 0, );
 
if (!gpio_is_valid(gpio))
-   return;
+   panic("Provided GPIO is invalid\n");
 
trigger = (flags == OF_GPIO_ACTIVE_LOW);
 
@@ -57,6 +57,8 @@ static void gpio_halt_cb(void)
 
/* Probably wont return */
gpio_set_value(gpio, trigger);
+
+   panic("Halt failed\n");
 }
 
 /* This IRQ means someone pressed the power button and it is waiting for us
-- 
2.5.5


[PATCH 2/2] powerpc: e8248e: Select PHYLIB only if NETDEVICES is enabled

2016-07-25 Thread Andrey Smirnov
Select PHYLIB only if NETDEVICES is enabled, and MDIO_BITBANG only if
PHYLIB is present, to avoid warnings from Kconfig.

To prevent undefined references during linking, register the MDIO
driver only if CONFIG_MDIO_BITBANG is enabled.

Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/platforms/82xx/Kconfig   | 4 ++--
 arch/powerpc/platforms/82xx/ep8248e.c | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/82xx/Kconfig 
b/arch/powerpc/platforms/82xx/Kconfig
index 7c7df400..994d1a9 100644
--- a/arch/powerpc/platforms/82xx/Kconfig
+++ b/arch/powerpc/platforms/82xx/Kconfig
@@ -30,8 +30,8 @@ config EP8248E
select 8272
select 8260
select FSL_SOC
-   select PHYLIB
-   select MDIO_BITBANG
+   select PHYLIB if NETDEVICES
+   select MDIO_BITBANG if PHYLIB
help
  This enables support for the Embedded Planet EP8248E board.
 
diff --git a/arch/powerpc/platforms/82xx/ep8248e.c 
b/arch/powerpc/platforms/82xx/ep8248e.c
index cdab847..8fec050 100644
--- a/arch/powerpc/platforms/82xx/ep8248e.c
+++ b/arch/powerpc/platforms/82xx/ep8248e.c
@@ -298,7 +298,9 @@ static const struct of_device_id of_bus_ids[] __initconst = 
{
 static int __init declare_of_platform_devices(void)
 {
of_platform_bus_probe(NULL, of_bus_ids, NULL);
-   platform_driver_register(_mdio_driver);
+
+   if (IS_ENABLED(CONFIG_MDIO_BITBANG))
+   platform_driver_register(_mdio_driver);
 
return 0;
 }
-- 
2.5.5


[PATCH 1/2] powerpc: mpc85xx_mds: Select PHYLIB only if NETDEVICES is enabled

2016-07-25 Thread Andrey Smirnov
PHYLIB depends on NETDEVICES, so to avoid an unmet-dependencies warning
from Kconfig it needs to be selected conditionally.

Also add checks that PHYLIB is built-in to avoid undefined references
to PHYLIB's symbols.

Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/platforms/85xx/Kconfig   | 2 +-
 arch/powerpc/platforms/85xx/mpc85xx_mds.c | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/Kconfig 
b/arch/powerpc/platforms/85xx/Kconfig
index e626461..3da35bc 100644
--- a/arch/powerpc/platforms/85xx/Kconfig
+++ b/arch/powerpc/platforms/85xx/Kconfig
@@ -72,7 +72,7 @@ config MPC85xx_CDS
 config MPC85xx_MDS
bool "Freescale MPC85xx MDS"
select DEFAULT_UIMAGE
-   select PHYLIB
+   select PHYLIB if NETDEVICES
select HAS_RAPIDIO
select SWIOTLB
help
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c 
b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index dbcb467..71aff5e 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -63,6 +63,8 @@
 #define DBG(fmt...)
 #endif
 
+#if IS_BUILTIN(CONFIG_PHYLIB)
+
 #define MV88E_SCR  0x10
 #define MV88E_SCR_125CLK   0x0010
 static int mpc8568_fixup_125_clock(struct phy_device *phydev)
@@ -152,6 +154,8 @@ static int mpc8568_mds_phy_fixups(struct phy_device *phydev)
return err;
 }
 
+#endif
+
 /* 
  *
  * Setup the architecture
@@ -313,6 +317,7 @@ static void __init mpc85xx_mds_setup_arch(void)
swiotlb_detect_4g();
 }
 
+#if IS_BUILTIN(CONFIG_PHYLIB)
 
 static int __init board_fixups(void)
 {
@@ -342,9 +347,12 @@ static int __init board_fixups(void)
 
return 0;
 }
+
 machine_arch_initcall(mpc8568_mds, board_fixups);
 machine_arch_initcall(mpc8569_mds, board_fixups);
 
+#endif
+
 static int __init mpc85xx_publish_devices(void)
 {
if (machine_is(mpc8568_mds))
@@ -435,4 +443,3 @@ define_machine(p1021_mds) {
.pcibios_fixup_phb  = fsl_pcibios_fixup_phb,
 #endif
 };
-
-- 
2.5.5


[PATCH 3/3] powerpc: Convert fsl_rstcr_restart to a reset handler

2016-07-25 Thread Andrey Smirnov
Convert fsl_rstcr_restart into a function to be registered with the
register_restart_handler() API and introduce fsl_rstcr_restart_register(),
a function that can be added as an initcall to do the aforementioned
registration.
Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/platforms/85xx/bsc913x_qds.c |  2 +-
 arch/powerpc/platforms/85xx/bsc913x_rdb.c |  2 +-
 arch/powerpc/platforms/85xx/c293pcie.c|  2 +-
 arch/powerpc/platforms/85xx/corenet_generic.c |  2 +-
 arch/powerpc/platforms/85xx/ge_imp3a.c|  2 +-
 arch/powerpc/platforms/85xx/mpc8536_ds.c  |  2 +-
 arch/powerpc/platforms/85xx/mpc85xx_ads.c |  2 +-
 arch/powerpc/platforms/85xx/mpc85xx_cds.c | 26 +++---
 arch/powerpc/platforms/85xx/mpc85xx_ds.c  |  7 ---
 arch/powerpc/platforms/85xx/mpc85xx_mds.c |  7 ---
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 21 +++--
 arch/powerpc/platforms/85xx/mvme2500.c|  2 +-
 arch/powerpc/platforms/85xx/p1010rdb.c|  2 +-
 arch/powerpc/platforms/85xx/p1022_ds.c|  2 +-
 arch/powerpc/platforms/85xx/p1022_rdk.c   |  3 ++-
 arch/powerpc/platforms/85xx/p1023_rdb.c   |  2 +-
 arch/powerpc/platforms/85xx/ppa8548.c |  2 +-
 arch/powerpc/platforms/85xx/qemu_e500.c   |  2 +-
 arch/powerpc/platforms/85xx/sbc8548.c |  2 +-
 arch/powerpc/platforms/85xx/socrates.c|  2 +-
 arch/powerpc/platforms/85xx/stx_gp3.c |  2 +-
 arch/powerpc/platforms/85xx/tqm85xx.c |  2 +-
 arch/powerpc/platforms/85xx/twr_p102x.c   |  2 +-
 arch/powerpc/platforms/85xx/xes_mpc85xx.c |  7 ---
 arch/powerpc/platforms/86xx/gef_ppc9a.c   |  2 +-
 arch/powerpc/platforms/86xx/gef_sbc310.c  |  2 +-
 arch/powerpc/platforms/86xx/gef_sbc610.c  |  2 +-
 arch/powerpc/platforms/86xx/mpc8610_hpcd.c|  2 +-
 arch/powerpc/platforms/86xx/mpc86xx_hpcn.c|  2 +-
 arch/powerpc/platforms/86xx/sbc8641d.c|  2 +-
 arch/powerpc/sysdev/fsl_soc.c | 22 +-
 arch/powerpc/sysdev/fsl_soc.h |  2 +-
 32 files changed, 86 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/bsc913x_qds.c 
b/arch/powerpc/platforms/85xx/bsc913x_qds.c
index 07dd6ae..14ea7a0 100644
--- a/arch/powerpc/platforms/85xx/bsc913x_qds.c
+++ b/arch/powerpc/platforms/85xx/bsc913x_qds.c
@@ -53,6 +53,7 @@ static void __init bsc913x_qds_setup_arch(void)
 }
 
 machine_arch_initcall(bsc9132_qds, mpc85xx_common_publish_devices);
+machine_arch_initcall(bsc9132_qds, fsl_rstcr_restart_register);
 
 /*
  * Called very early, device-tree isn't unflattened
@@ -72,7 +73,6 @@ define_machine(bsc9132_qds) {
.pcibios_fixup_bus  = fsl_pcibios_fixup_bus,
 #endif
.get_irq= mpic_get_irq,
-   .restart= fsl_rstcr_restart,
.calibrate_decr = generic_calibrate_decr,
.progress   = udbg_progress,
 };
diff --git a/arch/powerpc/platforms/85xx/bsc913x_rdb.c 
b/arch/powerpc/platforms/85xx/bsc913x_rdb.c
index e48f671..cd4e717 100644
--- a/arch/powerpc/platforms/85xx/bsc913x_rdb.c
+++ b/arch/powerpc/platforms/85xx/bsc913x_rdb.c
@@ -43,6 +43,7 @@ static void __init bsc913x_rdb_setup_arch(void)
 }
 
 machine_device_initcall(bsc9131_rdb, mpc85xx_common_publish_devices);
+machine_arch_initcall(bsc9131_rdb, fsl_rstcr_restart_register);
 
 /*
  * Called very early, device-tree isn't unflattened
@@ -59,7 +60,6 @@ define_machine(bsc9131_rdb) {
.setup_arch = bsc913x_rdb_setup_arch,
.init_IRQ   = bsc913x_rdb_pic_init,
.get_irq= mpic_get_irq,
-   .restart= fsl_rstcr_restart,
.calibrate_decr = generic_calibrate_decr,
.progress   = udbg_progress,
 };
diff --git a/arch/powerpc/platforms/85xx/c293pcie.c 
b/arch/powerpc/platforms/85xx/c293pcie.c
index 3b9e3f0..fbd63f9 100644
--- a/arch/powerpc/platforms/85xx/c293pcie.c
+++ b/arch/powerpc/platforms/85xx/c293pcie.c
@@ -48,6 +48,7 @@ static void __init c293_pcie_setup_arch(void)
 }
 
 machine_arch_initcall(c293_pcie, mpc85xx_common_publish_devices);
+machine_arch_initcall(c293_pcie, fsl_rstcr_restart_register);
 
 /*
  * Called very early, device-tree isn't unflattened
@@ -65,7 +66,6 @@ define_machine(c293_pcie) {
.setup_arch = c293_pcie_setup_arch,
.init_IRQ   = c293_pcie_pic_init,
.get_irq= mpic_get_irq,
-   .restart= fsl_rstcr_restart,
.calibrate_decr = generic_calibrate_decr,
.progress   = udbg_progress,
 };
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index 3a6a84f..297379b 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -225,7 +225,6 @@ define_machine(corenet_generic) {
 

[PATCH 2/3] powerpc: Call chained reset handlers during reset

2016-07-25 Thread Andrey Smirnov
Call out to all restart handlers that were added via
register_restart_handler() API when restarting the machine.

Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/kernel/setup-common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 5cd3283..205d073 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -145,6 +145,10 @@ void machine_restart(char *cmd)
ppc_md.restart(cmd);
 
smp_send_stop();
+
+   do_kernel_restart(cmd);
+   mdelay(1000);
+
machine_hang();
 }
 
-- 
2.5.5


[PATCH 1/3] powerpc: Factor out common code in setup-common.c

2016-07-25 Thread Andrey Smirnov
Factor out a small bit of common code in machine_restart(),
machine_power_off() and machine_halt().

Signed-off-by: Andrey Smirnov 
---
 arch/powerpc/kernel/setup-common.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 714b4ba..5cd3283 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -130,15 +130,22 @@ void machine_shutdown(void)
ppc_md.machine_shutdown();
 }
 
+static void machine_hang(void)
+{
+   pr_emerg("System Halted, OK to turn off power\n");
+   local_irq_disable();
+   while (1)
+   ;
+}
+
 void machine_restart(char *cmd)
 {
machine_shutdown();
if (ppc_md.restart)
ppc_md.restart(cmd);
+
smp_send_stop();
-   printk(KERN_EMERG "System Halted, OK to turn off power\n");
-   local_irq_disable();
-   while (1) ;
+   machine_hang();
 }
 
 void machine_power_off(void)
@@ -146,10 +153,9 @@ void machine_power_off(void)
machine_shutdown();
if (pm_power_off)
pm_power_off();
+
smp_send_stop();
-   printk(KERN_EMERG "System Halted, OK to turn off power\n");
-   local_irq_disable();
-   while (1) ;
+   machine_hang();
 }
 /* Used by the G5 thermal driver */
 EXPORT_SYMBOL_GPL(machine_power_off);
@@ -162,10 +168,9 @@ void machine_halt(void)
machine_shutdown();
if (ppc_md.halt)
ppc_md.halt();
+
smp_send_stop();
-   printk(KERN_EMERG "System Halted, OK to turn off power\n");
-   local_irq_disable();
-   while (1) ;
+   machine_hang();
 }
 
 
-- 
2.5.5


Re: [PATCH V2 1/2] tty/hvc: Use IRQF_SHARED for OPAL hvc consoles

2016-07-25 Thread Greg KH
On Tue, Jul 26, 2016 at 02:11:11PM +1000, Michael Ellerman wrote:
> Quoting Michael Ellerman (2016-07-11 16:29:20)
> > Samuel Mendoza-Jonas  writes:
> > 
> > > Commit 2def86a7200c
> > > ("hvc: Convert to using interrupts instead of opal events")
> > > enabled the use of interrupts in the hvc_driver for OPAL platforms.
> > > However on machines with more than one hvc console, any console after
> > > the first will fail to register an interrupt handler in
> > > notifier_add_irq() since all consoles share the same IRQ number but do
> > > not set the IRQF_SHARED flag:
> > >
> > > [   51.179907] genirq: Flags mismatch irq 31.  (hvc_console) vs.
> > >  (hvc_console)
> > > [   51.180010] hvc_open: request_irq failed with rc -16.
> > >
> > > This error propagates up to hvc_open() and the console is closed, but
> > > OPAL will still generate interrupts that are not handled, leading to
> > > rcu_sched stall warnings.
> > >
> > > Set IRQF_SHARED when calling request_irq, allowing additional consoles
> > > to start properly. This is only set for consoles handled by
> > > hvc_opal_probe(), leaving other types unaffected.
> > >
> > > Signed-off-by: Samuel Mendoza-Jonas 
> > > Cc:  # 4.1.x-
> > > ---
> > >  drivers/tty/hvc/hvc_console.h | 1 +
> > >  drivers/tty/hvc/hvc_irq.c | 7 +--
> > >  drivers/tty/hvc/hvc_opal.c| 3 +++
> > >  3 files changed, 9 insertions(+), 2 deletions(-)
> > 
> > Acked-by: Michael Ellerman 
> > 
> > Greg are you happy to take these two?
> 
> Hi Greg,
> 
> I don't see this series anywhere, do you mind if I take them via the
> powerpc tree for 4.8 ? Or do you want to pick them up.

You can take them, I'm not touching patches now until 4.8-rc1 is out,
sorry.

thanks,

greg k-h

Re: [PATCH V2 1/2] tty/hvc: Use IRQF_SHARED for OPAL hvc consoles

2016-07-25 Thread Michael Ellerman
Quoting Michael Ellerman (2016-07-11 16:29:20)
> Samuel Mendoza-Jonas  writes:
> 
> > Commit 2def86a7200c
> > ("hvc: Convert to using interrupts instead of opal events")
> > enabled the use of interrupts in the hvc_driver for OPAL platforms.
> > However on machines with more than one hvc console, any console after
> > the first will fail to register an interrupt handler in
> > notifier_add_irq() since all consoles share the same IRQ number but do
> > not set the IRQF_SHARED flag:
> >
> > [   51.179907] genirq: Flags mismatch irq 31.  (hvc_console) vs.
> >  (hvc_console)
> > [   51.180010] hvc_open: request_irq failed with rc -16.
> >
> > This error propagates up to hvc_open() and the console is closed, but
> > OPAL will still generate interrupts that are not handled, leading to
> > rcu_sched stall warnings.
> >
> > Set IRQF_SHARED when calling request_irq, allowing additional consoles
> > to start properly. This is only set for consoles handled by
> > hvc_opal_probe(), leaving other types unaffected.
> >
> > Signed-off-by: Samuel Mendoza-Jonas 
> > Cc:  # 4.1.x-
> > ---
> >  drivers/tty/hvc/hvc_console.h | 1 +
> >  drivers/tty/hvc/hvc_irq.c | 7 +--
> >  drivers/tty/hvc/hvc_opal.c| 3 +++
> >  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> Acked-by: Michael Ellerman 
> 
> Greg are you happy to take these two?

Hi Greg,

I don't see this series anywhere, do you mind if I take them via the
powerpc tree for 4.8 ? Or do you want to pick them up.

cheers
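For reference, the flags-mismatch behaviour described in the patch can be modelled in standalone C. This is a deliberately simplified, hypothetical model; real genirq compares a wider flag set before agreeing to share a line:

```c
#include <assert.h>

/* Simplified, hypothetical model of genirq's shared-line check: a
 * second handler on an already-claimed IRQ line succeeds only when
 * BOTH the existing and the new handler passed IRQF_SHARED. */
#define IRQF_SHARED_MODEL 0x80
#define EBUSY_MODEL       16

/* line_flags: flags of the handler already on the line, or -1 if free */
static int request_irq_model(int line_flags, int new_flags)
{
    if (line_flags == -1)
        return 0;               /* first handler always succeeds */

    if ((line_flags & IRQF_SHARED_MODEL) && (new_flags & IRQF_SHARED_MODEL))
        return 0;               /* both agreed to share the line */

    return -EBUSY_MODEL;        /* "genirq: Flags mismatch" case */
}
```

The -16 matches the "request_irq failed with rc -16" (-EBUSY) seen in the quoted log when the second console registers without IRQF_SHARED.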

Re: [PATCH v2 1/2] powerpc/mm: Fix build break when PPC_NATIVE=n

2016-07-25 Thread Stephen Rothwell
Hi Michael,

On Tue, 26 Jul 2016 13:38:37 +1000 Michael Ellerman  wrote:
>
> The recent commit to rework the hash MMU setup broke the build when
> CONFIG_PPC_NATIVE=n. Fix it by adding an IS_ENABLED() check before
> calling hpte_init_native().
> 
> Removing the else clause opens the possibility that we don't set any
> ops, which would probably lead to a strange crash later. So add a check
> that we correctly initialised at least one member of the struct.
> 
> Fixes: 166dd7d3fbf2 ("powerpc/64: Move MMU backend selection out of platform 
> code")
> Reported-by: Stephen Rothwell 
> Signed-off-by: Michael Ellerman 

Acked-by: Stephen Rothwell 
-- 
Cheers,
Stephen Rothwell

Re: [PATCH v2 2/2] powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header

2016-07-25 Thread Stephen Rothwell
Hi Michael,

On Tue, 26 Jul 2016 13:38:38 +1000 Michael Ellerman  wrote:
>
> hpte_init_lpar() is part of the pseries platform, so name it as such.
> 
> Move the fallback implementation for when PSERIES=n into the header,
> dropping the weak implementation. The panic() is now handled by the
> calling code.

Of course, this could have been handled the same way as the native one.

>   else if (firmware_has_feature(FW_FEATURE_LPAR))
else if (IS_ENABLED(CONFIG_PPC_PSERIES) && firmware_has_feature(FW_FEATURE_LPAR))
> - hpte_init_lpar();
> + hpte_init_pseries();

and no need to modify the header file.
-- 
Cheers,
Stephen Rothwell

[PATCH v2 2/2] powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header

2016-07-25 Thread Michael Ellerman
hpte_init_lpar() is part of the pseries platform, so name it as such.

Move the fallback implementation for when PSERIES=n into the header,
dropping the weak implementation. The panic() is now handled by the
calling code.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 7 ++-
 arch/powerpc/mm/hash_utils_64.c   | 7 +--
 arch/powerpc/platforms/pseries/lpar.c | 2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index b0f4dffe12ae..450b017fdc19 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -391,8 +391,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned 
long vend,
 extern void add_gpage(u64 addr, u64 page_size, unsigned long number_of_pages);
 extern void demote_segment_4k(struct mm_struct *mm, unsigned long addr);
 
+#ifdef CONFIG_PPC_PSERIES
+void hpte_init_pseries(void);
+#else
+static inline void hpte_init_pseries(void) { }
+#endif
+
 extern void hpte_init_native(void);
-extern void hpte_init_lpar(void);
 extern void hpte_init_beat(void);
 extern void hpte_init_beat_v3(void);
 
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 381b5894cc99..1ff11c1bb182 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -885,11 +885,6 @@ static void __init htab_initialize(void)
 #undef KB
 #undef MB
 
-void __init __weak hpte_init_lpar(void)
-{
-   panic("FW_FEATURE_LPAR set but no LPAR support compiled\n");
-}
-
 void __init hash__early_init_mmu(void)
 {
/*
@@ -930,7 +925,7 @@ void __init hash__early_init_mmu(void)
if (firmware_has_feature(FW_FEATURE_PS3_LV1))
ps3_early_mm_init();
else if (firmware_has_feature(FW_FEATURE_LPAR))
-   hpte_init_lpar();
+   hpte_init_pseries();
else if (IS_ENABLED(CONFIG_PPC_NATIVE))
hpte_init_native();
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c 
b/arch/powerpc/platforms/pseries/lpar.c
index 0e91388d0af9..86707e67843f 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -589,7 +589,7 @@ static int __init disable_bulk_remove(char *str)
 
 __setup("bulk_remove=", disable_bulk_remove);
 
-void __init hpte_init_lpar(void)
+void __init hpte_init_pseries(void)
 {
mmu_hash_ops.hpte_invalidate = pSeries_lpar_hpte_invalidate;
mmu_hash_ops.hpte_updatepp   = pSeries_lpar_hpte_updatepp;
-- 
2.7.4


[PATCH v2 1/2] powerpc/mm: Fix build break when PPC_NATIVE=n

2016-07-25 Thread Michael Ellerman
The recent commit to rework the hash MMU setup broke the build when
CONFIG_PPC_NATIVE=n. Fix it by adding an IS_ENABLED() check before
calling hpte_init_native().

Removing the else clause opens the possibility that we don't set any
ops, which would probably lead to a strange crash later. So add a check
that we correctly initialised at least one member of the struct.

Fixes: 166dd7d3fbf2 ("powerpc/64: Move MMU backend selection out of platform 
code")
Reported-by: Stephen Rothwell 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/hash_utils_64.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 341632471b9d..381b5894cc99 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -931,9 +931,12 @@ void __init hash__early_init_mmu(void)
ps3_early_mm_init();
else if (firmware_has_feature(FW_FEATURE_LPAR))
hpte_init_lpar();
-   else
+   else if (IS_ENABLED(CONFIG_PPC_NATIVE))
hpte_init_native();
 
+   if (!mmu_hash_ops.hpte_insert)
+   panic("hash__early_init_mmu: No MMU hash ops defined!\n");
+
/* Initialize the MMU Hash table and create the linear mapping
 * of memory. Has to be done before SLB initialization as this is
 * currently where the page size encoding is obtained.
-- 
2.7.4


RE: [PATCH v3 02/11] mm: Hardened usercopy

2016-07-25 Thread Michael Ellerman
David Laight  writes:

> From: Josh Poimboeuf
>> Sent: 22 July 2016 18:46
>> >
>> > e.g. then if the pointer was in the thread_info, the second test would
>> > fail, triggering the protection.
>> 
>> FWIW, this won't work right on x86 after Andy's
>> CONFIG_THREAD_INFO_IN_TASK patches get merged.
>
> What ends up in the 'thread_info' area?

It depends on the arch.

> If it contains the fp save area then programs like gdb may end up requesting
> copy_in/out directly from that area.

On the arches I've seen thread_info doesn't usually contain register save areas,
but if it did then it would be up to the arch helper to allow that copy to go
through.

However given thread_info generally contains lots of low level flags that would
be a good target for an attacker, the best way to cope with ptrace wanting to
copy to/from it would be to use a temporary, and prohibit copying directly
to/from thread_info - IMHO.

cheers

Re: [PATCH v3 02/11] mm: Hardened usercopy

2016-07-25 Thread Michael Ellerman
Josh Poimboeuf  writes:

> On Thu, Jul 21, 2016 at 11:34:25AM -0700, Kees Cook wrote:
>> On Wed, Jul 20, 2016 at 11:52 PM, Michael Ellerman  
>> wrote:
>> > Kees Cook  writes:
>> >
>> >> diff --git a/mm/usercopy.c b/mm/usercopy.c
>> >> new file mode 100644
>> >> index ..e4bf4e7ccdf6
>> >> --- /dev/null
>> >> +++ b/mm/usercopy.c
>> >> @@ -0,0 +1,234 @@
>> > ...
>> >> +
>> >> +/*
>> >> + * Checks if a given pointer and length is contained by the current
>> >> + * stack frame (if possible).
>> >> + *
>> >> + *   0: not at all on the stack
>> >> + *   1: fully within a valid stack frame
>> >> + *   2: fully on the stack (when can't do frame-checking)
>> >> + *   -1: error condition (invalid stack position or bad stack frame)
>> >> + */
>> >> +static noinline int check_stack_object(const void *obj, unsigned long 
>> >> len)
>> >> +{
>> >> + const void * const stack = task_stack_page(current);
>> >> + const void * const stackend = stack + THREAD_SIZE;
>> >
>> > That allows access to the entire stack, including the struct thread_info,
>> > is that what we want - it seems dangerous? Or did I miss a check
>> > somewhere else?
>> 
>> That seems like a nice improvement to make, yeah.
>> 
>> > We have end_of_stack() which computes the end of the stack taking
>> > thread_info into account (end being the opposite of your end above).
>> 
>> Amusingly, the object_is_on_stack() check in sched.h doesn't take
>> thread_info into account either. :P Regardless, I think using
>> end_of_stack() may not be best. To tighten the check, I think we could
>> add this after checking that the object is on the stack:
>> 
>> #ifdef CONFIG_STACK_GROWSUP
>> stackend -= sizeof(struct thread_info);
>> #else
>> stack += sizeof(struct thread_info);
>> #endif
>> 
>> e.g. then if the pointer was in the thread_info, the second test would
>> fail, triggering the protection.
>
> FWIW, this won't work right on x86 after Andy's
> CONFIG_THREAD_INFO_IN_TASK patches get merged.

Yeah. I wonder if it's better for the arch helper to just take the obj and len,
and work out its own bounds for the stack using current and whatever makes
sense on that arch.

It would avoid too much ifdefery in the generic code, and also avoid any
confusion about whether stackend is the high or low address.

eg. on powerpc we could do:

int noinline arch_within_stack_frames(const void *obj, unsigned long len)
{
	void *stack_low  = end_of_stack(current);
	void *stack_high = task_stack_page(current) + THREAD_SIZE;


Whereas arches with STACK_GROWSUP=y could do roughly the reverse, and x86 can do
whatever it needs to depending on whether the thread_info is on or off stack.
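A quick userspace model of that low/high bounds test, with hypothetical constant bounds (the kernel would derive them from end_of_stack(current) and task_stack_page(current) + THREAD_SIZE):

```c
#include <assert.h>
#include <limits.h>

/* Userspace model of the low/high bounds test sketched above.  The
 * bounds passed in are hypothetical constants; the kernel would use
 * end_of_stack(current) and task_stack_page(current) + THREAD_SIZE. */
static int within_stack(unsigned long low, unsigned long high,
                        unsigned long obj, unsigned long len)
{
    if (obj + len < obj)        /* reject address wrap-around */
        return 0;

    return obj >= low && obj + len <= high;
}
```

Expressing it this way keeps the generic code free of CONFIG_STACK_GROWSUP ifdefs, since each arch just hands back whichever addresses are its low and high ends.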

cheers

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Michael Ellerman
Quoting Russell Currey (2016-07-22 15:23:36)
> On EEH events the kernel will print a dump of relevant registers.
> If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
> doesn't have EEH support, etc) this information isn't readily available.
> 
> Add a new debugfs handler to trigger a PHB register dump, so that this
> information can be made available on demand.

This is a bit weird.

It's a debugfs file, but when you read from it you get nothing (I think,
you have no read() defined).

When you write to it, regardless of what you write, the kernel spits
some stuff out to dmesg and throws away whatever you wrote.

Ideally pnv_pci_dump_phb_diag_data() would write its output to a buffer,
which we could then either send to dmesg, or give to debugfs. But that
might be more work than we want to do for this.

If we just want a trigger file, then I think it'd be preferable to just
use a simple attribute, with a set and no show, eg. something like:

static int foo_set(void *data, u64 val)
{
	if (val != 1)
		return -EINVAL;

	...

	return 0;
}

DEFINE_SIMPLE_ATTRIBUTE(fops_foo, NULL, foo_set, "%llu\n");

That requires that you write "1" to the file to trigger the reg dump.


> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 891fc4a..ada2f3c 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
> if (!phb->dbgfs)
> pr_warning("%s: Error on creating debugfs on 
> PHB#%x\n",
> __func__, hose->global_number);
> +
> +   debugfs_create_file("regdump", 0200, phb->dbgfs, hose,
> +   &pnv_pci_debug_ops);
> }

You shouldn't be trying to create the file if the directory create failed. So
the check for (!phb->dbgfs) should probably print and then continue.

And a better name would be "dump-regs", because it indicates that the file does
something, rather than is something.

cheers

Re: [PATCH v2] include: mman: Use bool instead of int for the return value of arch_validate_prot

2016-07-25 Thread Michael Ellerman
Andrew Morton  writes:
> On Mon, 25 Jul 2016 15:10:06 +1000 Michael Ellerman  
> wrote:
>> cheng...@emindsoft.com.cn writes:
>> > From: Chen Gang 
>> >
>> > For pure bool function's return value, bool is a little better more or
>> > less than int.
>> >
>> > Signed-off-by: Chen Gang 
>> 
>> LGTM.
>> 
>> Acked-by: Michael Ellerman 
>> 
>> Andrew do you want to take this or should I?
>
> I grabbed it, thanks.

Thanks.

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Michael Ellerman
Tyrel Datwyler  writes:

> On 07/21/2016 11:36 PM, Gavin Shan wrote:
>> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote:
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
>>> b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> index 891fc4a..ada2f3c 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe 
>>> *pe)
>>> }
>>> }
>>>
>>> +#ifdef CONFIG_DEBUG_FS
>>> +static ssize_t pnv_pci_debug_write(struct file *filp,
>>> +  const char __user *user_buf,
>>> +  size_t count, loff_t *ppos)
>>> +{
>>> +   struct pci_controller *hose = filp->private_data;
>>> +   struct pnv_phb *phb;
>>> +   int ret = 0;
>> 
>> Needn't initialize @ret in advance. The code might be simpler, but it's
>> only a personal preference:
>
> I believe it's actually preferred that it not be initialized in advance
> so that the tooling can warn you about conditional code paths where you
> may have forgotten to set a value.

Yeah that's right, it's preferable not to initialise it.

It helps for complex if/else/switch cases, where you might accidentally
have a path where you return without giving ret the right value.

The other case is when someone modifies your code. For example if you
have:

	int ret;

	if (foo)
		ret = do_foo();
	else
		ret = 1;

	return ret;

And then you add a case to the if:

	if (foo)
		ret = do_foo();
	else if (bar)
		do_bar();
	else
		ret = 1;

The compiler will warn you that in the bar case you forget to initialise
ret. Whereas if you initialised ret at the start then the compiler can't
help you.

There are times when it's cleaner to initialise the value at the start,
eg. if you have many error cases and only one success case. But that
should be a deliberate choice.

cheers

Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Laura Abbott

On 07/25/2016 01:45 PM, Kees Cook wrote:

On Mon, Jul 25, 2016 at 12:16 PM, Laura Abbott  wrote:

On 07/20/2016 01:27 PM, Kees Cook wrote:


Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the
SLUB allocator to catch any copies that may span objects. Includes a
redzone handling fix discovered by Michael Ellerman.

Based on code from PaX and grsecurity.

Signed-off-by: Kees Cook 
Tested-by: Michael Ellerman 
---
 init/Kconfig |  1 +
 mm/slub.c| 36 
 2 files changed, 37 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 798c2020ee7c..1c4711819dfd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1765,6 +1765,7 @@ config SLAB

 config SLUB
bool "SLUB (Unqueued Allocator)"
+   select HAVE_HARDENED_USERCOPY_ALLOCATOR
help
   SLUB is a slab allocator that minimizes cache line usage
   instead of managing queues of cached objects (SLAB approach).
diff --git a/mm/slub.c b/mm/slub.c
index 825ff4505336..7dee3d9a5843 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int
node)
 EXPORT_SYMBOL(__kmalloc_node);
 #endif

+#ifdef CONFIG_HARDENED_USERCOPY
+/*
+ * Rejects objects that are incorrectly sized.
+ *
+ * Returns NULL if check passes, otherwise const char * to name of cache
+ * to indicate an error.
+ */
+const char *__check_heap_object(const void *ptr, unsigned long n,
+   struct page *page)
+{
+   struct kmem_cache *s;
+   unsigned long offset;
+   size_t object_size;
+
+   /* Find object and usable object size. */
+   s = page->slab_cache;
+   object_size = slab_ksize(s);
+
+   /* Find offset within object. */
+   offset = (ptr - page_address(page)) % s->size;
+
+   /* Adjust for redzone and reject if within the redzone. */
+   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
+   if (offset < s->red_left_pad)
+   return s->name;
+   offset -= s->red_left_pad;
+   }
+
+   /* Allow address range falling entirely within object size. */
+   if (offset <= object_size && n <= object_size - offset)
+   return NULL;
+
+   return s->name;
+}
+#endif /* CONFIG_HARDENED_USERCOPY */
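The offset arithmetic above can be exercised in a standalone userspace model. The geometry is hypothetical (a 64-byte slot stride with 48 usable bytes and an optional left redzone); this is not the kernel implementation:

```c
#include <assert.h>

/* Standalone model of the bounds math in __check_heap_object().
 * SLOT_SIZE/OBJECT_SIZE are hypothetical: a 64-byte slot stride with
 * 48 usable bytes.  Returns 0 if [ptr, ptr + n) fits in one object. */
#define SLOT_SIZE   64UL
#define OBJECT_SIZE 48UL

static int check_range(unsigned long page_base, unsigned long red_left_pad,
                       unsigned long ptr, unsigned long n)
{
    unsigned long offset;

    if (ptr < page_base)        /* pointer below the page: reject */
        return -1;

    /* Offset within the object's slot. */
    offset = (ptr - page_base) % SLOT_SIZE;

    /* Adjust for a left redzone and reject pointers inside it. */
    if (red_left_pad) {
        if (offset < red_left_pad)
            return -1;
        offset -= red_left_pad;
    }

    /* Allow a range falling entirely within the usable object size. */
    if (offset <= OBJECT_SIZE && n <= OBJECT_SIZE - offset)
        return 0;

    return -1;
}
```

Note the explicit ptr < page_base test up front, which the review below suggests adding to the kernel version to avoid the modulo producing a garbage offset.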
+



I compared this against what check_valid_pointer does for SLUB_DEBUG
checking. I was hoping we could utilize that function to avoid
duplication but a) __check_heap_object needs to allow accesses anywhere
in the object, not just the beginning b) accessing page->objects
is racy without the addition of locking in SLUB_DEBUG.

Still, the ptr < page_address(page) check from __check_heap_object would
be good to add to avoid generating garbage large offsets and trying to
infer C math.

diff --git a/mm/slub.c b/mm/slub.c
index 7dee3d9..5370e4f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr,
unsigned long n,
s = page->slab_cache;
object_size = slab_ksize(s);
 +   if (ptr < page_address(page))
+   return s->name;
+
/* Find offset within object. */
offset = (ptr - page_address(page)) % s->size;

With that, you can add

Reviewed-by: Laura Abbott 


Cool, I'll add that.

Should I add your reviewed-by for this patch only or for the whole series?

Thanks!

-Kees



Just this patch for now, I'm working through a couple of others




 static size_t __ksize(const void *object)
 {
struct page *page;



Thanks,
Laura







Re: [v3] UCC_GETH/UCC_FAST: Use IS_ERR_VALUE_U32 API to avoid IS_ERR_VALUE abuses.

2016-07-25 Thread David Miller
From: Arvind Yadav 
Date: Sat, 23 Jul 2016 23:35:51 +0530

> However, anything that passes an 'unsigned short' or 'unsigned int'
> argument into IS_ERR_VALUE() is guaranteed to be broken, as are
> 8-bit integers and types that are wider than 'unsigned long'.
 ...
> Passing value in IS_ERR_VALUE() is wrong, as they pass an
> 'unsigned int' into a function that takes an 'unsigned long'
> argument.This happens to work because the type is sign-extended
> on 64-bit architectures before it gets converted into an
> unsigned type.

This commit log message is a complete mess, you're saying exactly
the same thing over and over again.

Also your Subject line is not formatted correctly, do not list
the subsystem prefix in ALL CAPS.  Just plain "ucc_geth/ucc_fast: "
would be fine.

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Russell Currey
On Fri, 2016-07-22 at 16:36 +1000, Gavin Shan wrote:
> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote:
> > 
> > On EEH events the kernel will print a dump of relevant registers.
> > If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
> > doesn't have EEH support, etc) this information isn't readily available.
> > 
> > Add a new debugfs handler to trigger a PHB register dump, so that this
> > information can be made available on demand.
> > 
> > Signed-off-by: Russell Currey 
> 
> Reviewed-by: Gavin Shan 

Hi Gavin, thanks for the review.

> 
> > 
> > ---
> > arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++
> > 1 file changed, 35 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> > b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 891fc4a..ada2f3c 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe
> > *pe)
> > }
> > }
> > 
> > +#ifdef CONFIG_DEBUG_FS
> > +static ssize_t pnv_pci_debug_write(struct file *filp,
> > +      const char __user *user_buf,
> > +      size_t count, loff_t *ppos)
> > +{
> > +   struct pci_controller *hose = filp->private_data;
> > +   struct pnv_phb *phb;
> > +   int ret = 0;
> 
> Needn't initialize @ret in advance. The code might be simpler, but it's
> only a personal preference:
> 
>   struct pci_controller *hose = filp->private_data;
>   struct pnv_phb *phb = hose ? hose->private_data : NULL;
> 
>   if (!phb)
>   return -ENODEV;
> 
> > 
> > +
> > +   if (!hose)
> > +   return -EFAULT;
> > +
> > +   phb = hose->private_data;
> > +   if (!phb)
> > +   return -EFAULT;
> > +
> > +   ret = opal_pci_get_phb_diag_data2(phb->opal_id, phb->diag.blob,
> > +     PNV_PCI_DIAG_BUF_SIZE);
> > +
> > +   if (!ret)
> > +   pnv_pci_dump_phb_diag_data(phb->hose, phb->diag.blob);
> > +
> > +   return ret < 0 ? ret : count;
> 
> return ret == OPAL_SUCCESS ? count : -EIO;

Yeah, that's much better.

> 
> > 
> > +}
> > +
> > +static const struct file_operations pnv_pci_debug_ops = {
> > +   .open   = simple_open,
> > +   .llseek = no_llseek,
> > +   .write  = pnv_pci_debug_write,
> 
> It might be reasonable to dump the diag-data on read if it is trying
> to do it on write.

I'm not sure about this one.  I went with write since (at least, in my mind)
writing to a file feels like triggering an action, although we're not actually
reading any input.  It also means that it works the same way as the other PHB
debugfs entries (i.e. errinjct).

I could rework it into a read that said something like "PHB#%x diag data dumped,
check the kernel log", what do you think?

> 
> > 
> > +};
> > +#endif /* CONFIG_DEBUG_FS */
> > +
> > static void pnv_pci_ioda_create_dbgfs(void)
> > {
> > #ifdef CONFIG_DEBUG_FS
> > @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
> > if (!phb->dbgfs)
> > pr_warning("%s: Error on creating debugfs on PHB#%x\n",
> > __func__, hose->global_number);
> > +
> > +   debugfs_create_file("regdump", 0200, phb->dbgfs, hose,
> > +   _pci_debug_ops);
> 
> "diag-data" might be indicating or a better one you can name :)
> 
> Thanks,
> Gavin
> 
> > 
> > }
> > #endif /* CONFIG_DEBUG_FS */
> > }
> > -- 
> > 2.9.0
> > 


Re: [PATCH 7/8] powerpc: Check arch.vec earlier during boot for memory features

2016-07-25 Thread kbuild test robot
Hi,

[auto build test ERROR on pci/next]
[cannot apply to powerpc/next next-20160725]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Michael-Bringmann/powerpc-devtree-Add-support-for-2-new-DRC-properties/20160726-063623
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: powerpc-allnoconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/built-in.o: In function `early_init_devtree':
>> (.init.text+0x1072): undefined reference to `pseries_probe_fw_features'
   arch/powerpc/kernel/built-in.o: In function `early_init_devtree':
   (.init.text+0x107a): undefined reference to `pseries_probe_fw_features'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation



Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Rik van Riel
On Mon, 2016-07-25 at 16:29 -0700, Laura Abbott wrote:
> On 07/25/2016 02:42 PM, Rik van Riel wrote:
> > On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote:
> > > On 07/20/2016 01:27 PM, Kees Cook wrote:
> > > > Under CONFIG_HARDENED_USERCOPY, this adds object size checking
> > > > to
> > > > the
> > > > SLUB allocator to catch any copies that may span objects.
> > > > Includes
> > > > a
> > > > redzone handling fix discovered by Michael Ellerman.
> > > > 
> > > > Based on code from PaX and grsecurity.
> > > > 
> > > > Signed-off-by: Kees Cook 
> > > > Tested-by: Michael Ellerman 
> > > > ---
> > > >  init/Kconfig |  1 +
> > > >  mm/slub.c| 36 
> > > >  2 files changed, 37 insertions(+)
> > > > 
> > > > diff --git a/init/Kconfig b/init/Kconfig
> > > > index 798c2020ee7c..1c4711819dfd 100644
> > > > --- a/init/Kconfig
> > > > +++ b/init/Kconfig
> > > > @@ -1765,6 +1765,7 @@ config SLAB
> > > > 
> > > >  config SLUB
> > > >     bool "SLUB (Unqueued Allocator)"
> > > > +   select HAVE_HARDENED_USERCOPY_ALLOCATOR
> > > >     help
> > > >        SLUB is a slab allocator that minimizes cache line
> > > > usage
> > > >        instead of managing queues of cached objects (SLAB
> > > > approach).
> > > > diff --git a/mm/slub.c b/mm/slub.c
> > > > index 825ff4505336..7dee3d9a5843 100644
> > > > --- a/mm/slub.c
> > > > +++ b/mm/slub.c
> > > > @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t
> > > > flags, int node)
> > > >  EXPORT_SYMBOL(__kmalloc_node);
> > > >  #endif
> > > > 
> > > > +#ifdef CONFIG_HARDENED_USERCOPY
> > > > +/*
> > > > + * Rejects objects that are incorrectly sized.
> > > > + *
> > > > + * Returns NULL if check passes, otherwise const char * to
> > > > name of
> > > > cache
> > > > + * to indicate an error.
> > > > + */
> > > > +const char *__check_heap_object(const void *ptr, unsigned long
> > > > n,
> > > > +   struct page *page)
> > > > +{
> > > > +   struct kmem_cache *s;
> > > > +   unsigned long offset;
> > > > +   size_t object_size;
> > > > +
> > > > +   /* Find object and usable object size. */
> > > > +   s = page->slab_cache;
> > > > +   object_size = slab_ksize(s);
> > > > +
> > > > +   /* Find offset within object. */
> > > > +   offset = (ptr - page_address(page)) % s->size;
> > > > +
> > > > +   /* Adjust for redzone and reject if within the
> > > > redzone. */
> > > > +   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
> > > > +   if (offset < s->red_left_pad)
> > > > +   return s->name;
> > > > +   offset -= s->red_left_pad;
> > > > +   }
> > > > +
> > > > +   /* Allow address range falling entirely within object
> > > > size. */
> > > > +   if (offset <= object_size && n <= object_size -
> > > > offset)
> > > > +   return NULL;
> > > > +
> > > > +   return s->name;
> > > > +}
> > > > +#endif /* CONFIG_HARDENED_USERCOPY */
> > > > +
> > > 
> > > I compared this against what check_valid_pointer does for
> > > SLUB_DEBUG
> > > checking. I was hoping we could utilize that function to avoid
> > > duplication but a) __check_heap_object needs to allow accesses
> > > anywhere
> > > in the object, not just the beginning b) accessing page->objects
> > > is racy without the addition of locking in SLUB_DEBUG.
> > > 
> > > Still, the ptr < page_address(page) check from
> > > __check_heap_object
> > > would
> > > be good to add to avoid generating garbage large offsets and
> > > trying
> > > to
> > > infer C math.
> > > 
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 7dee3d9..5370e4f 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void
> > > *ptr, unsigned long n,
> > >  s = page->slab_cache;
> > >  object_size = slab_ksize(s);
> > > 
> > > +   if (ptr < page_address(page))
> > > +   return s->name;
> > > +
> > >  /* Find offset within object. */
> > >  offset = (ptr - page_address(page)) % s->size;
> > > 
> > 
> > I don't get it, isn't that already guaranteed because we
> > look for the page that ptr is in, before __check_heap_object
> > is called?
> > 
> > Specifically, in patch 3/12:
> > 
> > +   page = virt_to_head_page(ptr);
> > +
> > +   /* Check slab allocator for flags and size. */
> > +   if (PageSlab(page))
> > +   return __check_heap_object(ptr, n, page);
> > 
> > How can that generate a ptr that is not inside the page?
> > 
> > What am I overlooking?  And, should it be in the changelog or
> > a comment? :)
> > 
> 
> 
> I ran into the subtraction issue when the vmalloc detection wasn't
> working on ARM64, somehow virt_to_head_page turned into a page
> that happened to have PageSlab set. I agree if everything is working
> properly this is redundant but given the type of feature this is, a
> little bit of redundancy against a system running off into the weeds
> or bad patches might be warranted.

Re: [PATCH v2] include: mman: Use bool instead of int for the return value of arch_validate_prot

2016-07-25 Thread Andrew Morton
On Mon, 25 Jul 2016 15:10:06 +1000 Michael Ellerman  wrote:

> cheng...@emindsoft.com.cn writes:
> 
> > From: Chen Gang 
> >
> > For a function that returns a pure boolean value, bool is a slightly
> > better return type than int.
> >
> > Signed-off-by: Chen Gang 
> > ---
> >  arch/powerpc/include/asm/mman.h | 8 
> >  include/linux/mman.h| 2 +-
> >  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> LGTM.
> 
> Acked-by: Michael Ellerman 
> 
> Andrew do you want to take this or should I?

I grabbed it, thanks.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/8] powerpc/firmware: Add definitions for new firmware features.

2016-07-25 Thread Tyrel Datwyler
On 07/25/2016 03:21 PM, Michael Bringmann wrote:
> Firmware Features: Define new bit flags representing the presence of
> new device tree properties "ibm,drc-info", and "ibm,dynamic-memory-v2".
> These flags are used to tell the front end processor when the Linux
> kernel supports the new properties, and by the front end processor to
> tell the Linux kernel that the new properties are present in the device
> tree.
> 
> Signed-off-by: Michael Bringmann 
> ---
> diff --git a/arch/powerpc/include/asm/firmware.h 
> b/arch/powerpc/include/asm/firmware.h
> index b062924..a9d66d5 100644
> --- a/arch/powerpc/include/asm/firmware.h
> +++ b/arch/powerpc/include/asm/firmware.h
> @@ -51,6 +51,8 @@
>  #define FW_FEATURE_BEST_ENERGY   ASM_CONST(0x8000)
>  #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001)
>  #define FW_FEATURE_PRRN  ASM_CONST(0x0002)
> +#define FW_FEATURE_RPS_DM2   ASM_CONST(0x0004)
> +#define FW_FEATURE_RPS_DRC_INFO  ASM_CONST(0x0008)

I can't say that these names are my favorite. Especially _RPS_DM2. I
haven't actually seen the PAPR updates that define these things, but I
would hope that these had more self explanatory names. I'm not really
sure what _RPS_ means. Like I said I haven't seen the PAPR update so
maybe that is a new acronym defined there.

-Tyrel

>  
>  #ifndef __ASSEMBLY__
>  
> @@ -66,7 +68,8 @@ enum {
>   FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
>   FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
>   FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
> - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
> + FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
> + FW_FEATURE_RPS_DM2 | FW_FEATURE_RPS_DRC_INFO,
>   FW_FEATURE_PSERIES_ALWAYS = 0,
>   FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
>   FW_FEATURE_POWERNV_ALWAYS = 0,
> diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
> index 7f436ba..b9a1534 100644
> --- a/arch/powerpc/include/asm/prom.h
> +++ b/arch/powerpc/include/asm/prom.h
> @@ -155,6 +203,8 @@ struct of_drconf_cell {
>  #define OV5_PFO_HW_842   0x0E40  /* PFO Compression Accelerator 
> */
>  #define OV5_PFO_HW_ENCR  0x0E20  /* PFO Encryption Accelerator */
>  #define OV5_SUB_PROCESSORS   0x0F01  /* 1,2,or 4 Sub-Processors supported */
> +#define OV5_RPS_DM2  0x1680  /* Redef Prop Structures: dyn-mem-v2 */
> +#define OV5_RPS_DRC_INFO 0x1640  /* Redef Prop Structures: drc-info   */
>  
>  /* Option Vector 6: IBM PAPR hints */
>  #define OV6_LINUX0x02/* Linux is our OS */
> diff --git a/arch/powerpc/platforms/pseries/firmware.c 
> b/arch/powerpc/platforms/pseries/firmware.c
> index 8c80588..00243ee 100644
> --- a/arch/powerpc/platforms/pseries/firmware.c
> +++ b/arch/powerpc/platforms/pseries/firmware.c
> @@ -111,6 +111,8 @@ static __initdata struct vec5_fw_feature
>  vec5_fw_features_table[] = {
>   {FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY},
>   {FW_FEATURE_PRRN,   OV5_PRRN},
> + {FW_FEATURE_RPS_DM2,OV5_RPS_DM2},
> + {FW_FEATURE_RPS_DRC_INFO,   OV5_RPS_DRC_INFO},
>  };
>  
>  void __init fw_vec5_feature_init(const char *vec5, unsigned long len)
> 

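As a side note on how the OV5_* constants in the quoted patch are consumed: the upper bits select a byte within option vector 5 and the low byte is the feature's bit mask within that byte, mirroring the OV5_INDX()/OV5_FEAT() split in asm/prom.h. The helper names below are local to this sketch, not kernel identifiers:

```c
/* OV5 values from the patch: 0x1680 means byte 0x16, bit mask 0x80. */
#define OV5_RPS_DM2      0x1680u
#define OV5_RPS_DRC_INFO 0x1640u

static unsigned int ov5_index(unsigned int x) { return x >> 8; }
static unsigned int ov5_feat(unsigned int x)  { return x & 0xffu; }

/* Both new features live in the same vector-5 byte, so a client
 * advertising both sets two bits of one byte. */
static unsigned int ov5_byte_for_new_features(void)
{
    return ov5_feat(OV5_RPS_DM2) | ov5_feat(OV5_RPS_DRC_INFO);
}
```

This is why the patch 8/8 hunk below ORs the two OV5_FEAT() values into a single byte of ibm_architecture_vec.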

Re: [PATCH 2/8] powerpc/memory: Parse new memory property to register blocks.

2016-07-25 Thread Tyrel Datwyler
On 07/25/2016 03:21 PM, Michael Bringmann wrote:
> powerpc/memory: Add parallel routines to parse the new
> "ibm,dynamic-memory-v2" property when it is present, and then to
> register the relevant memory blocks with the operating system.
> This property format is intended to provide a more compact
> representation of memory when communicating with the front end
> processor, especially when describing vast amounts of RAM.
> 
> Signed-off-by: Michael Bringmann 
> ---
> diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
> index 7f436ba..b9a1534 100644
> --- a/arch/powerpc/include/asm/prom.h
> +++ b/arch/powerpc/include/asm/prom.h
> @@ -69,6 +69,8 @@ struct boot_param_header {
>   * OF address retreival & translation
>   */
>  
> +extern int n_mem_addr_cells;
> +
>  /* Parse the ibm,dma-window property of an OF node into the busno, phys and
>   * size parameters.
>   */
> @@ -81,8 +83,9 @@ extern void of_instantiate_rtc(void);
>  extern int of_get_ibm_chip_id(struct device_node *np);
>  
>  /* The of_drconf_cell struct defines the layout of the LMB array
> - * specified in the device tree property
> - * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory
> + * specified in the device tree properties,
> + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory
> + * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2
>   */
>  struct of_drconf_cell {
>   u64 base_addr;
> @@ -92,9 +95,39 @@ struct of_drconf_cell {
>   u32 flags;
>  };
>  
> -#define DRCONF_MEM_ASSIGNED  0x0008
> -#define DRCONF_MEM_AI_INVALID0x0040
> -#define DRCONF_MEM_RESERVED  0x0080
> +#define DRCONF_MEM_ASSIGNED  0x0008
> +#define DRCONF_MEM_AI_INVALID0x0040
> +#define DRCONF_MEM_RESERVED  0x0080
> +
> + /* It is important to note that this structure cannot
> +  * be safely mapped onto the memory containing the
> +  * 'ibm,dynamic-memory-v2'.  This structure represents
> +  * the order of the fields stored, but compiler alignment
> +  * may insert extra bytes of padding between the fields
> +  * 'num_seq_lmbs' and 'base_addr'.
> +  */

The "packed" attribute should prevent the struct from being padded.

struct of_drconf_cell_v2 {
...
} __attribute__((packed));

or, simply

struct of_drconf_cell_v2 {
...
} __packed;

-Tyrel
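Tyrel's point can be sanity-checked with a small userspace sketch. The field names follow the of_drconf_cell_v2 layout from the patch; the padded size depends on the target ABI (4 bytes of padding before base_addr on a typical LP64 target), so only the packed layout is asserted:

```c
#include <stddef.h>
#include <stdint.h>

/* Layout as proposed in the patch, without packing: the compiler is
 * free to insert 4 bytes of padding so base_addr starts on an 8-byte
 * boundary, which is exactly the hazard the patch comment describes. */
struct drconf_cell_v2_padded {
    uint32_t num_seq_lmbs;
    uint64_t base_addr;
    uint32_t drc_index;
    uint32_t aa_index;
    uint32_t flags;
};

/* Same fields with the packed attribute: 24 bytes, no padding, so the
 * struct mirrors the property's byte stream field for field. */
struct drconf_cell_v2_packed {
    uint32_t num_seq_lmbs;
    uint64_t base_addr;
    uint32_t drc_index;
    uint32_t aa_index;
    uint32_t flags;
} __attribute__((packed));

size_t padded_size(void) { return sizeof(struct drconf_cell_v2_padded); }
size_t packed_size(void) { return sizeof(struct drconf_cell_v2_packed); }
size_t packed_base_addr_offset(void)
{
    return offsetof(struct drconf_cell_v2_packed, base_addr);
}
```

Even with packing, the property data is big-endian, so each field still needs a be32/be64 conversion after copying.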

> +struct of_drconf_cell_v2 {
> + u32 num_seq_lmbs;
> + u64 base_addr;
> + u32 drc_index;
> + u32 aa_index;
> + u32 flags;
> +};
> +
> +
> +static inline int dyn_mem_v2_len(int entries)
> +{
> + int drconf_v2_cells = (n_mem_addr_cells + 4);
> + int drconf_v2_cells_len = (drconf_v2_cells * sizeof(unsigned int));
> + return (((entries) * drconf_v2_cells_len) +
> +(1 * sizeof(unsigned int)));
> +}
> +
> +extern void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem,
> + const __be32 **cellp);
> +extern void read_one_drc_info(int **info, char **drc_type, char **drc_name,
> + unsigned long int *fdi_p, unsigned long int *nsl_p,
> + unsigned long int *si_p, unsigned long int *ldi_p);
>  
>  /*
>   * There are two methods for telling firmware what our capabilities are.
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 669a15e..ad294ce 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -405,6 +405,24 @@ static void read_drconf_cell(struct of_drconf_cell 
> *drmem, const __be32 **cellp)
>  
>   *cellp = cp + 4;
>  }
> + 
> + /*
> + * Retrieve and validate the ibm,dynamic-memory property of the device tree.
> + * Read the next memory block set entry from the ibm,dynamic-memory-v2 
> property
> + * and return the information in the provided of_drconf_cell_v2 structure.
> + */
> +void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, const __be32 
> **cellp)
> +{
> + const __be32 *cp = (const __be32 *)*cellp;
> + drmem->num_seq_lmbs = be32_to_cpu(*cp++);
> + drmem->base_addr = read_n_cells(n_mem_addr_cells, &cp);
> + drmem->drc_index = be32_to_cpu(*cp++);
> + drmem->aa_index = be32_to_cpu(*cp++);
> + drmem->flags = be32_to_cpu(*cp++);
> +
> + *cellp = cp;
> +}
> +EXPORT_SYMBOL(read_drconf_cell_v2);
>  
>  /*
>   * Retrieve and validate the ibm,dynamic-memory property of the device tree.
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 946e34f..a55bc1e 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -56,6 +56,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -441,12 +442,12 @@ static int __init 
> early_init_dt_scan_chosen_ppc(unsigned long node,
>  
>  #ifdef CONFIG_PPC_PSERIES
>  /*
> - * Interpret the ibm,dynamic-memory property in the
> - * /ibm,dynamic-reconfiguration-memory node.
> + * Interpret the ibm,dynamic-memory 

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Gavin Shan
On Mon, Jul 25, 2016 at 10:53:49AM -0700, Tyrel Datwyler wrote:
>On 07/21/2016 11:36 PM, Gavin Shan wrote:
>> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote:
>>> On EEH events the kernel will print a dump of relevant registers.
>>> If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
>>> doesn't have EEH support, etc) this information isn't readily available.
>>>
>>> Add a new debugfs handler to trigger a PHB register dump, so that this
>>> information can be made available on demand.
>>>
>>> Signed-off-by: Russell Currey 
>> 
>> Reviewed-by: Gavin Shan 
>> 
>>> ---
>>> arch/powerpc/platforms/powernv/pci-ioda.c | 35 
>>> +++
>>> 1 file changed, 35 insertions(+)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
>>> b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> index 891fc4a..ada2f3c 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe 
>>> *pe)
>>> }
>>> }
>>>
>>> +#ifdef CONFIG_DEBUG_FS
>>> +static ssize_t pnv_pci_debug_write(struct file *filp,
>>> +  const char __user *user_buf,
>>> +  size_t count, loff_t *ppos)
>>> +{
>>> +   struct pci_controller *hose = filp->private_data;
>>> +   struct pnv_phb *phb;
>>> +   int ret = 0;
>> 
>> Needn't initialize @ret in advance. The code might be simpler, but it's
>> only a personal preference:
>
>I believe it's actually preferred that it not be initialized in advance
>so that the tooling can warn you about conditional code paths where you
>may have forgotten to set a value. Or as Gavin suggests to explicitly
>use error values in the return statements.
>

Yeah, the data type should be int64_t as well.

Thanks,
Gavin


Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Laura Abbott

On 07/25/2016 02:42 PM, Rik van Riel wrote:

On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote:

On 07/20/2016 01:27 PM, Kees Cook wrote:

Under CONFIG_HARDENED_USERCOPY, this adds object size checking to
the
SLUB allocator to catch any copies that may span objects. Includes
a
redzone handling fix discovered by Michael Ellerman.

Based on code from PaX and grsecurity.

Signed-off-by: Kees Cook 
Tested-by: Michael Ellerman 
---
 init/Kconfig |  1 +
 mm/slub.c| 36 
 2 files changed, 37 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 798c2020ee7c..1c4711819dfd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1765,6 +1765,7 @@ config SLAB

 config SLUB
bool "SLUB (Unqueued Allocator)"
+   select HAVE_HARDENED_USERCOPY_ALLOCATOR
help
   SLUB is a slab allocator that minimizes cache line
usage
   instead of managing queues of cached objects (SLAB
approach).
diff --git a/mm/slub.c b/mm/slub.c
index 825ff4505336..7dee3d9a5843 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t
flags, int node)
 EXPORT_SYMBOL(__kmalloc_node);
 #endif

+#ifdef CONFIG_HARDENED_USERCOPY
+/*
+ * Rejects objects that are incorrectly sized.
+ *
+ * Returns NULL if check passes, otherwise const char * to name of
cache
+ * to indicate an error.
+ */
+const char *__check_heap_object(const void *ptr, unsigned long n,
+   struct page *page)
+{
+   struct kmem_cache *s;
+   unsigned long offset;
+   size_t object_size;
+
+   /* Find object and usable object size. */
+   s = page->slab_cache;
+   object_size = slab_ksize(s);
+
+   /* Find offset within object. */
+   offset = (ptr - page_address(page)) % s->size;
+
+   /* Adjust for redzone and reject if within the redzone. */
+   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
+   if (offset < s->red_left_pad)
+   return s->name;
+   offset -= s->red_left_pad;
+   }
+
+   /* Allow address range falling entirely within object
size. */
+   if (offset <= object_size && n <= object_size - offset)
+   return NULL;
+
+   return s->name;
+}
+#endif /* CONFIG_HARDENED_USERCOPY */
+


I compared this against what check_valid_pointer does for SLUB_DEBUG
checking. I was hoping we could utilize that function to avoid
duplication but a) __check_heap_object needs to allow accesses
anywhere
in the object, not just the beginning b) accessing page->objects
is racy without the addition of locking in SLUB_DEBUG.

Still, the ptr < page_address(page) check from __check_heap_object
would
be good to add to avoid generating garbage large offsets and trying
to
infer C math.

diff --git a/mm/slub.c b/mm/slub.c
index 7dee3d9..5370e4f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void
*ptr, unsigned long n,
 s = page->slab_cache;
 object_size = slab_ksize(s);

+   if (ptr < page_address(page))
+   return s->name;
+
 /* Find offset within object. */
 offset = (ptr - page_address(page)) % s->size;



I don't get it, isn't that already guaranteed because we
look for the page that ptr is in, before __check_heap_object
is called?

Specifically, in patch 3/12:

+   page = virt_to_head_page(ptr);
+
+   /* Check slab allocator for flags and size. */
+   if (PageSlab(page))
+   return __check_heap_object(ptr, n, page);

How can that generate a ptr that is not inside the page?

What am I overlooking?  And, should it be in the changelog or
a comment? :)




I ran into the subtraction issue when the vmalloc detection wasn't
working on ARM64, somehow virt_to_head_page turned into a page
that happened to have PageSlab set. I agree if everything is working
properly this is redundant but given the type of feature this is, a
little bit of redundancy against a system running off into the weeds
or bad patches might be warranted.

I'm not super attached to the check if other maintainers think it
is redundant. Updating the __check_heap_object header comment
with a note of what we are assuming could work

Thanks,
Laura
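One more note on the range check being discussed: `offset <= object_size && n <= object_size - offset` is deliberately written without an addition, so an attacker-supplied length cannot wrap the comparison. A minimal sketch (the function names are illustrative, not the kernel API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-prone form: for huge n, offset + n overflows (unsigned
 * arithmetic wraps) and the check can wrongly pass. */
static bool naive_range_ok(uintptr_t offset, uintptr_t n, uintptr_t size)
{
    return offset + n <= size;
}

/* Form used by __check_heap_object(): the subtraction only happens
 * after offset <= size is established, so nothing can wrap. */
static bool safe_range_ok(uintptr_t offset, uintptr_t n, uintptr_t size)
{
    return offset <= size && n <= size - offset;
}
```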

[PATCH 8/8] powerpc: Enable support for new DRC devtree properties

2016-07-25 Thread Michael Bringmann
prom_init.c: Enable support for new DRC device tree properties
"ibm,drc-info" and "ibm,dynamic-memory-v2" in initial handshake
between the Linux kernel and the front end processor.

Signed-off-by: Michael Bringmann 
---
diff -Naur linux-rhel/arch/powerpc/kernel/prom_init.c 
linux-rhel-patch/arch/powerpc/kernel/prom_init.c
--- linux-rhel/arch/powerpc/kernel/prom_init.c  2016-03-03 07:36:25.0 
-0600
+++ linux-rhel-patch/arch/powerpc/kernel/prom_init.c2016-06-20 
15:59:58.016373676 -0500
@@ -695,7 +695,7 @@ unsigned char ibm_architecture_vec[] = {
OV4_MIN_ENT_CAP,/* minimum VP entitled capacity */
 
/* option vector 5: PAPR/OF options */
-   VECTOR_LENGTH(18),  /* length */
+   VECTOR_LENGTH(22),  /* length */
0,  /* don't ignore, don't halt */
OV5_FEAT(OV5_LPAR) | OV5_FEAT(OV5_SPLPAR) | OV5_FEAT(OV5_LARGE_PAGES) |
OV5_FEAT(OV5_DRCONF_MEMORY) | OV5_FEAT(OV5_DONATE_DEDICATE_CPU) |
@@ -728,6 +728,10 @@ unsigned char ibm_architecture_vec[] = {
OV5_FEAT(OV5_PFO_HW_RNG) | OV5_FEAT(OV5_PFO_HW_ENCR) |
OV5_FEAT(OV5_PFO_HW_842),
OV5_FEAT(OV5_SUB_PROCESSORS),
+   0,
+   0,
+   0,
+   OV5_FEAT(OV5_RPS_DM2) | OV5_FEAT(OV5_RPS_DRC_INFO),
 
/* option vector 6: IBM PAPR hints */
VECTOR_LENGTH(3),   /* length */


[PATCH 7/8] powerpc: Check arch.vec earlier during boot for memory features

2016-07-25 Thread Michael Bringmann
architecture.vec5 features: The boot-time memory management needs to
know the form of the "ibm,dynamic-memory-v2" property early during
scanning of the flattened device tree.  This patch moves execution of
the function pseries_probe_fw_features() early enough to be before
the scanning of the memory properties in the device tree to allow
recognition of the supported properties.

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9d86c66..e4c5076 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -215,6 +215,8 @@ extern int early_init_dt_scan_opal(unsigned long node, 
const char *uname,
   int depth, void *data);
 extern int early_init_dt_scan_recoverable_ranges(unsigned long node,
 const char *uname, int depth, void *data);
+extern int pseries_probe_fw_features(unsigned long node,
+const char *uname, int depth, void *data);
 
 extern int opal_get_chars(uint32_t vtermno, char *buf, int count);
 extern int opal_put_chars(uint32_t vtermno, const char *buf, int total_len);
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 946e34f..2034edc 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -777,6 +777,7 @@ void __init early_init_devtree(void *params)
of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line);
 
/* Scan memory nodes and rebuild MEMBLOCKs */
+   of_scan_flat_dt(pseries_probe_fw_features, NULL);
of_scan_flat_dt(early_init_dt_scan_root, NULL);
of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
 
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 9883bc7..f554205 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -736,7 +736,7 @@ static void pseries_power_off(void)
  * Called very early, MMU is off, device-tree isn't unflattened
  */
 
-static int __init pseries_probe_fw_features(unsigned long node,
+int __init pseries_probe_fw_features(unsigned long node,
const char *uname, int depth,
void *data)
 {
@@ -770,6 +770,7 @@ static int __init pseries_probe_fw_features(unsigned long 
node,
 
return hypertas_found && vec5_found;
 }
+EXPORT_SYMBOL(pseries_probe_fw_features);
 
 static int __init pSeries_probe(void)
 {


[PATCH 6/8] hotplug/drc-info: Add code to search new devtree properties

2016-07-25 Thread Michael Bringmann
rpadlpar_core.c: Provide parallel routines to search the older device-
tree properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".  The
code searches for PHP PCI Slots, gets the DRC properties within the
current node (using my-drc-index as correlation), and performs searches
by name or type of DRC node.

Signed-off-by: Michael Bringmann 
---
diff --git a/drivers/pci/hotplug/rpadlpar_core.c 
b/drivers/pci/hotplug/rpadlpar_core.c
index dc67f39..bea9723 100644
--- a/drivers/pci/hotplug/rpadlpar_core.c
+++ b/drivers/pci/hotplug/rpadlpar_core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../pci.h"
 #include "rpaphp.h"
@@ -44,15 +45,14 @@ static struct device_node *find_vio_slot_node(char 
*drc_name)
 {
struct device_node *parent = of_find_node_by_name(NULL, "vdevice");
struct device_node *dn = NULL;
-   char *name;
int rc;
 
if (!parent)
return NULL;
 
while ((dn = of_get_next_child(parent, dn))) {
-   rc = rpaphp_get_drc_props(dn, NULL, &name, NULL, NULL);
-   if ((rc == 0) && (!strcmp(drc_name, name)))
+   rc = rpaphp_check_drc_props(dn, drc_name, NULL);
+   if (rc == 0)
break;
}
 
@@ -64,15 +64,12 @@ static struct device_node *find_php_slot_pci_node(char 
*drc_name,
  char *drc_type)
 {
struct device_node *np = NULL;
-   char *name;
-   char *type;
int rc;
 
while ((np = of_find_node_by_name(np, "pci"))) {
-   rc = rpaphp_get_drc_props(np, NULL, &name, &type, NULL);
+   rc = rpaphp_check_drc_props(np, drc_name, drc_type);
if (rc == 0)
-   if (!strcmp(drc_name, name) && !strcmp(drc_type, type))
-   break;
+   break;
}
 
return np;
diff --git a/drivers/pci/hotplug/rpaphp.h b/drivers/pci/hotplug/rpaphp.h
index 7db024e..8db5f2e 100644
--- a/drivers/pci/hotplug/rpaphp.h
+++ b/drivers/pci/hotplug/rpaphp.h
@@ -91,8 +91,8 @@ int rpaphp_get_sensor_state(struct slot *slot, int *state);
 
 /* rpaphp_core.c */
 int rpaphp_add_slot(struct device_node *dn);
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain);
+int rpaphp_check_drc_props(struct device_node *dn, char *drc_name,
+   char *drc_type);
 
 /* rpaphp_slot.c */
 void dealloc_slot_struct(struct slot *slot);
diff --git a/drivers/pci/hotplug/rpaphp_core.c 
b/drivers/pci/hotplug/rpaphp_core.c
index 8d13202..0cfdbd9 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include/* for eeh_add_device() */
 #include   /* rtas_call */
 #include /* for pci_controller */
@@ -142,15 +143,6 @@ static enum pci_bus_speed get_max_bus_speed(struct slot 
*slot)
case 5:
case 6:
speed = PCI_SPEED_33MHz;/* speed for case 1-6 */
-   break;
-   case 7:
-   case 8:
-   speed = PCI_SPEED_66MHz;
-   break;
-   case 11:
-   case 14:
-   speed = PCI_SPEED_66MHz_PCIX;
-   break;
case 12:
case 15:
speed = PCI_SPEED_100MHz_PCIX;
@@ -196,25 +188,21 @@ static int get_children_props(struct device_node *dn, 
const int **drc_indexes,
return 0;
 }
 
-/* To get the DRC props describing the current node, first obtain it's
- * my-drc-index property.  Next obtain the DRC list from it's parent.  Use
- * the my-drc-index for correlation, and obtain the requested properties.
+
+/* Verify the existence of 'drc_name' and/or 'drc_type' within the
+ * current node.  First obtain its my-drc-index property.  Next,
+ * obtain the DRC info from its parent.  Use the my-drc-index for
+ * correlation, and obtain/validate the requested properties.
  */
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain)
+
+static int rpaphp_check_drc_props_v1(struct device_node *dn, char *drc_name,
+   char *drc_type, unsigned int my_index)
 {
+   char *name_tmp, *type_tmp;
const int *indexes, *names;
const int *types, *domains;
-   const unsigned int *my_index;
-   char *name_tmp, *type_tmp;
int i, rc;
 
-   my_index = of_get_property(dn, "ibm,my-drc-index", NULL);
-   if (!my_index) {
-   /* Node isn't DLPAR/hotplug capable */
-   return -EINVAL;
-   }
-
	rc = get_children_props(dn->parent, &indexes, &names, &types, &domains);
if (rc < 0) {
return -EINVAL;
@@ -225,24 +213,83 @@ int rpaphp_get_drc_props(struct device_node *dn, int 

[PATCH 5/8] pseries/drc-info: Search new DRC properties for CPU indexes

2016-07-25 Thread Michael Bringmann
pseries/drc-info: Provide parallel routines to convert between
drc_index and CPU numbers at runtime, using the older device-tree
properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c 
b/arch/powerpc/platforms/pseries/pseries_energy.c
index 9276779..10c4200 100644
--- a/arch/powerpc/platforms/pseries/pseries_energy.c
+++ b/arch/powerpc/platforms/pseries/pseries_energy.c
@@ -35,10 +35,68 @@ static int sysfs_entries;
 
 /* Helper Routines to convert between drc_index to cpu numbers */
 
+void read_one_drc_info(int **info, char **dtype, char **dname,
+   unsigned long int *fdi_p, unsigned long int *nsl_p,
+   unsigned long int *si_p, unsigned long int *ldi_p)
+{
+   char *drc_type, *drc_name, *pc;
+   u32 fdi, nsl, si, ldi;
+
+   fdi = nsl = si = ldi = 0;
+
+   /* Get drc-type:encode-string */
+   pc = (char *)info;
+   drc_type = pc;
+   pc += (strlen(drc_type) + 1);
+
+   /* Get drc-name-prefix:encode-string */
+   drc_name = (char *)pc;
+   pc += (strlen(drc_name) + 1);
+
+   /* Get drc-index-start:encode-int */
+   memcpy(&fdi, pc, 4);
+   fdi = be32_to_cpu(fdi);
+   pc += 4;
+
+   /* Get/skip drc-name-suffix-start:encode-int */
+   pc += 4;
+
+   /* Get number-sequential-elements:encode-int */
+   memcpy(&nsl, pc, 4);
+   nsl = be32_to_cpu(nsl);
+   pc += 4;
+
+   /* Get sequential-increment:encode-int */
+   memcpy(&si, pc, 4);
+   si = be32_to_cpu(si);
+   pc += 4;
+
+   /* Get/skip drc-power-domain:encode-int */
+   pc += 4;
+
+   /* Should now know end of current entry */
+   ldi = fdi + ((nsl-1)*si);
+
+   (*info) = (int *)pc;
+
+   if (dtype)
+   *dtype = drc_type;
+   if (dname)
+   *dname = drc_name;
+   if (fdi_p)
+   *fdi_p = fdi;
+   if (nsl_p)
+   *nsl_p = nsl;
+   if (si_p)
+   *si_p = si;
+   if (ldi_p)
+   *ldi_p = ldi;
+}
+EXPORT_SYMBOL(read_one_drc_info);
+
 static u32 cpu_to_drc_index(int cpu)
 {
struct device_node *dn = NULL;
-   const int *indexes;
int i;
int rc = 1;
u32 ret = 0;
@@ -46,18 +104,54 @@ static u32 cpu_to_drc_index(int cpu)
dn = of_find_node_by_path("/cpus");
if (dn == NULL)
goto err;
-   indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
-   if (indexes == NULL)
-   goto err_of_node_put;
+
/* Convert logical cpu number to core number */
i = cpu_core_index_of_thread(cpu);
-   /*
-* The first element indexes[0] is the number of drc_indexes
-* returned in the list.  Hence i+1 will get the drc_index
-* corresponding to core number i.
-*/
-   WARN_ON(i > indexes[0]);
-   ret = indexes[i + 1];
+
+   if (firmware_has_feature(FW_FEATURE_RPS_DRC_INFO)) {
+   int *info = (int *)4;
+   unsigned long int num_set_entries, j, iw = i, fdi = 0;
+   unsigned long int ldi = 0, nsl = 0, si = 0;
+   char *dtype;
+   char *dname;
+
+   info = (int *)of_get_property(dn, "ibm,drc-info", NULL);
+   if (info == NULL)
+   goto err_of_node_put;
+
+   num_set_entries = be32_to_cpu(*info++);
+
+   for (j = 0; j < num_set_entries; j++) {
+
+   read_one_drc_info(&info, &dtype, &dname, &fdi,
+   &nsl, &si, &ldi);
+   if (strcmp(dtype, "CPU"))
+   goto err;
+
+   if (iw < ldi)
+   break;
+
+   WARN_ON(((iw-fdi)%si) != 0);
+   }
+   WARN_ON((nsl == 0) | (si == 0));
+
+   ret = ldi + (iw*si);
+   } else {
+   const int *indexes;
+
+   indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
+   if (indexes == NULL)
+   goto err_of_node_put;
+
+   /*
+* The first element indexes[0] is the number of drc_indexes
+* returned in the list.  Hence i+1 will get the drc_index
+* corresponding to core number i.
+*/
+   WARN_ON(i > indexes[0]);
+   ret = indexes[i + 1];
+   }
+
rc = 0;
 
 err_of_node_put:
@@ -78,21 +172,51 @@ static int drc_index_to_cpu(u32 drc_index)
dn = of_find_node_by_path("/cpus");
if (dn == NULL)
goto err;
-   indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
-   if (indexes == NULL)
-   goto err_of_node_put;
-   /*
-* First element in the array is the number of 

[PATCH 4/8] pseries/hotplug init: Convert new DRC memory property for hotplug runtime

2016-07-25 Thread Michael Bringmann
hotplug_init: Simplify the code needed for runtime memory hotplug and
maintenance with a conversion routine that transforms the compressed
property "ibm,dynamic-memory-v2" to the form of "ibm,dynamic-memory"
within the "ibm,dynamic-reconfiguration-memory" property.  Thus only
a single set of routines should be required at runtime to parse, edit,
and manipulate the memory representation in the device tree.  Similarly,
any userspace applications that need this information will only need
to recognize the older format to be able to continue to operate.

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2ce1385..f422dcb 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -839,6 +839,95 @@ static int pseries_update_drconf_memory(struct 
of_reconfig_data *pr)
return rc;
 }
 
+static int pseries_rewrite_dynamic_memory_v2(void)
+{
+   unsigned long memblock_size;
+   struct device_node *dn;
+   struct property *prop, *prop_v2;
+   __be32 *p;
+   struct of_drconf_cell *lmbs;
+   u32 num_lmb_desc_sets, num_lmbs;
+   int i;
+
+   dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+   if (!dn)
+   return -EINVAL;
+
+   prop_v2 = of_find_property(dn, "ibm,dynamic-memory-v2", NULL);
+   if (!prop_v2)
+   return -EINVAL;
+
+   memblock_size = pseries_memory_block_size();
+   if (!memblock_size)
+   return -EINVAL;
+
+   /* The first int of the property is the number of lmb sets
+* described by the property.
+*/
+   p = (__be32 *)prop_v2->value;
+   num_lmb_desc_sets = be32_to_cpu(*p++);
+
+   /* Count the number of LMBs for generating the alternate format
+*/
+   for (i = 0, num_lmbs = 0; i < num_lmb_desc_sets; i++) {
+   struct of_drconf_cell_v2 drmem;
+
+   read_drconf_cell_v2(&drmem, (const __be32 **)&p);
+   num_lmbs += drmem.num_seq_lmbs;
+   }
+
+   /* Create an empty copy of the new 'ibm,dynamic-memory' property
+*/
+   {
+   prop = kzalloc(sizeof(*prop), GFP_KERNEL);
+   if (!prop)
+   return -ENOMEM;
+   prop->name = kstrdup("ibm,dynamic-memory", GFP_KERNEL);
+   prop->length = dyn_mem_v2_len(num_lmbs);
+   prop->value = kzalloc(prop->length, GFP_KERNEL);
+   }
+
+   /* Copy/expand the ibm,dynamic-memory-v2 format to produce the
+* ibm,dynamic-memory format.
+*/
+   p = (__be32 *)prop->value;
+   *p = cpu_to_be32(num_lmbs);
+   p++;
+   lmbs = (struct of_drconf_cell *)p;
+
+   p = (__be32 *)prop_v2->value;
+   p++;
+
+   for (i = 0; i < num_lmb_desc_sets; i++) {
+   struct of_drconf_cell_v2 drmem;
+   int j, k = 0;
+
+   read_drconf_cell_v2(&drmem, (const __be32 **)&p);
+
+   for (j = 0; j < drmem.num_seq_lmbs; j++) {
+   lmbs[k+j].base_addr = be64_to_cpu(drmem.base_addr);
+   lmbs[k+j].drc_index = be32_to_cpu(drmem.drc_index);
+   lmbs[k+j].reserved  = 0;
+   lmbs[k+j].aa_index  = be32_to_cpu(drmem.aa_index);
+   lmbs[k+j].flags = be32_to_cpu(drmem.flags);
+
+   drmem.base_addr += memblock_size;
+   drmem.drc_index++;
+   }
+
+   k += drmem.num_seq_lmbs;
+   }
+
+   of_remove_property(dn, prop_v2);
+
+   of_add_property(dn, prop);
+
+   /* And disable feature flag since the property has gone away */
+   powerpc_firmware_features &= ~FW_FEATURE_RPS_DM2;
+
+   return 0;
+}
+
 static int pseries_memory_notifier(struct notifier_block *nb,
   unsigned long action, void *data)
 {
@@ -866,6 +952,8 @@ static struct notifier_block pseries_mem_nb = {
 
 static int __init pseries_memory_hotplug_init(void)
 {
+   if (firmware_has_feature(FW_FEATURE_RPS_DM2))
+   pseries_rewrite_dynamic_memory_v2();
if (firmware_has_feature(FW_FEATURE_LPAR))
of_reconfig_notifier_register(&pseries_mem_nb);
 

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 3/8] powerpc/memory: Parse new memory property to initialize structures.

2016-07-25 Thread Michael Bringmann
powerpc/memory: Add parallel routines to parse the new
"ibm,dynamic-memory-v2" property when it is present, and then to
finish initialization of the relevant memory structures with the
operating system.  This code is shared between the boot-time
initialization functions and the runtime functions for memory
hotplug, so it needs to be able to handle both formats.

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 669a15e..18b4ee7 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -57,8 +57,10 @@
 EXPORT_SYMBOL(node_data);
 
 static int min_common_depth;
-static int n_mem_addr_cells, n_mem_size_cells;
+int n_mem_addr_cells;
+static int n_mem_size_cells;
 static int form1_affinity;
+EXPORT_SYMBOL(n_mem_addr_cells);
 
 #define MAX_DISTANCE_REF_POINTS 4
 static int distance_ref_points_depth;
@@ -405,9 +407,8 @@
 
*cellp = cp + 4;
 }
- 
- /*
- * Retrieve and validate the ibm,dynamic-memory property of the device tree.
+
+/*
  * Read the next memory block set entry from the ibm,dynamic-memory-v2 property
  * and return the information in the provided of_drconf_cell_v2 structure.
  */
@@ -425,30 +426,55 @@
 EXPORT_SYMBOL(read_drconf_cell_v2);
 
 /*
- * Retrieve and validate the ibm,dynamic-memory property of the device tree.
+ * Retrieve and validate the ibm,dynamic-memory[-v2] property of the
+ * device tree.
+ *
+ * The layout of the ibm,dynamic-memory property is a number N of memory
+ * block description list entries followed by N memory block description
+ * list entries.  Each memory block description list entry contains
+ * information as laid out in the of_drconf_cell struct above.
  *
- * The layout of the ibm,dynamic-memory property is a number N of memblock
- * list entries followed by N memblock list entries.  Each memblock list entry
- * contains information as laid out in the of_drconf_cell struct above.
+ * The layout of the ibm,dynamic-memory-v2 property is a number N of memory
+ * block set description list entries, followed by N memory block set
+ * description set entries.
  */
 static int of_get_drconf_memory(struct device_node *memory, const __be32 **dm)
 {
const __be32 *prop;
u32 len, entries;
 
-   prop = of_get_property(memory, "ibm,dynamic-memory", );
-   if (!prop || len < sizeof(unsigned int))
-   return 0;
+   if (firmware_has_feature(FW_FEATURE_RPS_DM2)) {
 
-   entries = of_read_number(prop++, 1);
+   prop = of_get_property(memory, "ibm,dynamic-memory-v2", );
+   if (!prop || len < sizeof(unsigned int))
+   return 0;
 
-   /* Now that we know the number of entries, revalidate the size
-* of the property read in to ensure we have everything
-*/
-   if (len < (entries * (n_mem_addr_cells + 4) + 1) * sizeof(unsigned int))
-   return 0;
+   entries = of_read_number(prop++, 1);
+
+   /* Now that we know the number of set entries, revalidate the
+* size of the property read in to ensure we have everything.
+*/
+   if (len < dyn_mem_v2_len(entries))
+   return 0;
+
+   *dm = prop;
+   } else {
+   prop = of_get_property(memory, "ibm,dynamic-memory", );
+   if (!prop || len < sizeof(unsigned int))
+   return 0;
+
+   entries = of_read_number(prop++, 1);
+
+   /* Now that we know the number of entries, revalidate the size
+* of the property read in to ensure we have everything
+*/
+   if (len < (entries * (n_mem_addr_cells + 4) + 1) *
+  sizeof(unsigned int))
+   return 0;
+
+   *dm = prop;
+   }
 
-   *dm = prop;
return entries;
 }
 
@@ -511,7 +537,7 @@
  * This is like of_node_to_nid_single() for memory represented in the
  * ibm,dynamic-reconfiguration-memory node.
  */
-static int of_drconf_to_nid_single(struct of_drconf_cell *drmem,
+static int of_drconf_to_nid_single(u32 drmem_flags, u32 drmem_aa_index,
   struct assoc_arrays *aa)
 {
int default_nid = 0;
@@ -519,16 +545,16 @@
int index;
 
if (min_common_depth > 0 && min_common_depth <= aa->array_sz &&
-   !(drmem->flags & DRCONF_MEM_AI_INVALID) &&
-   drmem->aa_index < aa->n_arrays) {
-   index = drmem->aa_index * aa->array_sz + min_common_depth - 1;
+   !(drmem_flags & DRCONF_MEM_AI_INVALID) &&
+   drmem_aa_index < aa->n_arrays) {
+   index = drmem_aa_index * aa->array_sz + min_common_depth - 1;
nid = of_read_number(&aa->arrays[index], 1);
 
if (nid == 0x || nid >= MAX_NUMNODES)
nid = default_nid;
 
if (nid > 0) {
-   
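The ibm,dynamic-memory[-v2] layouts that of_get_drconf_memory() validates can be walked with plain cell arithmetic. Below is a hedged userspace sketch of the v2 walk only; it assumes two address cells per base address and stores cells as host-order integers (the real property holds big-endian __be32 values read via of_read_number()/read_n_cells()).

```c
#include <stdint.h>
#include <assert.h>

#define N_MEM_ADDR_CELLS 2	/* assumption: 64-bit addresses, two cells */

/* Consume two 32-bit cells as one 64-bit value, like read_n_cells(). */
static uint64_t read_2_cells(const uint32_t **cellp)
{
	const uint32_t *p = *cellp;
	uint64_t v = ((uint64_t)p[0] << 32) | p[1];

	*cellp = p + 2;
	return v;
}

/* Walk a v2-style buffer: a count N of set entries, then N entries of
 * (num_seq_lmbs, base_addr, drc_index, aa_index, flags).  Returns the
 * total number of LMBs the property describes. */
static uint32_t count_lmbs_v2(const uint32_t *prop)
{
	uint32_t nsets = *prop++;
	uint32_t total = 0;
	uint32_t i;

	for (i = 0; i < nsets; i++) {
		uint32_t num_seq_lmbs = *prop++;

		(void)read_2_cells(&prop);	/* base_addr */
		prop += 3;			/* drc_index, aa_index, flags */
		total += num_seq_lmbs;
	}
	return total;
}
```

This is the same counting pass pseries_rewrite_dynamic_memory_v2() makes before sizing the replacement property.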

[PATCH 2/8] powerpc/memory: Parse new memory property to register blocks.

2016-07-25 Thread Michael Bringmann
powerpc/memory: Add parallel routines to parse the new
"ibm,dynamic-memory-v2" property when it is present, and then to
register the relevant memory blocks with the operating system.
This property format is intended to provide a more compact
representation of memory when communicating with the front end
processor, especially when describing vast amounts of RAM.

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..b9a1534 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -69,6 +69,8 @@ struct boot_param_header {
  * OF address retreival & translation
  */
 
+extern int n_mem_addr_cells;
+
 /* Parse the ibm,dma-window property of an OF node into the busno, phys and
  * size parameters.
  */
@@ -81,8 +83,9 @@ extern void of_instantiate_rtc(void);
 extern int of_get_ibm_chip_id(struct device_node *np);
 
 /* The of_drconf_cell struct defines the layout of the LMB array
- * specified in the device tree property
- * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory
+ * specified in the device tree properties,
+ * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory
+ * ibm,dynamic-reconfiguration-memory/ibm,dynamic-memory-v2
  */
 struct of_drconf_cell {
u64 base_addr;
@@ -92,9 +95,39 @@ struct of_drconf_cell {
u32 flags;
 };
 
-#define DRCONF_MEM_ASSIGNED0x0008
-#define DRCONF_MEM_AI_INVALID  0x0040
-#define DRCONF_MEM_RESERVED0x0080
+#define DRCONF_MEM_ASSIGNED0x0008
+#define DRCONF_MEM_AI_INVALID  0x0040
+#define DRCONF_MEM_RESERVED0x0080
+
+   /* It is important to note that this structure can not
+* be safely mapped onto the memory containing the
+* 'ibm,dynamic-memory-v2'.  This structure represents
+* the order of the fields stored, but compiler alignment
+* may insert extra bytes of padding between the fields
+* 'num_seq_lmbs' and 'base_addr'.
+*/
+struct of_drconf_cell_v2 {
+   u32 num_seq_lmbs;
+   u64 base_addr;
+   u32 drc_index;
+   u32 aa_index;
+   u32 flags;
+};
+
+
+static inline int dyn_mem_v2_len(int entries)
+{
+   int drconf_v2_cells = (n_mem_addr_cells + 4);
+   int drconf_v2_cells_len = (drconf_v2_cells * sizeof(unsigned int));
+   return (((entries) * drconf_v2_cells_len) +
+(1 * sizeof(unsigned int)));
+}
+
+extern void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem,
+   const __be32 **cellp);
+extern void read_one_drc_info(int **info, char **drc_type, char **drc_name,
+   unsigned long int *fdi_p, unsigned long int *nsl_p,
+   unsigned long int *si_p, unsigned long int *ldi_p);
 
 /*
  * There are two methods for telling firmware what our capabilities are.
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 669a15e..ad294ce 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -405,6 +405,24 @@ static void read_drconf_cell(struct of_drconf_cell *drmem, 
const __be32 **cellp)
 
*cellp = cp + 4;
 }
+ 
+ /*
+ * Retrieve and validate the ibm,dynamic-memory property of the device tree.
+ * Read the next memory block set entry from the ibm,dynamic-memory-v2 property
+ * and return the information in the provided of_drconf_cell_v2 structure.
+ */
+void read_drconf_cell_v2(struct of_drconf_cell_v2 *drmem, const __be32 **cellp)
+{
+   const __be32 *cp = (const __be32 *)*cellp;
+   drmem->num_seq_lmbs = be32_to_cpu(*cp++);
+   drmem->base_addr = read_n_cells(n_mem_addr_cells, );
+   drmem->drc_index = be32_to_cpu(*cp++);
+   drmem->aa_index = be32_to_cpu(*cp++);
+   drmem->flags = be32_to_cpu(*cp++);
+
+   *cellp = cp;
+}
+EXPORT_SYMBOL(read_drconf_cell_v2);
 
 /*
  * Retrieve and validate the ibm,dynamic-memory property of the device tree.
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 946e34f..a55bc1e 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -441,12 +442,12 @@ static int __init early_init_dt_scan_chosen_ppc(unsigned 
long node,
 
 #ifdef CONFIG_PPC_PSERIES
 /*
- * Interpret the ibm,dynamic-memory property in the
- * /ibm,dynamic-reconfiguration-memory node.
+ * Interpret the ibm,dynamic-memory property/ibm,dynamic-memory-v2
+ * in the /ibm,dynamic-reconfiguration-memory node.
  * This contains a list of memory blocks along with NUMA affinity
  * information.
  */
-static int __init early_init_dt_scan_drconf_memory(unsigned long node)
+static int __init early_init_dt_scan_drconf_memory_v1(unsigned long node)
 {
const __be32 *dm, *ls, *usm;
int l;
@@ -516,6 +517,105 @@ static int __init 
early_init_dt_scan_drconf_memory(unsigned long node)

[PATCH 1/8] powerpc/firmware: Add definitions for new firmware features.

2016-07-25 Thread Michael Bringmann
Firmware Features: Define new bit flags representing the presence of
new device tree properties "ibm,drc-info", and "ibm,dynamic-memory-v2".
These flags are used to tell the front end processor when the Linux
kernel supports the new properties, and by the front end processor to
tell the Linux kernel that the new properties are present in the device
tree.
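The negotiation described here reduces to bitmask bookkeeping on both sides. A minimal userspace model of the kernel half follows; the bit positions below are placeholders (the archive truncates the patch's actual ASM_CONST() values), and the helper is a simplified stand-in for firmware_has_feature().

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

/* Placeholder bit assignments, not the patch's real values. */
#define FW_FEATURE_RPS_DM2	(1ULL << 34)
#define FW_FEATURE_RPS_DRC_INFO	(1ULL << 35)

static uint64_t powerpc_firmware_features;

/* Simplified stand-in for firmware_has_feature(). */
static bool firmware_has_feature(uint64_t feature)
{
	return (powerpc_firmware_features & feature) != 0;
}
```

At boot the flags are populated from the option vector 5 bits firmware reports back; later code tests them, and the hotplug conversion patch clears FW_FEATURE_RPS_DM2 once the v2 property has been rewritten.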

Signed-off-by: Michael Bringmann 
---
diff --git a/arch/powerpc/include/asm/firmware.h 
b/arch/powerpc/include/asm/firmware.h
index b062924..a9d66d5 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -51,6 +51,8 @@
 #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x8000)
 #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001)
 #define FW_FEATURE_PRRNASM_CONST(0x0002)
+#define FW_FEATURE_RPS_DM2 ASM_CONST(0x0004)
+#define FW_FEATURE_RPS_DRC_INFOASM_CONST(0x0008)
 
 #ifndef __ASSEMBLY__
 
@@ -66,7 +68,8 @@ enum {
FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
-   FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
+   FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
+   FW_FEATURE_RPS_DM2 | FW_FEATURE_RPS_DRC_INFO,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
FW_FEATURE_POWERNV_ALWAYS = 0,
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..b9a1534 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -155,6 +203,8 @@ struct of_drconf_cell {
 #define OV5_PFO_HW_842 0x0E40  /* PFO Compression Accelerator */
 #define OV5_PFO_HW_ENCR0x0E20  /* PFO Encryption Accelerator */
 #define OV5_SUB_PROCESSORS 0x0F01  /* 1,2,or 4 Sub-Processors supported */
+#define OV5_RPS_DM20x1680  /* Redef Prop Structures: dyn-mem-v2 */
+#define OV5_RPS_DRC_INFO   0x1640  /* Redef Prop Structures: drc-info   */
 
 /* Option Vector 6: IBM PAPR hints */
 #define OV6_LINUX  0x02/* Linux is our OS */
diff --git a/arch/powerpc/platforms/pseries/firmware.c 
b/arch/powerpc/platforms/pseries/firmware.c
index 8c80588..00243ee 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -111,6 +111,8 @@ static __initdata struct vec5_fw_feature
 vec5_fw_features_table[] = {
{FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY},
{FW_FEATURE_PRRN,   OV5_PRRN},
+   {FW_FEATURE_RPS_DM2,OV5_RPS_DM2},
+   {FW_FEATURE_RPS_DRC_INFO,   OV5_RPS_DRC_INFO},
 };
 
 void __init fw_vec5_feature_init(const char *vec5, unsigned long len)


[PATCH 0/8] powerpc/devtree: Add support for 2 new DRC properties

2016-07-25 Thread Michael Bringmann
Several properties in the DRC device tree format are replaced by
more compact representations to allow, for example, the encoding
of vast amounts of memory and/or reduced duplication of information
in related data structures.

"ibm,drc-info": This property, when present, replaces the following
four properties: "ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains".  This property is defined for all
dynamically reconfigurable platform nodes.  The "ibm,drc-info" elements
are intended to provide a more compact representation, and reduce some
search overhead.

"ibm,dynamic-memory-v2": This property replaces the "ibm,dynamic-memory"
node representation within the "ibm,dynamic-reconfiguration-memory"
property provided by the BMC.  This element format is intended to provide
a more compact representation of memory, especially for systems with
massive amounts of RAM.  To simplify portability, this property is
converted to the "ibm,dynamic-memory" property during system boot.

"ibm,architecture.vec": Bit flags are added to this data structure
by the front end processor to inform the kernel as to whether to expect
the changes to one or both of the device tree structures "ibm,drc-info"
and "ibm,dynamic-memory-v2".

Signed-off-by: Michael Bringmann 

Michael Bringmann (8):
  powerpc/firmware: Add definitions for new firmware features.
  powerpc/memory: Parse new memory property to register blocks.
  powerpc/memory: Parse new memory property to initialize structures.
  pseries/hotplug init: Convert new DRC memory property for hotplug runtime
  pseries/drc-info: Search new DRC properties for CPU indexes
  hotplug/drc-info: Add code to search new devtree properties
  powerpc: Check arch.vec earlier during boot for memory features
  powerpc: Enable support for new DRC devtree properties



Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Rik van Riel
On Mon, 2016-07-25 at 12:16 -0700, Laura Abbott wrote:
> On 07/20/2016 01:27 PM, Kees Cook wrote:
> > Under CONFIG_HARDENED_USERCOPY, this adds object size checking to
> > the
> > SLUB allocator to catch any copies that may span objects. Includes
> > a
> > redzone handling fix discovered by Michael Ellerman.
> > 
> > Based on code from PaX and grsecurity.
> > 
> > Signed-off-by: Kees Cook 
> > Tested-by: Michael Ellerman 
> > ---
> >  init/Kconfig |  1 +
> >  mm/slub.c| 36 
> >  2 files changed, 37 insertions(+)
> > 
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 798c2020ee7c..1c4711819dfd 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1765,6 +1765,7 @@ config SLAB
> > 
> >  config SLUB
> >     bool "SLUB (Unqueued Allocator)"
> > +   select HAVE_HARDENED_USERCOPY_ALLOCATOR
> >     help
> >        SLUB is a slab allocator that minimizes cache line
> > usage
> >        instead of managing queues of cached objects (SLAB
> > approach).
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 825ff4505336..7dee3d9a5843 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t
> > flags, int node)
> >  EXPORT_SYMBOL(__kmalloc_node);
> >  #endif
> > 
> > +#ifdef CONFIG_HARDENED_USERCOPY
> > +/*
> > + * Rejects objects that are incorrectly sized.
> > + *
> > + * Returns NULL if check passes, otherwise const char * to name of
> > cache
> > + * to indicate an error.
> > + */
> > +const char *__check_heap_object(const void *ptr, unsigned long n,
> > +   struct page *page)
> > +{
> > +   struct kmem_cache *s;
> > +   unsigned long offset;
> > +   size_t object_size;
> > +
> > +   /* Find object and usable object size. */
> > +   s = page->slab_cache;
> > +   object_size = slab_ksize(s);
> > +
> > +   /* Find offset within object. */
> > +   offset = (ptr - page_address(page)) % s->size;
> > +
> > +   /* Adjust for redzone and reject if within the redzone. */
> > +   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
> > +   if (offset < s->red_left_pad)
> > +   return s->name;
> > +   offset -= s->red_left_pad;
> > +   }
> > +
> > +   /* Allow address range falling entirely within object
> > size. */
> > +   if (offset <= object_size && n <= object_size - offset)
> > +   return NULL;
> > +
> > +   return s->name;
> > +}
> > +#endif /* CONFIG_HARDENED_USERCOPY */
> > +
> 
> I compared this against what check_valid_pointer does for SLUB_DEBUG
> checking. I was hoping we could utilize that function to avoid
> duplication but a) __check_heap_object needs to allow accesses
> anywhere
> in the object, not just the beginning b) accessing page->objects
> is racy without the addition of locking in SLUB_DEBUG.
> 
> Still, the ptr < page_address(page) check from __check_heap_object
> would
> be good to add to avoid generating garbage large offsets and trying
> to
> infer C math.
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 7dee3d9..5370e4f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void
> *ptr, unsigned long n,
>  s = page->slab_cache;
>  object_size = slab_ksize(s);
>   
> +   if (ptr < page_address(page))
> +   return s->name;
> +
>  /* Find offset within object. */
>  offset = (ptr - page_address(page)) % s->size;
> 

I don't get it, isn't that already guaranteed because we
look for the page that ptr is in, before __check_heap_object
is called?

Specifically, in patch 3/12:

+   page = virt_to_head_page(ptr);
+
+   /* Check slab allocator for flags and size. */
+   if (PageSlab(page))
+   return __check_heap_object(ptr, n, page);

How can that generate a ptr that is not inside the page?

What am I overlooking?  And, should it be in the changelog or
a comment? :)

-- 

All Rights Reversed.
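For reference, the check under discussion is plain offset arithmetic once the slab pointer is known to be inside the page. A userspace model with made-up sizes (the stride, usable size and redzone pad here are illustrative, not real slab values):

```c
#include <stddef.h>
#include <stdbool.h>
#include <assert.h>

/* Model of __check_heap_object()'s bounds logic: returns true when an
 * n-byte access at the given offset into the slab stays entirely
 * within one object's usable size. */
static bool range_ok(size_t offset_in_slab, size_t n,
		     size_t stride,		/* s->size: object + metadata */
		     size_t object_size,	/* slab_ksize(s) */
		     size_t red_left_pad)	/* 0 when not redzoning */
{
	size_t offset = offset_in_slab % stride;

	/* Reject a pointer landing inside the left redzone. */
	if (offset < red_left_pad)
		return false;
	offset -= red_left_pad;

	/* Allow only a range falling entirely within the object. */
	return offset <= object_size && n <= object_size - offset;
}
```

With a 64-byte stride and 48 usable bytes, a 16-byte copy at offset 32 passes (32 + 16 <= 48), while the same copy at offset 40 would spill into the object's metadata and is rejected.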


Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Kees Cook
On Mon, Jul 25, 2016 at 12:16 PM, Laura Abbott  wrote:
> On 07/20/2016 01:27 PM, Kees Cook wrote:
>>
>> Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the
>> SLUB allocator to catch any copies that may span objects. Includes a
>> redzone handling fix discovered by Michael Ellerman.
>>
>> Based on code from PaX and grsecurity.
>>
>> Signed-off-by: Kees Cook 
>> Tested-by: Michael Ellerman 
>> ---
>>  init/Kconfig |  1 +
>>  mm/slub.c| 36 
>>  2 files changed, 37 insertions(+)
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 798c2020ee7c..1c4711819dfd 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1765,6 +1765,7 @@ config SLAB
>>
>>  config SLUB
>> bool "SLUB (Unqueued Allocator)"
>> +   select HAVE_HARDENED_USERCOPY_ALLOCATOR
>> help
>>SLUB is a slab allocator that minimizes cache line usage
>>instead of managing queues of cached objects (SLAB approach).
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 825ff4505336..7dee3d9a5843 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int
>> node)
>>  EXPORT_SYMBOL(__kmalloc_node);
>>  #endif
>>
>> +#ifdef CONFIG_HARDENED_USERCOPY
>> +/*
>> + * Rejects objects that are incorrectly sized.
>> + *
>> + * Returns NULL if check passes, otherwise const char * to name of cache
>> + * to indicate an error.
>> + */
>> +const char *__check_heap_object(const void *ptr, unsigned long n,
>> +   struct page *page)
>> +{
>> +   struct kmem_cache *s;
>> +   unsigned long offset;
>> +   size_t object_size;
>> +
>> +   /* Find object and usable object size. */
>> +   s = page->slab_cache;
>> +   object_size = slab_ksize(s);
>> +
>> +   /* Find offset within object. */
>> +   offset = (ptr - page_address(page)) % s->size;
>> +
>> +   /* Adjust for redzone and reject if within the redzone. */
>> +   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
>> +   if (offset < s->red_left_pad)
>> +   return s->name;
>> +   offset -= s->red_left_pad;
>> +   }
>> +
>> +   /* Allow address range falling entirely within object size. */
>> +   if (offset <= object_size && n <= object_size - offset)
>> +   return NULL;
>> +
>> +   return s->name;
>> +}
>> +#endif /* CONFIG_HARDENED_USERCOPY */
>> +
>
>
> I compared this against what check_valid_pointer does for SLUB_DEBUG
> checking. I was hoping we could utilize that function to avoid
> duplication but a) __check_heap_object needs to allow accesses anywhere
> in the object, not just the beginning b) accessing page->objects
> is racy without the addition of locking in SLUB_DEBUG.
>
> Still, the ptr < page_address(page) check from __check_heap_object would
> be good to add to avoid generating garbage large offsets and trying to
> infer C math.
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 7dee3d9..5370e4f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr,
> unsigned long n,
> s = page->slab_cache;
> object_size = slab_ksize(s);
>  +   if (ptr < page_address(page))
> +   return s->name;
> +
> /* Find offset within object. */
> offset = (ptr - page_address(page)) % s->size;
>
> With that, you can add
>
> Reviewed-by: Laura Abbott 

Cool, I'll add that.

Should I add your reviewed-by for this patch only or for the whole series?

Thanks!

-Kees

>
>>  static size_t __ksize(const void *object)
>>  {
>> struct page *page;
>>
>
> Thanks,
> Laura



-- 
Kees Cook
Chrome OS & Brillo Security

Re: [Patch v3 1/3] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2016-07-25 Thread Jason Cooper
Hi Zhao Qiang,

On Mon, Jul 25, 2016 at 04:59:54PM +0800, Zhao Qiang wrote:
> move the driver from drivers/soc/fsl/qe to drivers/irqchip,
> merge qe_ic.h and qe_ic.c into irq-qeic.c.
> 
> Signed-off-by: Zhao Qiang 
> ---
> Changes for v2:
>   - modify the subject and commit msg
> Changes for v3:
>   - merge .h file to .c, rename it with irq-qeic.c
> 
>  drivers/irqchip/Makefile   |   1 +
>  drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  82 +++-
>  drivers/soc/fsl/qe/Makefile|   2 +-
>  drivers/soc/fsl/qe/qe_ic.h | 103 
> -
>  4 files changed, 83 insertions(+), 105 deletions(-)
>  rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
>  delete mode 100644 drivers/soc/fsl/qe/qe_ic.h
> 
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 38853a1..cef999d 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -69,3 +69,4 @@ obj-$(CONFIG_PIC32_EVIC)+= irq-pic32-evic.o
>  obj-$(CONFIG_MVEBU_ODMI) += irq-mvebu-odmi.o
>  obj-$(CONFIG_LS_SCFG_MSI)+= irq-ls-scfg-msi.o
>  obj-$(CONFIG_EZNPS_GIC)  += irq-eznps.o
> +obj-$(CONFIG_QUICC_ENGINE)   += qe_ic.o

Did you test this? ;-)

> diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
> similarity index 85%
> rename from drivers/soc/fsl/qe/qe_ic.c
> rename to drivers/irqchip/irq-qeic.c
> index ec2ca86..1f91225 100644
> --- a/drivers/soc/fsl/qe/qe_ic.c
> +++ b/drivers/irqchip/irq-qeic.c
> @@ -30,7 +30,87 @@
>  #include 
>  #include 
>  
> -#include "qe_ic.h"
> +#define NR_QE_IC_INTS    64
> +
> +/* QE IC registers offset */
> +#define QEIC_CICR    0x00
> +#define QEIC_CIVEC   0x04
> +#define QEIC_CRIPNR  0x08
> +#define QEIC_CIPNR   0x0c
> +#define QEIC_CIPXCC  0x10
> +#define QEIC_CIPYCC  0x14
> +#define QEIC_CIPWCC  0x18
> +#define QEIC_CIPZCC  0x1c
> +#define QEIC_CIMR    0x20
> +#define QEIC_CRIMR   0x24
> +#define QEIC_CICNR   0x28
> +#define QEIC_CIPRTA  0x30
> +#define QEIC_CIPRTB  0x34
> +#define QEIC_CRICR   0x3c
> +#define QEIC_CHIVEC  0x60
> +
> +/* Interrupt priority registers */
> +#define CIPCC_SHIFT_PRI0 29
> +#define CIPCC_SHIFT_PRI1 26
> +#define CIPCC_SHIFT_PRI2 23
> +#define CIPCC_SHIFT_PRI3 20
> +#define CIPCC_SHIFT_PRI4 13
> +#define CIPCC_SHIFT_PRI5 10
> +#define CIPCC_SHIFT_PRI6 7
> +#define CIPCC_SHIFT_PRI7 4
> +
> +/* CICR priority modes */
> +#define CICR_GWCC        0x0004
> +#define CICR_GXCC        0x0002
> +#define CICR_GYCC        0x0001
> +#define CICR_GZCC        0x0008
> +#define CICR_GRTA        0x0020
> +#define CICR_GRTB        0x0040
> +#define CICR_HPIT_SHIFT  8
> +#define CICR_HPIT_MASK   0x0300
> +#define CICR_HP_SHIFT    24
> +#define CICR_HP_MASK 0x3f00
> +
> +/* CICNR */
> +#define CICNR_WCC1T_SHIFT    20
> +#define CICNR_ZCC1T_SHIFT    28
> +#define CICNR_YCC1T_SHIFT    12
> +#define CICNR_XCC1T_SHIFT    4
> +
> +/* CRICR */
> +#define CRICR_RTA1T_SHIFT    20
> +#define CRICR_RTB1T_SHIFT    28
> +
> +/* Signal indicator */
> +#define SIGNAL_MASK  3
> +#define SIGNAL_HIGH  2
> +#define SIGNAL_LOW   0
> +
> +struct qe_ic {
> + /* Control registers offset */
> + volatile u32 __iomem *regs;
> +
> + /* The remapper for this QEIC */
> + struct irq_domain *irqhost;
> +
> + /* The "linux" controller struct */
> + struct irq_chip hc_irq;
> +
> + /* VIRQ numbers of QE high/low irqs */
> + unsigned int virq_high;
> + unsigned int virq_low;
> +};
> +
> +/*
> + * QE interrupt controller internal structure
> + */
> +struct qe_ic_info {
> + u32 mask; /* location of this source at the QIMR register. */
> + u32 mask_reg; /* Mask register offset */
> + u8  pri_code; /* for grouped interrupts sources - the interrupt
> +  code as appears at the group priority register */
> + u32 pri_reg;  /* Group priority register offset */
> +};

Please, no tail comments.  Refer to KernelDoc.
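As an illustration of the kernel-doc style being asked for, the same structure could be documented like this (a sketch only, not part of the patch):

```c
#include <stdint.h>
#include <assert.h>

/**
 * struct qe_ic_info - QE interrupt controller per-source data
 * @mask:     location of this source in the QIMR register
 * @mask_reg: mask register offset
 * @pri_code: for grouped interrupt sources, the interrupt code as it
 *            appears in the group priority register
 * @pri_reg:  group priority register offset
 */
struct qe_ic_info {
	uint32_t mask;
	uint32_t mask_reg;
	uint8_t  pri_code;
	uint32_t pri_reg;
};
```

The tail comments then disappear, and tools like scripts/kernel-doc can extract the field descriptions.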

>  
>  static DEFINE_RAW_SPINLOCK(qe_ic_lock);
>  
> diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
> index 2031d38..51e4726 100644
> --- a/drivers/soc/fsl/qe/Makefile
> +++ b/drivers/soc/fsl/qe/Makefile
> @@ -1,7 +1,7 @@
>  #
>  # Makefile for the linux ppc-specific parts of QE
>  #
> -obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
> +obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_io.o
>  obj-$(CONFIG_CPM)+= qe_common.o
>  obj-$(CONFIG_UCC)+= ucc.o
>  obj-$(CONFIG_UCC_SLOW)   += ucc_slow.o
> diff --git a/drivers/soc/fsl/qe/qe_ic.h b/drivers/soc/fsl/qe/qe_ic.h
> deleted 

Re: [PATCH v4 12/12] mm: SLUB hardened usercopy support

2016-07-25 Thread Laura Abbott

On 07/20/2016 01:27 PM, Kees Cook wrote:

Under CONFIG_HARDENED_USERCOPY, this adds object size checking to the
SLUB allocator to catch any copies that may span objects. Includes a
redzone handling fix discovered by Michael Ellerman.

Based on code from PaX and grsecurity.

Signed-off-by: Kees Cook 
Tested-by: Michael Ellerman 
---
 init/Kconfig |  1 +
 mm/slub.c| 36 
 2 files changed, 37 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 798c2020ee7c..1c4711819dfd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1765,6 +1765,7 @@ config SLAB

 config SLUB
bool "SLUB (Unqueued Allocator)"
+   select HAVE_HARDENED_USERCOPY_ALLOCATOR
help
   SLUB is a slab allocator that minimizes cache line usage
   instead of managing queues of cached objects (SLAB approach).
diff --git a/mm/slub.c b/mm/slub.c
index 825ff4505336..7dee3d9a5843 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3614,6 +3614,42 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)
 EXPORT_SYMBOL(__kmalloc_node);
 #endif

+#ifdef CONFIG_HARDENED_USERCOPY
+/*
+ * Rejects objects that are incorrectly sized.
+ *
+ * Returns NULL if check passes, otherwise const char * to name of cache
+ * to indicate an error.
+ */
+const char *__check_heap_object(const void *ptr, unsigned long n,
+   struct page *page)
+{
+   struct kmem_cache *s;
+   unsigned long offset;
+   size_t object_size;
+
+   /* Find object and usable object size. */
+   s = page->slab_cache;
+   object_size = slab_ksize(s);
+
+   /* Find offset within object. */
+   offset = (ptr - page_address(page)) % s->size;
+
+   /* Adjust for redzone and reject if within the redzone. */
+   if (kmem_cache_debug(s) && s->flags & SLAB_RED_ZONE) {
+   if (offset < s->red_left_pad)
+   return s->name;
+   offset -= s->red_left_pad;
+   }
+
+   /* Allow address range falling entirely within object size. */
+   if (offset <= object_size && n <= object_size - offset)
+   return NULL;
+
+   return s->name;
+}
+#endif /* CONFIG_HARDENED_USERCOPY */
+


I compared this against what check_valid_pointer does for SLUB_DEBUG
checking. I was hoping we could utilize that function to avoid
duplication but a) __check_heap_object needs to allow accesses anywhere
in the object, not just the beginning b) accessing page->objects
is racy without the addition of locking in SLUB_DEBUG.

Still, the ptr < page_address(page) check from __check_heap_object would
be good to add to avoid generating garbage large offsets and trying to
infer C math.

diff --git a/mm/slub.c b/mm/slub.c
index 7dee3d9..5370e4f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3632,6 +3632,9 @@ const char *__check_heap_object(const void *ptr, unsigned 
long n,
s = page->slab_cache;
object_size = slab_ksize(s);
 
+   if (ptr < page_address(page))

+   return s->name;
+
/* Find offset within object. */
offset = (ptr - page_address(page)) % s->size;
 


With that, you can add

Reviewed-by: Laura Abbott 


 static size_t __ksize(const void *object)
 {
struct page *page;



Thanks,
Laura

RE: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2016-07-25 Thread Qiang Zhao
On Thu, Jul 07, 2016 at 10:25 PM, Jason Cooper  wrote:
> -Original Message-
> From: Jason Cooper [mailto:ja...@lakedaemon.net]
> Sent: Thursday, July 07, 2016 10:25 PM
> To: Qiang Zhao 
> Cc: o...@buserror.net; t...@linutronix.de; marc.zyng...@arm.com; linuxppc-
> d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Xiaobo Xie
> 
> Subject: Re: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe
> 
> Hi Zhao Qiang,
> 
> On Thu, Jul 07, 2016 at 09:23:55AM +0800, Zhao Qiang wrote:
> > The driver stays the same.
> >
> > Signed-off-by: Zhao Qiang 
> > ---
> > Changes for v2:
> > - modify the subject and commit msg
> >
> >  drivers/irqchip/Makefile| 1 +
> >  drivers/{soc/fsl/qe => irqchip}/qe_ic.c | 0
> >  drivers/{soc/fsl/qe => irqchip}/qe_ic.h | 0
> >  drivers/soc/fsl/qe/Makefile | 2 +-
> >  4 files changed, 2 insertions(+), 1 deletion(-)
> >  rename drivers/{soc/fsl/qe => irqchip}/qe_ic.c (100%)
> >  rename drivers/{soc/fsl/qe => irqchip}/qe_ic.h (100%)
> 
> Please merge the include file into the C file and rename to follow the naming
> convention in drivers/irqchip/.  e.g. irq-qeic.c or irq-qe_ic.c.
> 
> Once you have that, please resend the entire series with this as the first 
> patch.

Sorry, I have no idea about the "include file", could you explain which file you meant?

Thank you!
-Zhao Qiang

Re: [PATCH] powernv/pci: Add PHB register dump debugfs handle

2016-07-25 Thread Tyrel Datwyler
On 07/21/2016 11:36 PM, Gavin Shan wrote:
> On Fri, Jul 22, 2016 at 03:23:36PM +1000, Russell Currey wrote:
>> On EEH events the kernel will print a dump of relevant registers.
>> If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
>> doesn't have EEH support, etc) this information isn't readily available.
>>
>> Add a new debugfs handler to trigger a PHB register dump, so that this
>> information can be made available on demand.
>>
>> Signed-off-by: Russell Currey 
> 
> Reviewed-by: Gavin Shan 
> 
>> ---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 35 +++
>> 1 file changed, 35 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 891fc4a..ada2f3c 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -3018,6 +3018,38 @@ static void pnv_ioda_setup_pe_seg(struct pnv_ioda_pe *pe)
>>  }
>> }
>>
>> +#ifdef CONFIG_DEBUG_FS
>> +static ssize_t pnv_pci_debug_write(struct file *filp,
>> +   const char __user *user_buf,
>> +   size_t count, loff_t *ppos)
>> +{
>> +struct pci_controller *hose = filp->private_data;
>> +struct pnv_phb *phb;
>> +int ret = 0;
> 
> Needn't initialize @ret in advance. The code might be simpler, but it's
> only a personal preference:

I believe it's actually preferred that it not be initialized in advance,
so that the tooling can warn you about conditional code paths where you
may have forgotten to set a value. Or, as Gavin suggests, explicitly
use error values in the return statements.

-Tyrel

> 
>   struct pci_controller *hose = filp->private_data;
>   struct pnv_phb *phb = hose ? hose->private_data : NULL;
> 
>   if (!phb)
>   return -ENODEV;
> 
>> +
>> +if (!hose)
>> +return -EFAULT;
>> +
>> +phb = hose->private_data;
>> +if (!phb)
>> +return -EFAULT;
>> +
>> +ret = opal_pci_get_phb_diag_data2(phb->opal_id, phb->diag.blob,
>> +  PNV_PCI_DIAG_BUF_SIZE);
>> +
>> +if (!ret)
>> +pnv_pci_dump_phb_diag_data(phb->hose, phb->diag.blob);
>> +
>> +return ret < 0 ? ret : count;
> 
> return ret == OPAL_SUCCESS ? count : -EIO;
> 
>> +}
>> +
>> +static const struct file_operations pnv_pci_debug_ops = {
>> +.open   = simple_open,
>> +.llseek = no_llseek,
>> +.write  = pnv_pci_debug_write,
> 
> It might be reasonable to dump the diag-data on read if it is trying
> to do it on write.
> 
>> +};
>> +#endif /* CONFIG_DEBUG_FS */
>> +
>> static void pnv_pci_ioda_create_dbgfs(void)
>> {
>> #ifdef CONFIG_DEBUG_FS
>> @@ -3036,6 +3068,9 @@ static void pnv_pci_ioda_create_dbgfs(void)
>>  if (!phb->dbgfs)
>>  pr_warning("%s: Error on creating debugfs on PHB#%x\n",
>>  __func__, hose->global_number);
>> +
>> +debugfs_create_file("regdump", 0200, phb->dbgfs, hose,
>> +&pnv_pci_debug_ops);
> 
> "diag-data" might be a more indicative name, or a better one you can think of :)
> 
> Thanks,
> Gavin
> 
>>  }
>> #endif /* CONFIG_DEBUG_FS */
>> }
>> -- 
>> 2.9.0
>>
> 

Re: [PATCH v4 00/12] mm: Hardened usercopy

2016-07-25 Thread Kees Cook
On Fri, Jul 22, 2016 at 5:36 PM, Laura Abbott  wrote:
> On 07/20/2016 01:26 PM, Kees Cook wrote:
>>
>> Hi,
>>
>> [This is now in my kspp -next tree, though I'd really love to add some
>> additional explicit Tested-bys, Reviewed-bys, or Acked-bys. If you've
>> looked through any part of this or have done any testing, please consider
>> sending an email with your "*-by:" line. :)]
>>
>> This is a start of the mainline port of PAX_USERCOPY[1]. After writing
>> tests (now in lkdtm in -next) for Casey's earlier port[2], I kept tweaking
>> things further and further until I ended up with a whole new patch series.
>> To that end, I took Rik, Laura, and other people's feedback along with
>> additional changes and clean-ups.
>>
>> Based on my understanding, PAX_USERCOPY was designed to catch a
>> few classes of flaws (mainly bad bounds checking) around the use of
>> copy_to_user()/copy_from_user(). These changes don't touch get_user() and
>> put_user(), since these operate on constant sized lengths, and tend to be
>> much less vulnerable. There are effectively three distinct protections in
>> the whole series, each of which I've given a separate CONFIG, though this
>> patch set is only the first of the three intended protections. (Generally
>> speaking, PAX_USERCOPY covers what I'm calling CONFIG_HARDENED_USERCOPY
>> (this) and CONFIG_HARDENED_USERCOPY_WHITELIST (future), and
>> PAX_USERCOPY_SLABS covers CONFIG_HARDENED_USERCOPY_SPLIT_KMALLOC
>> (future).)
>>
>> This series, which adds CONFIG_HARDENED_USERCOPY, checks that objects
>> being copied to/from userspace meet certain criteria:
>> - if address is a heap object, the size must not exceed the object's
>>   allocated size. (This will catch all kinds of heap overflow flaws.)
>> - if address range is in the current process stack, it must be within
>>   a valid stack frame (if such checking is possible) or at least entirely
>>   within the current process's stack. (This could catch large lengths that
>>   would have extended beyond the current process stack, or overflows if
>>   their length extends back into the original stack.)
>> - if the address range is part of kernel data, rodata, or bss, allow it.
>> - if address range is page-allocated, that it doesn't span multiple
>>   allocations (excepting Reserved and CMA pages).
>> - if address is within the kernel text, reject it.
>> - everything else is accepted
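The heap-object criterion above can be modeled as a rough userspace sketch (this is not the kernel API; obj_base and obj_size stand in for what the slab allocator would report for the containing object):

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the first criterion: a copy touching a heap
 * object must lie entirely within the object's allocated size.
 * Returns NULL on success, or a short reason string on rejection
 * (loosely mirroring how __check_heap_object reports a cache name). */
static const char *check_heap_span(uintptr_t ptr, size_t n,
                                   uintptr_t obj_base, size_t obj_size)
{
	if (ptr < obj_base)
		return "starts below object";
	if (ptr - obj_base >= obj_size)
		return "starts past object";
	if (n > obj_size - (ptr - obj_base))
		return "overflows object";
	return NULL;
}
```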
>>
>> The patches in the series are:
>> - Support for examination of CMA page types:
>> 1- mm: Add is_migrate_cma_page
>> - Support for arch-specific stack frame checking (which will likely be
>>   replaced in the future by Josh's more comprehensive unwinder):
>> 2- mm: Implement stack frame object validation
>> - The core copy_to/from_user() checks, without the slab object checks:
>> 3- mm: Hardened usercopy
>> - Per-arch enablement of the protection:
>> 4- x86/uaccess: Enable hardened usercopy
>> 5- ARM: uaccess: Enable hardened usercopy
>> 6- arm64/uaccess: Enable hardened usercopy
>> 7- ia64/uaccess: Enable hardened usercopy
>> 8- powerpc/uaccess: Enable hardened usercopy
>> 9- sparc/uaccess: Enable hardened usercopy
>>10- s390/uaccess: Enable hardened usercopy
>> - The heap allocator implementation of object size checking:
>>11- mm: SLAB hardened usercopy support
>>12- mm: SLUB hardened usercopy support
>>
>> Some notes:
>>
>> - This is expected to apply on top of -next which contains fixes for the
>>   position of _etext on both arm and arm64, though it has some conflicts
>>   with KASAN that should be trivial to fix up. Also in -next are the
>>   tests for this protection (in lkdtm), prefixed with USERCOPY_.
>>
>> - I couldn't detect a measurable performance change with these features
>>   enabled. Kernel build times were unchanged, hackbench was unchanged,
>>   etc. I think we could flip this to "on by default" at some point, but
>>   for now, I'm leaving it off until I can get some more definitive
>>   measurements. I would love if someone with greater familiarity with
>>   perf could give this a spin and report results.
>>
>> - The SLOB support extracted from grsecurity seems entirely broken. I
>>   have no idea what's going on there, I spent my time testing SLAB and
>>   SLUB. Having someone else look at SLOB would be nice, but this series
>>   doesn't depend on it.
>>
>> Additional features that would be nice, but aren't blocking this series:
>>
>> - Needs more architecture support for stack frame checking (only x86 now,
>>   but it seems Josh will have a good solution for this soon).
>>
>>
>> Thanks!
>>
>> -Kees
>>
>> [1] https://grsecurity.net/download.php "grsecurity - test kernel patch"
>> [2] http://www.openwall.com/lists/kernel-hardening/2016/05/19/5
>>
>> v4:
>> - handle CMA pages, labbott
>> - update stack checker comments, labbott
>> - check for vmalloc addresses, labbott
>> - deal with KASAN in -next changing arm64 

[PATCH v3 2/2] powerpc/pseries: Implement indexed-count hotplug memory remove

2016-07-25 Thread Sahil Mehta
Indexed-count remove for memory hotplug guarantees that a contiguous
block of <count> lmbs beginning at a specified <drc index> will be
unassigned (NOT that <count> lmbs will be removed). Because of Qemu's
per-DIMM memory management, the removal of a contiguous block of memory
currently requires a series of individual calls. Indexed-count remove
reduces this series into a single call.

Signed-off-by: Sahil Mehta 
---
v2: -use u32s drc_index and count instead of u32 ic[]
 in dlpar_memory
v3: -add logic to handle invalid drc_index input

 arch/powerpc/platforms/pseries/hotplug-memory.c |   90 +++
 1 file changed, 90 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2d4ceb3..dd5eb38 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -503,6 +503,92 @@ static int dlpar_memory_remove_by_index(u32 drc_index, struct property *prop)
return rc;
 }

+static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index,
+struct property *prop)
+{
+   struct of_drconf_cell *lmbs;
+   u32 num_lmbs, *p;
+   int i, rc, start_lmb_found;
+   int lmbs_available = 0, start_index = 0, end_index;
+
+   pr_info("Attempting to hot-remove %u LMB(s) at %x\n",
+   lmbs_to_remove, drc_index);
+
+   if (lmbs_to_remove == 0)
+   return -EINVAL;
+
+   p = prop->value;
+   num_lmbs = *p++;
+   lmbs = (struct of_drconf_cell *)p;
+   start_lmb_found = 0;
+
+   /* Navigate to drc_index */
+   while (start_index < num_lmbs) {
+   if (lmbs[start_index].drc_index == drc_index) {
+   start_lmb_found = 1;
+   break;
+   }
+
+   start_index++;
+   }
+
+   if (!start_lmb_found)
+   return -EINVAL;
+
+   end_index = start_index + lmbs_to_remove;
+
+   /* Validate that there are enough LMBs to satisfy the request */
+   for (i = start_index; i < end_index; i++) {
+   if (lmbs[i].flags & DRCONF_MEM_RESERVED)
+   break;
+
+   lmbs_available++;
+   }
+
+   if (lmbs_available < lmbs_to_remove)
+   return -EINVAL;
+
+   for (i = start_index; i < end_index; i++) {
+   if (!(lmbs[i].flags & DRCONF_MEM_ASSIGNED))
+   continue;
+
+   rc = dlpar_remove_lmb(&lmbs[i]);
+   if (rc)
+   break;
+
+   lmbs[i].reserved = 1;
+   }
+
+   if (rc) {
+   pr_err("Memory indexed-count-remove failed, adding any removed LMBs\n");
+
+   for (i = start_index; i < end_index; i++) {
+   if (!lmbs[i].reserved)
+   continue;
+
+   rc = dlpar_add_lmb(&lmbs[i]);
+   if (rc)
+   pr_err("Failed to add LMB, drc index %x\n",
+  be32_to_cpu(lmbs[i].drc_index));
+
+   lmbs[i].reserved = 0;
+   }
+   rc = -EINVAL;
+   } else {
+   for (i = start_index; i < end_index; i++) {
+   if (!lmbs[i].reserved)
+   continue;
+
+   pr_info("Memory at %llx (drc index %x) was hot-removed\n",
+   lmbs[i].base_addr, lmbs[i].drc_index);
+
+   lmbs[i].reserved = 0;
+   }
+   }
+
+   return rc;
+}
+
 #else
 static inline int pseries_remove_memblock(unsigned long base,
  unsigned int memblock_size)
@@ -829,6 +915,10 @@ int dlpar_memory(struct pseries_hp_errorlog *hp_elog)
} else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
drc_index = hp_elog->_drc_u.drc_index;
rc = dlpar_memory_remove_by_index(drc_index, prop);
+   } else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_IC) {
+   count = hp_elog->_drc_u.indexed_count[0];
+   drc_index = hp_elog->_drc_u.indexed_count[1];
+   rc = dlpar_memory_remove_by_ic(count, drc_index, prop);
} else {
rc = -EINVAL;
}


[PATCH v3 1/2] powerpc/pseries: Implement indexed-count hotplug memory add

2016-07-25 Thread Sahil Mehta
Indexed-count add for memory hotplug guarantees that a contiguous block
of <count> lmbs beginning at a specified <drc index> will be assigned
(NOT that <count> lmbs will be added). Because of Qemu's per-DIMM memory
management, the addition of a contiguous block of memory currently
requires a series of individual calls. Indexed-count add reduces this
series into a single call.

Signed-off-by: Sahil Mehta 
---
v2: -remove potential memory leak when parsing command
-use u32s drc_index and count instead of u32 ic[]
 in dlpar_memory
v3: -add logic to handle invalid drc_index input
-update indexed-count trigger to follow naming convention
-update dlpar_memory to follow kernel if-else style

 arch/powerpc/include/asm/rtas.h |2
 arch/powerpc/platforms/pseries/dlpar.c  |   34 ++-
 arch/powerpc/platforms/pseries/hotplug-memory.c |  110 +--
 3 files changed, 134 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 51400ba..088ea75 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -307,6 +307,7 @@ struct pseries_hp_errorlog {
union {
__be32  drc_index;
__be32  drc_count;
+   __be32  indexed_count[2];
chardrc_name[1];
} _drc_u;
 };
@@ -322,6 +323,7 @@ struct pseries_hp_errorlog {
 #define PSERIES_HP_ELOG_ID_DRC_NAME1
 #define PSERIES_HP_ELOG_ID_DRC_INDEX   2
 #define PSERIES_HP_ELOG_ID_DRC_COUNT   3
+#define PSERIES_HP_ELOG_ID_DRC_IC  4

 struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log,
  uint16_t section_id);
diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 2b93ae8..6dbd13c 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -345,11 +345,17 @@ static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog)
switch (hp_elog->id_type) {
case PSERIES_HP_ELOG_ID_DRC_COUNT:
hp_elog->_drc_u.drc_count =
-   be32_to_cpu(hp_elog->_drc_u.drc_count);
+   be32_to_cpu(hp_elog->_drc_u.drc_count);
break;
case PSERIES_HP_ELOG_ID_DRC_INDEX:
hp_elog->_drc_u.drc_index =
-   be32_to_cpu(hp_elog->_drc_u.drc_index);
+   be32_to_cpu(hp_elog->_drc_u.drc_index);
+   break;
+   case PSERIES_HP_ELOG_ID_DRC_IC:
+   hp_elog->_drc_u.indexed_count[0] =
+   be32_to_cpu(hp_elog->_drc_u.indexed_count[0]);
+   hp_elog->_drc_u.indexed_count[1] =
+   be32_to_cpu(hp_elog->_drc_u.indexed_count[1]);
}

switch (hp_elog->resource) {
@@ -409,7 +415,29 @@ static ssize_t dlpar_store(struct class *class, struct class_attribute *attr,
goto dlpar_store_out;
}

-   if (!strncmp(arg, "index", 5)) {
+   if (!strncmp(arg, "indexed-count", 13)) {
+   u32 index, count;
+   char *cstr, *istr;
+
+   hp_elog->id_type = PSERIES_HP_ELOG_ID_DRC_IC;
+   arg += strlen("indexed-count ");
+
+   cstr = kstrdup(arg, GFP_KERNEL);
+   istr = strchr(cstr, ' ');
+   *istr++ = '\0';
+
+   if (kstrtou32(cstr, 0, &count) || kstrtou32(istr, 0, &index)) {
+   rc = -EINVAL;
+   pr_err("Invalid index or count : \"%s\"\n", buf);
+   kfree(cstr);
+   goto dlpar_store_out;
+   }
+
+   kfree(cstr);
+
+   hp_elog->_drc_u.indexed_count[0] = cpu_to_be32(count);
+   hp_elog->_drc_u.indexed_count[1] = cpu_to_be32(index);
+   } else if (!strncmp(arg, "index", 5)) {
u32 index;

hp_elog->id_type = PSERIES_HP_ELOG_ID_DRC_INDEX;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2ce1385..2d4ceb3 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -701,6 +701,89 @@ static int dlpar_memory_add_by_index(u32 drc_index, struct property *prop)
return rc;
 }

+static int dlpar_memory_add_by_ic(u32 lmbs_to_add, u32 drc_index,
+ struct property *prop)
+{
+   struct of_drconf_cell *lmbs;
+   u32 num_lmbs, *p;
+   int i, rc, start_lmb_found;
+   int lmbs_available = 0, start_index = 0, end_index;
+
+   pr_info("Attempting to hot-add %u LMB(s) at index %x\n",
+   lmbs_to_add, drc_index);
+
+   if (lmbs_to_add == 0)
+   return -EINVAL;
+
+   p = prop->value;
+   

[PATCH v3 0/2] powerpc/pseries: Implement indexed-count hotplug memory management

2016-07-25 Thread Sahil Mehta
Indexed-count memory management allows addition and removal of contiguous
lmb blocks with a single command. When compared to the series of calls
previously required to manage contiguous blocks, indexed-count decreases
command frequency and reduces risk of buffer overflow.

Changes in v2:
--
-[PATCH 1/2]:   -remove potential memory leak when parsing command
-use u32s drc_index and count instead of u32 ic[]
 in dlpar_memory
-[PATCH 2/2]:   -use u32s drc_index and count instead of u32 ic[]
 in dlpar_memory

Changes in v3:
--
-[PATCH 1/2]:   -add logic to handle invalid drc_index input
-update indexed-count trigger to follow naming convention
-update dlpar_memory to follow kernel if-else style
-[PATCH 2/2]:   -add logic to handle invalid drc_index input

-Sahil Mehta
---
 include/asm/rtas.h |2
 platforms/pseries/dlpar.c  |   34 +-
 platforms/pseries/hotplug-memory.c |  200 +++--
 3 files changed, 224 insertions(+), 12 deletions(-)


[RFC PATCH 9/9] powerpc: rewrite local_t using soft_irq

2016-07-25 Thread Madhavan Srinivasan
Local atomic operations are fast and highly reentrant per-CPU counters,
used for per-CPU variable updates. Local atomic operations only
guarantee variable modification atomicity wrt the CPU which owns the
data, and they need to be executed in a preemption-safe way.

Here is the design of this patch. Since local_* operations only need to
be atomic with respect to interrupts (IIUC), we have two options: either
replay the "op" if interrupted, or replay the interrupt after the "op".
The initial patchset posted implemented local_* operations based on CR5,
which replays the "op". That patchset had issues when rewinding the
address pointer from an array, which made the slow path really slow. And
since the CR5-based implementation proposed using __ex_table to find the
rewind address, it raised concerns about the size of __ex_table and
vmlinux.

https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html

This patch instead uses Benjamin Herrenschmidt's suggestion of
arch_local_irq_disable_var() to soft-disable interrupts (including
PMIs). After finishing the "op", arch_local_irq_restore() is called, and
any interrupts that occurred in the meantime are replayed.

The patch rewrites the current local_* functions to use
arch_local_irq_disable. The base flow for each function is:

 {
arch_local_irq_disable_var(2)
load
..
store
arch_local_irq_restore()
 }
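The flow above can be modeled in plain C as a soft-mask flag plus deferred replay. This is only a userspace sketch of the idea with illustrative names; the real code manipulates paca->soft_enabled and paca->irq_happened:

```c
/* Userspace sketch: "disabling" just sets a flag; an event arriving
 * while the flag is set is recorded rather than handled, and replayed
 * on restore. All names are illustrative. */
static int soft_disabled;	/* models paca->soft_enabled */
static int irq_pending;		/* models paca->irq_happened */
static int irq_handled;
static long counter;		/* models the local_t value */

static void irq_arrives(void)	/* stands in for an interrupt firing */
{
	if (soft_disabled)
		irq_pending = 1;	/* masked: just note it and return */
	else
		irq_handled++;
}

static void local_add_sketch(long i)
{
	soft_disabled = 1;		/* arch_local_irq_disable_var(2) */
	counter += i;			/* load; add; store */
	soft_disabled = 0;		/* arch_local_irq_restore()... */
	if (irq_pending) {		/* ...which replays what was masked */
		irq_pending = 0;
		irq_handled++;
	}
}
```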

Currently only asm/local.h has been rewritten, and the change has been
tested only on PPC64 (a pSeries guest).

The reason for this approach is that the l[w/d]arx/st[w/d]cx.
instruction pair currently used for local_* operations is heavy on
cycle count, and it does not have a local variant. To see whether the
new implementation helps, a modified version of Rusty's benchmark code
was run on local_t.

https://lkml.org/lkml/2008/12/16/450

Modifications to Rusty's benchmark code:
 - Executed only local_t test

Here are the values with the patch.

Time in ns per iteration

Local_t Without Patch   With Patch

_inc28  8
_add28  8
_read   3   3
_add_return 28  7

Tested the patch in a
 - pSeries LPAR (with perf record)

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/local.h | 91 +++-
 1 file changed, 63 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
index b8da91363864..afd3dabd92cb 100644
--- a/arch/powerpc/include/asm/local.h
+++ b/arch/powerpc/include/asm/local.h
@@ -14,24 +14,50 @@ typedef struct
 #define local_read(l)  atomic_long_read(&(l)->a)
 #define local_set(l,i) atomic_long_set(&(l)->a, (i))
 
-#define local_add(i,l) atomic_long_add((i),(&(l)->a))
-#define local_sub(i,l) atomic_long_sub((i),(&(l)->a))
-#define local_inc(l)   atomic_long_inc(&(l)->a)
-#define local_dec(l)   atomic_long_dec(&(l)->a)
+static __inline__ void local_add(long i, local_t *l)
+{
+   long t;
+   unsigned long flags;
+
+   flags = arch_local_irq_disable_var(2);
+   __asm__ __volatile__(
+   PPC_LL" %0,0(%2)\n\
+   add %0,%1,%0\n"
+   PPC_STL" %0,0(%2)\n"
+   : "=&r" (t)
+   : "r" (i), "r" (&(l->a.counter)));
+   arch_local_irq_restore(flags);
+}
+
+static __inline__ void local_sub(long i, local_t *l)
+{
+   long t;
+   unsigned long flags;
+
+   flags = arch_local_irq_disable_var(2);
+   __asm__ __volatile__(
+   PPC_LL" %0,0(%2)\n\
+   subf%0,%1,%0\n"
+   PPC_STL" %0,0(%2)\n"
+   : "=&r" (t)
+   : "r" (i), "r" (&(l->a.counter)));
+   arch_local_irq_restore(flags);
+}
 
 static __inline__ long local_add_return(long a, local_t *l)
 {
long t;
+   unsigned long flags;
 
+   flags = arch_local_irq_disable_var(2);
__asm__ __volatile__(
-"1:"   PPC_LLARX(%0,0,%2,0) "  # local_add_return\n\
+   PPC_LL" %0,0(%2)\n\
add %0,%1,%0\n"
-   PPC405_ERR77(0,%2)
-   PPC_STLCX   "%0,0,%2 \n\
-   bne-1b"
+   PPC_STL "%0,0(%2)\n"
: "=&r" (t)
: "r" (a), "r" (&(l->a.counter))
: "cc", "memory");
+   arch_local_irq_restore(flags);
 
return t;
 }
@@ -41,16 +67,18 @@ static __inline__ long local_add_return(long a, local_t *l)
 static __inline__ long local_sub_return(long a, local_t *l)
 {
long t;
+   unsigned long flags;
+
+   flags = arch_local_irq_disable_var(2);
 
__asm__ __volatile__(
-"1:"   PPC_LLARX(%0,0,%2,0) "  # local_sub_return\n\
+"1:"   PPC_LL" %0,0(%2)\n\
subf%0,%1,%0\n"
-   PPC405_ERR77(0,%2)
-   PPC_STLCX   "%0,0,%2 \n\
-   bne-1b"
+   PPC_STL "%0,0(%2)\n"
: "=&r" (t)
: "r" (a), "r" (&(l->a.counter))
: "cc", "memory");
+   arch_local_irq_restore(flags);
 
return t;
 }
@@ -58,16 +86,17 @@ static 

[RFC PATCH 8/9] powerpc: Support to replay PMIs

2016-07-25 Thread Madhavan Srinivasan
Add code to replay Performance Monitoring Interrupts (PMIs).
In the masked_interrupt handler, for PMIs we clear MSR[EE]
and return. This is due to the fact that PMIs are level triggered.
In __check_irq_replay(), we enable MSR[EE], which will
fire the interrupt for us.

The patch also adds a new arch_local_irq_disable_var() variant. The new
variant takes an input value to write to paca->soft_enabled. This will
be used in a following patch to implement the tri-state value for
soft_enabled.
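The replay decision described above can be sketched in plain C (the bit values and names here are illustrative; the real masks live in hw_irq.h): a pending PMI is cleared from the happened mask and replayed via the 0x500 path, where re-enabling MSR[EE] lets the still-asserted, level-triggered PMI fire again.

```c
/* Illustrative bit values, not the kernel's definitions: */
#define SK_PACA_IRQ_EE  0x01u
#define SK_PACA_IRQ_PMI 0x20u

/* Sketch of the __check_irq_replay() change for PMIs: returns the
 * vector to replay, or 0 if nothing relevant is pending. */
static unsigned int check_irq_replay_sketch(unsigned int *happened)
{
	unsigned int was = *happened;

	/* PMIs are level triggered: clearing the flag and taking the
	 * 0x500 path (which re-enables MSR[EE]) is enough to make a
	 * still-asserted PMI fire again. */
	*happened &= ~(SK_PACA_IRQ_PMI | SK_PACA_IRQ_EE);
	if (was & (SK_PACA_IRQ_EE | SK_PACA_IRQ_PMI))
		return 0x500;
	return 0;
}
```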

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/hw_irq.h | 14 ++
 arch/powerpc/kernel/irq.c |  9 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index cc69dde6eb84..863179654452 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -81,6 +81,20 @@ static inline unsigned long arch_local_irq_disable(void)
return flags;
 }
 
+static inline unsigned long arch_local_irq_disable_var(int value)
+{
+   unsigned long flags, zero;
+
+   asm volatile(
+   "li %1,%3; lbz %0,%2(13); stb %1,%2(13)"
+   : "=r" (flags), "=&r" (zero)
+   : "i" (offsetof(struct paca_struct, soft_enabled)),\
+ "i" (value)
+   : "memory");
+
+   return flags;
+}
+
 extern void arch_local_irq_restore(unsigned long);
 
 static inline void arch_local_irq_enable(void)
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 597c20d1814c..81fe0da1f86d 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -158,9 +158,16 @@ notrace unsigned int __check_irq_replay(void)
if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
return 0x900;
 
+   /*
+* In masked_handler() for PMI, we disable MSR[EE] and return.
+* When replaying it, just enabling the MSR[EE] will do the
+* trick, since the PMIs are "level" triggered.
+*/
+   local_paca->irq_happened &= ~PACA_IRQ_PMI;
+
/* Finally check if an external interrupt happened */
local_paca->irq_happened &= ~PACA_IRQ_EE;
-   if (happened & PACA_IRQ_EE)
+   if ((happened & PACA_IRQ_EE) || (happened & PACA_IRQ_PMI))
return 0x500;
 
 #ifdef CONFIG_PPC_BOOK3E
-- 
2.7.4


[RFC PATCH 7/9] powerpc: Add support to mask perf interrupts

2016-07-25 Thread Madhavan Srinivasan
To support masking of PMI interrupts, a couple of new interrupt handler
macros are added: MASKABLE_EXCEPTION_PSERIES_OOL and
MASKABLE_RELON_EXCEPTION_PSERIES_OOL. These are needed to include the
SOFTEN_TEST and implement the support in both host and guest kernels.

A couple of new irq #defines, "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*",
are added for use in the exception code to check for PMI interrupts.

The __SOFTEN_TEST macro is modified to support the PMI interrupt.
The present __SOFTEN_TEST code loads soft_enabled from the paca and
checks whether to call the masked_interrupt handler code. To support
both the current behaviour and PMI masking, these changes are added:

1) The current LR register contents are saved in R11.
2) The "bge" branch operation is changed to "bgel".
3) R11 is restored to LR.

Reason:

To retain NMI behaviour for PMIs when the flag state is 1, we save the
LR register value in R11 and branch to the "masked_interrupt" handler
with LR updated. In the "masked_interrupt" handler, we check the
"SOFTEN_VALUE_*" value in R10 for PMI and branch back with "blr" if it
is a PMI.

To mask PMIs for a flag value > 1, masked_interrupt avoids the above
check, continues to execute the masked_interrupt code, disables MSR[EE],
and updates irq_happened with the PMI info.

Finally, the saving of R11 is moved before the SOFTEN_TEST call in the
__EXCEPTION_PROLOG_1 macro, to support saving of LR values in
SOFTEN_TEST.

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/exception-64s.h | 22 --
 arch/powerpc/include/asm/hw_irq.h|  1 +
 arch/powerpc/kernel/exceptions-64s.S | 27 ---
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 44d3f539d8a5..c951b7ab5108 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -166,8 +166,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR);  \
SAVE_CTR(r10, area);\
mfcrr9; \
-   extra(vec); \
std r11,area+EX_R11(r13);   \
+   extra(vec); \
std r12,area+EX_R12(r13);   \
GET_SCRATCH0(r10);  \
std r10,area+EX_R13(r13)
@@ -403,12 +403,17 @@ label##_relon_hv: \
 #define SOFTEN_VALUE_0xe82 PACA_IRQ_DBELL
 #define SOFTEN_VALUE_0xe60 PACA_IRQ_HMI
 #define SOFTEN_VALUE_0xe62 PACA_IRQ_HMI
+#define SOFTEN_VALUE_0xf01 PACA_IRQ_PMI
+#define SOFTEN_VALUE_0xf00 PACA_IRQ_PMI
 
 #define __SOFTEN_TEST(h, vec)  \
lbz r10,PACASOFTIRQEN(r13); \
cmpwi   r10,LAZY_INTERRUPT_DISABLED;\
li  r10,SOFTEN_VALUE_##vec; \
-   bge masked_##h##interrupt
+   mflrr11;\
+   bgelmasked_##h##interrupt;  \
+   mtlrr11;
+
 #define _SOFTEN_TEST(h, vec)   __SOFTEN_TEST(h, vec)
 
 #define SOFTEN_TEST_PR(vec)\
@@ -438,6 +443,12 @@ label##_pSeries:   \
_MASKABLE_EXCEPTION_PSERIES(vec, label, \
EXC_STD, SOFTEN_TEST_PR)
 
+#define MASKABLE_EXCEPTION_PSERIES_OOL(vec, label) \
+   .globl label##_pSeries; \
+label##_pSeries:   \
+   EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, vec);\
+   EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD);
+
 #define MASKABLE_EXCEPTION_HV(loc, vec, label) \
. = loc;\
.globl label##_hv;  \
@@ -466,6 +477,13 @@ label##_relon_pSeries: \
_MASKABLE_RELON_EXCEPTION_PSERIES(vec, label,   \
  EXC_STD, SOFTEN_NOTEST_PR)
 
+#define MASKABLE_RELON_EXCEPTION_PSERIES_OOL(vec, label)   \
+   .globl label##_relon_pSeries;   \
+label##_relon_pSeries: \
+   EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_NOTEST_PR, vec);  \
+   EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD);
+
+
 #define 

[RFC PATCH 6/9] powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag

2016-07-25 Thread Madhavan Srinivasan
Foundation patch to support checking of the new flag values for
"paca->soft_enabled". It modifies the condition checking on
"soft_enabled" from "equal" to "greater than or equal to".

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/exception-64s.h | 2 +-
 arch/powerpc/include/asm/hw_irq.h| 4 ++--
 arch/powerpc/include/asm/irqflags.h  | 2 +-
 arch/powerpc/kernel/entry_64.S   | 4 ++--
 arch/powerpc/kernel/irq.c| 4 ++--
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index e24e63d216c4..44d3f539d8a5 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -408,7 +408,7 @@ label##_relon_hv: \
lbz r10,PACASOFTIRQEN(r13); \
cmpwi   r10,LAZY_INTERRUPT_DISABLED;\
li  r10,SOFTEN_VALUE_##vec; \
-   beq masked_##h##interrupt
+   bge masked_##h##interrupt
 #define _SOFTEN_TEST(h, vec)   __SOFTEN_TEST(h, vec)
 
 #define SOFTEN_TEST_PR(vec)\
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 2b87930e0e82..b7c7f1c6706f 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -94,7 +94,7 @@ static inline unsigned long arch_local_irq_save(void)
 
 static inline bool arch_irqs_disabled_flags(unsigned long flags)
 {
-   return flags == LAZY_INTERRUPT_DISABLED;
+   return flags >= LAZY_INTERRUPT_DISABLED;
 }
 
 static inline bool arch_irqs_disabled(void)
@@ -139,7 +139,7 @@ static inline void may_hard_irq_enable(void)
 
 static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
 {
-   return (regs->softe == LAZY_INTERRUPT_DISABLED);
+   return (regs->softe >= LAZY_INTERRUPT_DISABLED);
 }
 
 extern bool prep_irq_for_idle(void);
diff --git a/arch/powerpc/include/asm/irqflags.h b/arch/powerpc/include/asm/irqflags.h
index 6091e46f2455..235055fabf65 100644
--- a/arch/powerpc/include/asm/irqflags.h
+++ b/arch/powerpc/include/asm/irqflags.h
@@ -52,7 +52,7 @@
li  __rA,LAZY_INTERRUPT_DISABLED;   \
ori __rB,__rB,PACA_IRQ_HARD_DIS;\
stb __rB,PACAIRQHAPPENED(r13);  \
-   beq 44f;\
+   bge 44f;\
stb __rA,PACASOFTIRQEN(r13);\
TRACE_DISABLE_INTS; \
 44:
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index cade169a7517..7ab6bfff653e 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -766,7 +766,7 @@ restore:
ld  r5,SOFTE(r1)
lbz r6,PACASOFTIRQEN(r13)
cmpwi   cr0,r5,LAZY_INTERRUPT_DISABLED
-   beq restore_irq_off
+   bge restore_irq_off
 
/* We are enabling, were we already enabled ? Yes, just return */
cmpwi   cr0,r6,LAZY_INTERRUPT_ENABLED
@@ -1012,7 +1012,7 @@ _GLOBAL(enter_rtas)
 * check it with the asm equivalent of WARN_ON
 */
lbz r0,PACASOFTIRQEN(r13)
-1: tdnei   r0,LAZY_INTERRUPT_DISABLED
+1: tdeqi   r0,LAZY_INTERRUPT_ENABLED
EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING
 #endif

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 9b9b6df8d83d..597c20d1814c 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -202,7 +202,7 @@ notrace void arch_local_irq_restore(unsigned long en)
 
/* Write the new soft-enabled value */
set_soft_enabled(en);
-   if (en == LAZY_INTERRUPT_DISABLED)
+   if (en >= LAZY_INTERRUPT_DISABLED)
return;
/*
 * From this point onward, we can take interrupts, preempt,
@@ -247,7 +247,7 @@ notrace void arch_local_irq_restore(unsigned long en)
}
 #endif /* CONFIG_TRACE_IRQFLAG */
 
-   set_soft_enabled(LAZY_INTERRUPT_DISABLED);
+   set_soft_enabled(en);
 
/*
 * Check if anything needs to be re-emitted. We haven't
-- 
2.7.4


[RFC PATCH 5/9] powerpc: reverse the soft_enable logic

2016-07-25 Thread Madhavan Srinivasan
"paca->soft_enabled" is used as a flag to mask some of interrupts.
Currently supported flags values and their details:

soft_enabledMSR[EE]

0   0   Disabled (PMI and HMI not masked)
1   1   Enabled

"paca->soft_enabled" is initialed to 1 to make the interripts as
enabled. arch_local_irq_disable() will toggle the value when interrupts
needs to disbled. At this point, the interrupts are not actually disabled,
instead, interrupt vector has code to check for the flag and mask it when it 
occurs.
By "mask it", it updated interrupt paca->irq_happened and return.
arch_local_irq_restore() is called to re-enable interrupts, which checks and
replays interrupts if any occured.

Now, as mentioned, current logic doesnot mask "performance monitoring 
interrupts"
and PMIs are implemented as NMI. But this patchset depends on local_irq_*
for a successful local_* update. Meaning, mask all possible interrupts during
local_* update and replay them after the update.

So the idea here is to reserve the "paca->soft_enabled" logic. New values and
details:

soft_enabledMSR[EE]

1   0   Disabled  (PMI and HMI not masked)
0   1   Enabled

Reason for the this change is to create foundation for a third flag value "2"
for "soft_enabled" to add support to mask PMIs. When arch_irq_disable_* is
called with a value "2", PMI interrupts are mask. But when called with a value
of "1", PMI are not mask.

With new flag value for "soft_enabled", states looks like:

soft_enabledMSR[EE]

2   0   Disbaled PMIs also
1   0   Disabled  (PMI and HMI not masked)
0   1   Enabled

And interrupt handler code for checking has been modified to check for
for "greater than or equal" to 1 condition instead.

Comment here explains the logic changes that are implemented in the following
patches. But this patch primarly does only reserve the logic. Following patches
will make the corresponding changes.

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/hw_irq.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index 09491417fbf7..2b87930e0e82 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -30,8 +30,8 @@
 /*
  * flags for paca->soft_enabled
  */
-#define LAZY_INTERRUPT_ENABLED 1
-#define LAZY_INTERRUPT_DISABLED0
+#define LAZY_INTERRUPT_ENABLED 0
+#define LAZY_INTERRUPT_DISABLED1
 
 
 #endif /* CONFIG_PPC64 */
-- 
2.7.4


[RFC PATCH 4/9] powerpc: Use set_soft_enabled api to update paca->soft_enabled

2016-07-25 Thread Madhavan Srinivasan
Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/kvm_ppc.h | 2 +-
 arch/powerpc/kernel/irq.c  | 4 ++--
 arch/powerpc/kernel/setup_64.c | 3 ++-
 arch/powerpc/kernel/time.c | 4 ++--
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index e790b8a6bf0b..68c2275c3674 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -707,7 +707,7 @@ static inline void kvmppc_fix_ee_before_entry(void)
 
/* Only need to enable IRQs by hard enabling them after this */
local_paca->irq_happened = 0;
-   local_paca->soft_enabled = LAZY_INTERRUPT_ENABLED;
+   set_soft_enabled(LAZY_INTERRUPT_ENABLED);
 #endif
 }
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 88e541daf7b0..9b9b6df8d83d 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -202,7 +202,7 @@ notrace void arch_local_irq_restore(unsigned long en)
 
/* Write the new soft-enabled value */
set_soft_enabled(en);
-   if (en == LAZY_INTERRUPT_DIABLED)
+   if (en == LAZY_INTERRUPT_DISABLED)
return;
/*
 * From this point onward, we can take interrupts, preempt,
@@ -331,7 +331,7 @@ bool prep_irq_for_idle(void)
 * of entering the low power state.
 */
local_paca->irq_happened &= ~PACA_IRQ_HARD_DIS;
-   local_paca->soft_enabled = LAZY_INTERRUPT_ENABLED;
+   set_soft_enabled(LAZY_INTERRUPT_ENABLED);
 
/* Tell the caller to enter the low power state */
return true;
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 0ca504839550..2c7f4b23359a 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -206,7 +206,7 @@ static void fixup_boot_paca(void)
/* Allow percpu accesses to work until we setup percpu data */
get_paca()->data_offset = 0;
/* Mark interrupts disabled in PACA */
-   get_paca()->soft_enabled = LAZY_INTERRUPT_DISABLED;
+   set_soft_enabled(LAZY_INTERRUPT_DISABLED);
 }
 
 static void cpu_ready_for_interrupts(void)
@@ -326,6 +326,7 @@ void early_setup_secondary(void)
 {
/* Mark interrupts enabled in PACA */
get_paca()->soft_enabled = 0;
+   set_soft_enabled(LAZY_INTERRUPT_DISABLED);
 
/* Initialize the hash table or TLB handling */
early_init_mmu_secondary();
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index e46f7ab6cbde..0a1669708a0d 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -258,7 +258,7 @@ void accumulate_stolen_time(void)
 * needs to reflect that so various debug stuff doesn't
 * complain
 */
-   local_paca->soft_enabled = LAZY_INTERRUPT_DISABLED;
+   set_soft_enabled(LAZY_INTERRUPT_DISABLED);
 
sst = scan_dispatch_log(local_paca->starttime_user);
ust = scan_dispatch_log(local_paca->starttime);
@@ -266,7 +266,7 @@ void accumulate_stolen_time(void)
local_paca->user_time -= ust;
local_paca->stolen_time += ust + sst;
 
-   local_paca->soft_enabled = save_soft_enabled;
+   set_soft_enabled(save_soft_enabled);
 }
 
 static inline u64 calculate_stolen_time(u64 stop_tb)
-- 
2.7.4


[RFC PATCH 3/9] powerpc: move set_soft_enabled()

2016-07-25 Thread Madhavan Srinivasan
Move set_soft_enabled() from arch/powerpc/kernel/irq.c to
asm/hw_irq.h. This way, updates to paca->soft_enabled can be
made to go through this API wherever possible.

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/hw_irq.h | 6 ++
 arch/powerpc/kernel/irq.c | 6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index 433fe60cf428..09491417fbf7 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -48,6 +48,12 @@ extern void unknown_exception(struct pt_regs *regs);
 #ifdef CONFIG_PPC64
 #include 
 
+static inline notrace void set_soft_enabled(unsigned long enable)
+{
+   __asm__ __volatile__("stb %0,%1(13)"
+   : : "r" (enable), "i" (offsetof(struct paca_struct, soft_enabled)));
+}
+
 static inline unsigned long arch_local_save_flags(void)
 {
unsigned long flags;
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 06dff620fcdc..88e541daf7b0 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -106,12 +106,6 @@ static inline notrace unsigned long get_irq_happened(void)
return happened;
 }
 
-static inline notrace void set_soft_enabled(unsigned long enable)
-{
-   __asm__ __volatile__("stb %0,%1(13)"
-   : : "r" (enable), "i" (offsetof(struct paca_struct, soft_enabled)));
-}
-
 static inline notrace int decrementer_check_overflow(void)
 {
u64 now = get_tb_or_rtc();
-- 
2.7.4


[RFC PATCH 1/9] Add #defs for paca->soft_enabled flags

2016-07-25 Thread Madhavan Srinivasan
Two #defs LAZY_INTERRUPT_ENABLED and
LAZY_INTERRUPT_DISABLED are added to be used
when updating paca->soft_enabled.

Signed-off-by: Madhavan Srinivasan 
---
- If the macro names don't look right, kindly suggest alternatives

 arch/powerpc/include/asm/hw_irq.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index b59ac27a6b7d..e58c9d95050a 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -27,6 +27,13 @@
 #define PACA_IRQ_EE_EDGE   0x10 /* BookE only */
 #define PACA_IRQ_HMI   0x20
 
+/*
+ * flags for paca->soft_enabled
+ */
+#define LAZY_INTERRUPT_ENABLED 1
+#define LAZY_INTERRUPT_DISABLED0
+
+
 #endif /* CONFIG_PPC64 */
 
 #ifndef __ASSEMBLY__
-- 
2.7.4


[RFC PATCH 2/9] Cleanup to use LAZY_INTERRUPT_* macros for paca->soft_enabled update

2016-07-25 Thread Madhavan Srinivasan
Replace the hardcoded values used when updating
paca->soft_enabled with the LAZY_INTERRUPT_* #defines.
No logic change.

Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/include/asm/exception-64s.h |  2 +-
 arch/powerpc/include/asm/hw_irq.h| 15 ---
 arch/powerpc/include/asm/irqflags.h  |  6 +++---
 arch/powerpc/include/asm/kvm_ppc.h   |  2 +-
 arch/powerpc/kernel/entry_64.S   | 14 +++---
 arch/powerpc/kernel/head_64.S|  3 ++-
 arch/powerpc/kernel/idle_power4.S|  3 ++-
 arch/powerpc/kernel/irq.c|  9 +
 arch/powerpc/kernel/process.c|  3 ++-
 arch/powerpc/kernel/setup_64.c   |  3 +++
 arch/powerpc/kernel/time.c   |  2 +-
 arch/powerpc/mm/hugetlbpage.c|  2 +-
 arch/powerpc/perf/core-book3s.c  |  2 +-
 13 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h 
b/arch/powerpc/include/asm/exception-64s.h
index 93ae809fe5ea..e24e63d216c4 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -406,7 +406,7 @@ label##_relon_hv:   
\
 
 #define __SOFTEN_TEST(h, vec)  \
lbz r10,PACASOFTIRQEN(r13); \
-   cmpwi   r10,0;  \
+   cmpwi   r10,LAZY_INTERRUPT_DISABLED;\
li  r10,SOFTEN_VALUE_##vec; \
beq masked_##h##interrupt
 #define _SOFTEN_TEST(h, vec)   __SOFTEN_TEST(h, vec)
diff --git a/arch/powerpc/include/asm/hw_irq.h 
b/arch/powerpc/include/asm/hw_irq.h
index e58c9d95050a..433fe60cf428 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -65,9 +65,10 @@ static inline unsigned long arch_local_irq_disable(void)
unsigned long flags, zero;
 
asm volatile(
-   "li %1,0; lbz %0,%2(13); stb %1,%2(13)"
+   "li %1,%3; lbz %0,%2(13); stb %1,%2(13)"
: "=r" (flags), "=" (zero)
-   : "i" (offsetof(struct paca_struct, soft_enabled))
+   : "i" (offsetof(struct paca_struct, soft_enabled)),\
+ "i" (LAZY_INTERRUPT_DISABLED)
: "memory");
 
return flags;
@@ -77,7 +78,7 @@ extern void arch_local_irq_restore(unsigned long);
 
 static inline void arch_local_irq_enable(void)
 {
-   arch_local_irq_restore(1);
+   arch_local_irq_restore(LAZY_INTERRUPT_ENABLED);
 }
 
 static inline unsigned long arch_local_irq_save(void)
@@ -87,7 +88,7 @@ static inline unsigned long arch_local_irq_save(void)
 
 static inline bool arch_irqs_disabled_flags(unsigned long flags)
 {
-   return flags == 0;
+   return flags == LAZY_INTERRUPT_DISABLED;
 }
 
 static inline bool arch_irqs_disabled(void)
@@ -107,9 +108,9 @@ static inline bool arch_irqs_disabled(void)
u8 _was_enabled;\
__hard_irq_disable();   \
_was_enabled = local_paca->soft_enabled;\
-   local_paca->soft_enabled = 0;   \
+   local_paca->soft_enabled = LAZY_INTERRUPT_DISABLED;\
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;  \
-   if (_was_enabled)   \
+   if (_was_enabled == LAZY_INTERRUPT_ENABLED) \
trace_hardirqs_off();   \
 } while(0)
 
@@ -132,7 +133,7 @@ static inline void may_hard_irq_enable(void)
 
 static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
 {
-   return !regs->softe;
+   return (regs->softe == LAZY_INTERRUPT_DISABLED);
 }
 
 extern bool prep_irq_for_idle(void);
diff --git a/arch/powerpc/include/asm/irqflags.h 
b/arch/powerpc/include/asm/irqflags.h
index f2149066fe5d..6091e46f2455 100644
--- a/arch/powerpc/include/asm/irqflags.h
+++ b/arch/powerpc/include/asm/irqflags.h
@@ -48,8 +48,8 @@
 #define RECONCILE_IRQ_STATE(__rA, __rB)\
lbz __rA,PACASOFTIRQEN(r13);\
lbz __rB,PACAIRQHAPPENED(r13);  \
-   cmpwi   cr0,__rA,0; \
-   li  __rA,0; \
+   cmpwi   cr0,__rA,LAZY_INTERRUPT_DISABLED;\
+   li  __rA,LAZY_INTERRUPT_DISABLED;   \
ori __rB,__rB,PACA_IRQ_HARD_DIS;\
stb __rB,PACAIRQHAPPENED(r13);  \
beq 44f;\
@@ -63,7 +63,7 @@
 
 #define RECONCILE_IRQ_STATE(__rA, __rB)\
lbz __rA,PACAIRQHAPPENED(r13);  \
-   li  __rB,0; \
+   li  __rB,LAZY_INTERRUPT_DISABLED;   \
ori __rA,__rA,PACA_IRQ_HARD_DIS;\
stb __rB,PACASOFTIRQEN(r13);\
stb __rA,PACAIRQHAPPENED(r13)
diff --git 

[RFC PATCH 0/9]powerpc: "paca->soft_enabled" based local atomic operation implementation

2016-07-25 Thread Madhavan Srinivasan
Local atomic operations are fast and highly reentrant per-CPU
counters, used for per-CPU variable updates. They only guarantee
atomicity of the variable modification with respect to the CPU which
owns the data, and they need to be executed in a preemption-safe way.

Here is the design of the patchset. Since local_* operations only
need to be atomic with respect to interrupts (IIUC), we have two
options: either replay the "op" if interrupted, or replay the
interrupt after the "op". The initial patchset posted implemented the
local_* operations based on CR5, which replays the "op". That patchset
had issues with rewinding the address pointer from an array, which
made the slow path really slow. Since the CR5-based implementation
proposed using __ex_table to find the rewind address, it raised
concerns about the size of __ex_table and vmlinux.

https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html

This patchset instead uses Benjamin Herrenschmidt's suggestion of
using arch_local_irq_disable_var() to soft-disable interrupts
(including PMIs). After finishing the "op", arch_local_irq_restore()
is called and interrupts are replayed if any occurred.

The patchset rewrites the current local_* functions to use
arch_local_irq_disable_var(). The base flow for each function is:

 {
arch_local_irq_disable_var(2)
load
..
store
arch_local_irq_restore()
 }

Currently only asm/local.h has been rewritten, and the change has
been tested only on PPC64 (pSeries guest).

The reason for this approach is that the l[w/d]arx/st[w/d]cx.
instruction pair currently used for local_* operations is heavy on
cycle count and does not have a local variant. To see whether the new
implementation helps, a modified version of Rusty's benchmark code was
used on local_t.

https://lkml.org/lkml/2008/12/16/450

Modifications to Rusty's benchmark code:
 - Executed only local_t test

Here are the values with the patch.

Time in ns per iteration

Local_t         Without Patch   With Patch

_inc            28              8
_add            28              8
_read           3               3
_add_return     28              7

The first four are cleanup patches which lay the foundation to make
things easier. The fifth patch reverses the current soft_enabled
logic; its commit message details the reason for and need of this
change. The sixth patch holds the changes needed for the reversed
logic. The rest of the patches add support for maskable PMIs and the
implementation of local_t using arch_local_irq_disable_var().

Since the patchset is experimental, the changes are focused on the
pseries and powernv platforms only. Would really like to hear comments
on this approach before extending it to other powerpc platforms.

Tested the patchset in a
 - pSeries LPAR (with perf record).
- Ran kernbench with perf record for 24 hours.
- More testing needed.

Signed-off-by: Madhavan Srinivasan 

Madhavan Srinivasan (9):
  Add #defs for paca->soft_enabled flags
  Cleanup to use LAZY_INTERRUPT_* macros for paca->soft_enabled update
  powerpc: move set_soft_enabled()
  powerpc: Use set_soft_enabled api to update paca->soft_enabled
  powerpc: reverse the soft_enable logic
  powerpc: modify __SOFTEN_TEST to support tri-state soft_enabled flag
  powerpc: Add support to mask perf interrupts
  powerpc: Support to replay PMIs
  powerpc: rewrite local_t using soft_irq

 arch/powerpc/include/asm/exception-64s.h | 24 +++--
 arch/powerpc/include/asm/hw_irq.h| 43 ---
 arch/powerpc/include/asm/irqflags.h  |  8 +--
 arch/powerpc/include/asm/kvm_ppc.h   |  2 +-
 arch/powerpc/include/asm/local.h | 91 ++--
 arch/powerpc/kernel/entry_64.S   | 16 +++---
 arch/powerpc/kernel/exceptions-64s.S | 27 --
 arch/powerpc/kernel/head_64.S|  3 +-
 arch/powerpc/kernel/idle_power4.S|  3 +-
 arch/powerpc/kernel/irq.c| 24 +
 arch/powerpc/kernel/process.c|  3 +-
 arch/powerpc/kernel/setup_64.c   |  4 ++
 arch/powerpc/kernel/time.c   |  4 +-
 arch/powerpc/mm/hugetlbpage.c|  2 +-
 arch/powerpc/perf/core-book3s.c  |  2 +-
 15 files changed, 184 insertions(+), 72 deletions(-)

--
2.7.4


RE: [PATCH v11 4/5] powerpc/fsl: move mpc85xx.h to include/linux/fsl

2016-07-25 Thread Yangbo Lu
Hi Scott,


> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Friday, July 22, 2016 12:45 AM
> To: Michael Ellerman; Arnd Bergmann
> Cc: linux-...@vger.kernel.org; devicet...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Yangbo Lu
> Subject: Re: [PATCH v11 4/5] powerpc/fsl: move mpc85xx.h to
> include/linux/fsl
> 
> On Thu, 2016-07-21 at 20:26 +1000, Michael Ellerman wrote:
> > Quoting Scott Wood (2016-07-21 04:31:48)
> > >
> > > On Wed, 2016-07-20 at 13:24 +0200, Arnd Bergmann wrote:
> > > >
> > > > On Saturday, July 16, 2016 9:50:21 PM CEST Scott Wood wrote:
> > > > >
> > > > >
> > > > > From: yangbo lu 
> > > > >
> > > > > Move mpc85xx.h to include/linux/fsl and rename it to svr.h as a
> > > > > common header file.  This SVR numberspace is used on some ARM
> > > > > chips as well as PPC, and even to check for a PPC SVR multi-arch
> > > > > drivers would otherwise need to ifdef the header inclusion and
> > > > > all references to the SVR symbols.
> > > > >
> > > > > Signed-off-by: Yangbo Lu 
> > > > > Acked-by: Wolfram Sang 
> > > > > Acked-by: Stephen Boyd 
> > > > > Acked-by: Joerg Roedel 
> > > > > [scottwood: update description]
> > > > > Signed-off-by: Scott Wood 
> > > > >
> > > > As discussed before, please don't introduce yet another vendor
> > > > specific way to match a SoC ID from a device driver.
> > > >
> > > > I've posted a patch for an extension to the soc_device
> > > > infrastructure to allow comparing the running SoC to a table of
> > > > devices, use that instead.
> > > As I asked before, in which relevant maintainership capacity are you
> > > NACKing this?
> > I'll nack the powerpc part until you guys can agree.
> 
> OK, I've pulled these patches out.
> 
> For the MMC issue I suggest using ifdef CONFIG_PPC and mfspr(SPRN_SVR)
> like the clock driver does[1] and we can revisit the issue if/when we
> need to do something similar on an ARM chip.

[Lu Yangbo-B47093] I remembered that Uffe had opposed introducing
non-generic header files (like '#include ') in the mmc driver
initially. So I don't think using ifdef CONFIG_PPC and
mfspr(SPRN_SVR) will be accepted...
And this method still couldn't get the SVR of an ARM chip.

Any other suggestions here?
Thank you very much.

- Yangbo Lu

> 
> -Scott
> 
> [1] One of the issues with Arnd's approach is that it wouldn't have
> worked for early things like the clock driver, and he didn't seem to mind
> using ifdef and
> mfspr() there.


Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header

2016-07-25 Thread Benjamin Herrenschmidt
On Mon, 2016-07-25 at 20:36 +1000, Michael Ellerman wrote:
> That would be nice, but these look fishy at least:
> 
> arch/powerpc/platforms/cell/spu_manage.c:   if 
> (!firmware_has_feature(FW_FEATURE_LPAR))
> arch/powerpc/platforms/cell/spu_manage.c:   if 
> (!firmware_has_feature(FW_FEATURE_LPAR)) {
> > arch/powerpc/platforms/cell/spu_manage.c:   if 
> > (!firmware_has_feature(FW_FEATURE_LPAR))

Those can just be checks for LV1, I think .. 

> > arch/powerpc/platforms/pasemi/iommu.c:  
> > !firmware_has_feature(FW_FEATURE_LPAR)) {
> drivers/net/ethernet/pasemi/pasemi_mac.c:   return 
> firmware_has_feature(FW_FEATURE_LPAR);

And that was some experiemtal PAPR'ish thing wasn't it ?

Cheers,
Ben.


Re: [PATCH for-4.8 V2 08/10] powerpc: use the jump label for cpu_has_feature

2016-07-25 Thread Kevin Hao
On Mon, Jul 25, 2016 at 04:28:49PM +1000, Nicholas Piggin wrote:
> On Sat, 23 Jul 2016 14:42:41 +0530
> "Aneesh Kumar K.V"  wrote:
> 
> > From: Kevin Hao 
> > 
> > The cpu features are fixed once the probe of cpu features are done.
> > And the function cpu_has_feature() does be used in some hot path.
> > The checking of the cpu features for each time of invoking of
> > cpu_has_feature() seems suboptimal. This tries to reduce this
> > overhead of this check by using jump label.
> > 
> > The generated assemble code of the following c program:
> > if (cpu_has_feature(CPU_FTR_XXX))
> > xxx()
> > 
> > Before:
> > lis r9,-16230
> > lwz r9,12324(r9)
> > lwz r9,12(r9)
> > andi.   r10,r9,512
> > beqlr-
> > 
> > After:
> > nop if CPU_FTR_XXX is enabled
> > b xxx   if CPU_FTR_XXX is not enabled
> > 
> > Signed-off-by: Kevin Hao 
> > Signed-off-by: Aneesh Kumar K.V 
> > ---
> >  arch/powerpc/include/asm/cpufeatures.h | 21 +
> >  arch/powerpc/include/asm/cputable.h|  8 
> >  arch/powerpc/kernel/cputable.c | 20 
> >  arch/powerpc/lib/feature-fixups.c  |  1 +
> >  4 files changed, 50 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/cpufeatures.h
> > b/arch/powerpc/include/asm/cpufeatures.h index
> > bfa6cb8f5629..4a4a0b898463 100644 ---
> > a/arch/powerpc/include/asm/cpufeatures.h +++
> > b/arch/powerpc/include/asm/cpufeatures.h @@ -13,10 +13,31 @@ static
> > inline bool __cpu_has_feature(unsigned long feature)
> > return !!(CPU_FTRS_POSSIBLE & cur_cpu_spec->cpu_features & feature); }
> >  
> > +#ifdef CONFIG_JUMP_LABEL
> > +#include 
> > +
> > +extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES];
> > +
> > +static __always_inline bool cpu_has_feature(unsigned long feature)
> > +{
> > +   int i;
> > +
> > +   if (CPU_FTRS_ALWAYS & feature)
> > +   return true;
> > +
> > +   if (!(CPU_FTRS_POSSIBLE & feature))
> > +   return false;
> > +
> > +   i = __builtin_ctzl(feature);
> > +   return static_branch_likely(&cpu_feat_keys[i]);
> > +}
> 
> Is feature ever not-constant, or could it ever be, I wonder? We could
> do a build time check to ensure it is always constant?

In the current code, all uses of this function pass a constant
argument. But yes, due to the implementation of jump labels, we should
add a check here to ensure that a constant is passed to this function.
Something like this:

if (!__builtin_constant_p(feature))
return __cpu_has_feature(feature);

We need the same change for the mmu_has_feature().

Thanks,
Kevin



Re: [PATCH 1/3] powerpc/mm: Fix build break due when PPC_NATIVE=n

2016-07-25 Thread Michael Ellerman
Quoting Michael Ellerman (2016-07-25 16:17:52)
> Stephen Rothwell  writes:
> 
> > Hi Michael,
> >
> > On Mon, 25 Jul 2016 12:57:49 +1000 Michael Ellerman  
> > wrote:
> >>
> >> The recent commit to rework the hash MMU setup broke the build when
> >> CONFIG_PPC_NATIVE=n. Fix it by providing a fallback implementation of
> >> hpte_init_native().
> >
> > Alternatively, you could make the call site dependent on
> > IS_ENABLED(CONFIG_PPC_NATIVE) and not need the fallback.
> >
> > so:
> >
> >   else if (IS_ENABLED(CONFIG_PPC_NATIVE))
> >   hpte_init_native();
> >
> > in arch/powerpc/mm/hash_utils_64.c and let the compiler elide the call.
> 
> That would mean we might fall through and not assign any ops, so I think
> it's preferable to have a fallback that explicitly panics().

Actually I think this works and is smaller all round.

Will test and resend.

cheers

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 341632471b9d..e44f2d759055 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -885,11 +885,6 @@ static void __init htab_initialize(void)
 #undef KB
 #undef MB
 
-void __init __weak hpte_init_lpar(void)
-{
-   panic("FW_FEATURE_LPAR set but no LPAR support compiled\n");
-}
-
 void __init hash__early_init_mmu(void)
 {
/*
@@ -931,9 +926,12 @@ void __init hash__early_init_mmu(void)
ps3_early_mm_init();
else if (firmware_has_feature(FW_FEATURE_LPAR))
hpte_init_lpar();
-   else
+   else if (IS_ENABLED(CONFIG_PPC_NATIVE))
hpte_init_native();
 
+   if (!mmu_hash_ops.hpte_insert)
+   panic("hash__early_init_mmu: No MMU hash ops defined!\n");
+
/* Initialize the MMU Hash table and create the linear mapping
 * of memory. Has to be done before SLB initialization as this is
 * currently where the page size encoding is obtained.

Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header

2016-07-25 Thread Michael Ellerman
Benjamin Herrenschmidt  writes:

> On Mon, 2016-07-25 at 15:33 +1000, Michael Ellerman wrote:
>> When we detect a PS3 we set both PS3_LV1 and LPAR at the same time,
>> so
>> there should be no way they can get out of sync, other than due to a
>> bug in the code.
>
> I thought I had changed PS3 to no longer set LPAR ?

Nope:

FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1,

...

#ifdef CONFIG_PPC_PS3
/* Identify PS3 firmware */
if (of_flat_dt_is_compatible(of_get_flat_dt_root(), "sony,ps3"))
powerpc_firmware_features |= FW_FEATURE_PS3_POSSIBLE;
#endif

> I like having a flag that basically says PAPR and that's pretty much
> what LPAR is, in fact I think I've been using it elsewhere with that
> meaning

That would be nice, but these look fishy at least:

arch/powerpc/platforms/cell/spu_manage.c:   if 
(!firmware_has_feature(FW_FEATURE_LPAR))
arch/powerpc/platforms/cell/spu_manage.c:   if 
(!firmware_has_feature(FW_FEATURE_LPAR)) {
arch/powerpc/platforms/cell/spu_manage.c:   if 
(!firmware_has_feature(FW_FEATURE_LPAR))
arch/powerpc/platforms/pasemi/iommu.c:  
!firmware_has_feature(FW_FEATURE_LPAR)) {
drivers/net/ethernet/pasemi/pasemi_mac.c:   return 
firmware_has_feature(FW_FEATURE_LPAR);

cheers

Re: [PATCH 2/3] powerpc/mm: Rename hpte_init_lpar() & put fallback in a header

2016-07-25 Thread Benjamin Herrenschmidt
On Mon, 2016-07-25 at 15:33 +1000, Michael Ellerman wrote:
> When we detect a PS3 we set both PS3_LV1 and LPAR at the same time,
> so
> there should be no way they can get out of sync, other than due to a
> bug in the code.

I thought I had changed PS3 to no longer set LPAR ? I like having a
flag that basically says PAPR and that's pretty much what LPAR is,
in fact I think I've been using it elsewhere with that meaning

Cheers,
Ben.


RE: [PATCH v3 02/11] mm: Hardened usercopy

2016-07-25 Thread David Laight
From: Josh Poimboeuf
> Sent: 22 July 2016 18:46
..
> > >> +/*
> > >> + * Checks if a given pointer and length is contained by the current
> > >> + * stack frame (if possible).
> > >> + *
> > >> + *   0: not at all on the stack
> > >> + *   1: fully within a valid stack frame
> > >> + *   2: fully on the stack (when can't do frame-checking)
> > >> + *   -1: error condition (invalid stack position or bad stack frame)
> > >> + */
> > >> +static noinline int check_stack_object(const void *obj, unsigned long 
> > >> len)
> > >> +{
> > >> + const void * const stack = task_stack_page(current);
> > >> + const void * const stackend = stack + THREAD_SIZE;
> > >
> > > That allows access to the entire stack, including the struct thread_info,
> > > is that what we want - it seems dangerous? Or did I miss a check
> > > somewhere else?
> >
> > That seems like a nice improvement to make, yeah.
> >
> > > We have end_of_stack() which computes the end of the stack taking
> > > thread_info into account (end being the opposite of your end above).
> >
> > Amusingly, the object_is_on_stack() check in sched.h doesn't take
> > thread_info into account either. :P Regardless, I think using
> > end_of_stack() may not be best. To tighten the check, I think we could
> > add this after checking that the object is on the stack:
> >
> > #ifdef CONFIG_STACK_GROWSUP
> > stackend -= sizeof(struct thread_info);
> > #else
> > stack += sizeof(struct thread_info);
> > #endif
> >
> > e.g. then if the pointer was in the thread_info, the second test would
> > fail, triggering the protection.
> 
> FWIW, this won't work right on x86 after Andy's
> CONFIG_THREAD_INFO_IN_TASK patches get merged.

What ends up in the 'thread_info' area?
If it contains the fp save area then programs like gdb may end up requesting
copy_in/out directly from that area.

Interestingly the avx registers don't need saving on a normal system call
entry (they are all caller-saved) so the kernel stack can safely overwrite
that area.
Syscall entry probably ought to execute the 'zero all avx registers' 
instruction.
They do need saving on interrupt entry - but the stack used will be less.

David


[Patch v3 1/3] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2016-07-25 Thread Zhao Qiang
Move the driver from drivers/soc/fsl/qe to drivers/irqchip,
and merge qe_ic.h and qe_ic.c into irq-qeic.c.

Signed-off-by: Zhao Qiang 
---
Changes for v2:
- modify the subject and commit msg
Changes for v3:
- merge .h file to .c, rename it with irq-qeic.c

 drivers/irqchip/Makefile   |   1 +
 drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} |  82 +++-
 drivers/soc/fsl/qe/Makefile|   2 +-
 drivers/soc/fsl/qe/qe_ic.h | 103 -
 4 files changed, 83 insertions(+), 105 deletions(-)
 rename drivers/{soc/fsl/qe/qe_ic.c => irqchip/irq-qeic.c} (85%)
 delete mode 100644 drivers/soc/fsl/qe/qe_ic.h

diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 38853a1..cef999d 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -69,3 +69,4 @@ obj-$(CONFIG_PIC32_EVIC)  += irq-pic32-evic.o
 obj-$(CONFIG_MVEBU_ODMI)   += irq-mvebu-odmi.o
 obj-$(CONFIG_LS_SCFG_MSI)  += irq-ls-scfg-msi.o
 obj-$(CONFIG_EZNPS_GIC)+= irq-eznps.o
+obj-$(CONFIG_QUICC_ENGINE) += irq-qeic.o
diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/irqchip/irq-qeic.c
similarity index 85%
rename from drivers/soc/fsl/qe/qe_ic.c
rename to drivers/irqchip/irq-qeic.c
index ec2ca86..1f91225 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -30,7 +30,87 @@
 #include 
 #include 
 
-#include "qe_ic.h"
+#define NR_QE_IC_INTS  64
+
+/* QE IC registers offset */
+#define QEIC_CICR  0x00
+#define QEIC_CIVEC 0x04
+#define QEIC_CRIPNR0x08
+#define QEIC_CIPNR 0x0c
+#define QEIC_CIPXCC0x10
+#define QEIC_CIPYCC0x14
+#define QEIC_CIPWCC0x18
+#define QEIC_CIPZCC0x1c
+#define QEIC_CIMR  0x20
+#define QEIC_CRIMR 0x24
+#define QEIC_CICNR 0x28
+#define QEIC_CIPRTA0x30
+#define QEIC_CIPRTB0x34
+#define QEIC_CRICR 0x3c
+#define QEIC_CHIVEC0x60
+
+/* Interrupt priority registers */
+#define CIPCC_SHIFT_PRI0   29
+#define CIPCC_SHIFT_PRI1   26
+#define CIPCC_SHIFT_PRI2   23
+#define CIPCC_SHIFT_PRI3   20
+#define CIPCC_SHIFT_PRI4   13
+#define CIPCC_SHIFT_PRI5   10
+#define CIPCC_SHIFT_PRI6   7
+#define CIPCC_SHIFT_PRI7   4
+
+/* CICR priority modes */
+#define CICR_GWCC  0x0004
+#define CICR_GXCC  0x0002
+#define CICR_GYCC  0x0001
+#define CICR_GZCC  0x0008
+#define CICR_GRTA  0x0020
+#define CICR_GRTB  0x0040
+#define CICR_HPIT_SHIFT8
+#define CICR_HPIT_MASK 0x0300
+#define CICR_HP_SHIFT  24
+#define CICR_HP_MASK   0x3f00
+
+/* CICNR */
+#define CICNR_WCC1T_SHIFT  20
+#define CICNR_ZCC1T_SHIFT  28
+#define CICNR_YCC1T_SHIFT  12
+#define CICNR_XCC1T_SHIFT  4
+
+/* CRICR */
+#define CRICR_RTA1T_SHIFT  20
+#define CRICR_RTB1T_SHIFT  28
+
+/* Signal indicator */
+#define SIGNAL_MASK3
+#define SIGNAL_HIGH2
+#define SIGNAL_LOW 0
+
+struct qe_ic {
+   /* Control registers offset */
+   volatile u32 __iomem *regs;
+
+   /* The remapper for this QEIC */
+   struct irq_domain *irqhost;
+
+   /* The "linux" controller struct */
+   struct irq_chip hc_irq;
+
+   /* VIRQ numbers of QE high/low irqs */
+   unsigned int virq_high;
+   unsigned int virq_low;
+};
+
+/*
+ * QE interrupt controller internal structure
+ */
+struct qe_ic_info {
+   u32 mask; /* location of this source at the QIMR register. */
+   u32 mask_reg; /* Mask register offset */
+   u8  pri_code; /* for grouped interrupts sources - the interrupt
+code as appears at the group priority register */
+   u32 pri_reg;  /* Group priority register offset */
+};
 
 static DEFINE_RAW_SPINLOCK(qe_ic_lock);
 
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 2031d38..51e4726 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ b/drivers/soc/fsl/qe/Makefile
@@ -1,7 +1,7 @@
 #
 # Makefile for the linux ppc-specific parts of QE
 #
-obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
+obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_io.o
 obj-$(CONFIG_CPM)  += qe_common.o
 obj-$(CONFIG_UCC)  += ucc.o
 obj-$(CONFIG_UCC_SLOW) += ucc_slow.o
diff --git a/drivers/soc/fsl/qe/qe_ic.h b/drivers/soc/fsl/qe/qe_ic.h
deleted file mode 100644
index 926a2ed..000
--- a/drivers/soc/fsl/qe/qe_ic.h
+++ /dev/null
@@ -1,103 +0,0 @@
-/*
- * drivers/soc/fsl/qe/qe_ic.h
- *
- * QUICC ENGINE Interrupt Controller Header
- *
- * Copyright (C) 2006 Freescale Semiconductor, Inc. All rights reserved.
- *
- * Author: Li Yang 
- * Based on 

[Patch v3 2/3] irqchip/qeic: merge qeic init code from platforms to a common function

2016-07-25 Thread Zhao Qiang
The qe_ic init code duplicated across a variety of platforms is redundant;
merge it into a common function and put it in irqchip/irq-qeic.c.

For non-p1021_mds mpc85xx_mds boards, use "qe_ic_init(np, 0,
qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);" instead of
"qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);".

qe_ic_cascade_muxed_mpic was used for boards that have the same interrupt
number for the low and high interrupts; qe_ic_init already checks whether
"low interrupt == high interrupt".

Signed-off-by: Zhao Qiang 
---
Changes for v2:
- modify subject and commit msg
- add check for qeic by type
Changes for v3:
- na

 arch/powerpc/platforms/83xx/misc.c| 15 ---
 arch/powerpc/platforms/85xx/corenet_generic.c |  9 -
 arch/powerpc/platforms/85xx/mpc85xx_mds.c | 14 --
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 16 
 arch/powerpc/platforms/85xx/twr_p102x.c   | 14 --
 drivers/irqchip/irq-qeic.c| 16 
 6 files changed, 16 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c
index 7e923ca..9431fc7 100644
--- a/arch/powerpc/platforms/83xx/misc.c
+++ b/arch/powerpc/platforms/83xx/misc.c
@@ -93,24 +93,9 @@ void __init mpc83xx_ipic_init_IRQ(void)
 }
 
 #ifdef CONFIG_QUICC_ENGINE
-void __init mpc83xx_qe_init_IRQ(void)
-{
-   struct device_node *np;
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-   qe_ic_init(np, 0, qe_ic_cascade_low_ipic, qe_ic_cascade_high_ipic);
-   of_node_put(np);
-}
-
 void __init mpc83xx_ipic_and_qe_init_IRQ(void)
 {
mpc83xx_ipic_init_IRQ();
-   mpc83xx_qe_init_IRQ();
 }
 #endif /* CONFIG_QUICC_ENGINE */
 
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c b/arch/powerpc/platforms/85xx/corenet_generic.c
index a2b0bc8..526fc2b 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -41,8 +41,6 @@ void __init corenet_gen_pic_init(void)
unsigned int flags = MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU |
MPIC_NO_RESET;
 
-   struct device_node *np;
-
if (ppc_md.get_irq == mpic_get_coreint_irq)
flags |= MPIC_ENABLE_COREINT;
 
@@ -50,13 +48,6 @@ void __init corenet_gen_pic_init(void)
BUG_ON(mpic == NULL);
 
mpic_init(mpic);
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-   }
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index f61cbe2..7ae4901 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -279,20 +279,6 @@ static void __init mpc85xx_mds_qeic_init(void)
of_node_put(np);
return;
}
-
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return;
-   }
-
-   if (machine_is(p1021_mds))
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   else
-   qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);
-   of_node_put(np);
 }
 #else
 static void __init mpc85xx_mds_qe_init(void) { }
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 3f4dad1..779f54f 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -49,10 +49,6 @@ void __init mpc85xx_rdb_pic_init(void)
struct mpic *mpic;
unsigned long root = of_get_flat_dt_root();
 
-#ifdef CONFIG_QUICC_ENGINE
-   struct device_node *np;
-#endif
-
if (of_flat_dt_is_compatible(root, "fsl,MPC85XXRDB-CAMP")) {
mpic = mpic_alloc(NULL, 0, MPIC_NO_RESET |
MPIC_BIG_ENDIAN |
@@ -67,18 +63,6 @@ void __init mpc85xx_rdb_pic_init(void)
 
BUG_ON(mpic == NULL);
mpic_init(mpic);
-
-#ifdef CONFIG_QUICC_ENGINE
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (np) {
-   qe_ic_init(np, 0, qe_ic_cascade_low_mpic,
-   qe_ic_cascade_high_mpic);
-   of_node_put(np);
-
-   } else
-   pr_err("%s: Could not find qe-ic node\n", __func__);
-#endif
-
 }
 
 /*
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c b/arch/powerpc/platforms/85xx/twr_p102x.c
index 71bc255..603e244 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ 

[Patch v3 3/3] irqchip/qeic: merge qeic_of_init into qe_ic_init

2016-07-25 Thread Zhao Qiang
qeic_of_init just gets the qeic device_node from the dtb and passes it to
qe_ic_init. So merge qeic_of_init into qe_ic_init and look up the qeic
node there.

Signed-off-by: Zhao Qiang 
---
Changes for v2:
- modify subject and commit msg
- return 0 on success and put the node on return paths in qe_ic_init
Changes for v3:
- na

 drivers/irqchip/irq-qeic.c | 91 +-
 include/soc/fsl/qe/qe_ic.h |  7 
 2 files changed, 50 insertions(+), 48 deletions(-)

diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
index 1853fda..a0bf871 100644
--- a/drivers/irqchip/irq-qeic.c
+++ b/drivers/irqchip/irq-qeic.c
@@ -397,27 +397,38 @@ unsigned int qe_ic_get_high_irq(struct qe_ic *qe_ic)
return irq_linear_revmap(qe_ic->irqhost, irq);
 }
 
-void __init qe_ic_init(struct device_node *node, unsigned int flags,
-  void (*low_handler)(struct irq_desc *desc),
-  void (*high_handler)(struct irq_desc *desc))
+static int __init qe_ic_init(unsigned int flags)
 {
+   struct device_node *node;
struct qe_ic *qe_ic;
struct resource res;
-   u32 temp = 0, ret, high_active = 0;
+   u32 temp = 0, high_active = 0;
+   int ret = 0;
+
+   node = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
+   if (!node) {
+   node = of_find_node_by_type(NULL, "qeic");
+   if (!node)
+   return -ENODEV;
+   }
 
ret = of_address_to_resource(node, 0, &res);
-   if (ret)
-   return;
+   if (ret) {
+   ret = -ENODEV;
+   goto err_put_node;
+   }
 
qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
-   if (qe_ic == NULL)
-   return;
+   if (qe_ic == NULL) {
+   ret = -ENOMEM;
+   goto err_put_node;
+   }
 
qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
   &qe_ic_host_ops, qe_ic);
if (qe_ic->irqhost == NULL) {
-   kfree(qe_ic);
-   return;
+   ret = -ENOMEM;
+   goto err_free_qe_ic;
}
 
qe_ic->regs = ioremap(res.start, resource_size(&res));
@@ -428,9 +439,9 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
qe_ic->virq_low = irq_of_parse_and_map(node, 1);
 
if (qe_ic->virq_low == NO_IRQ) {
-   printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
-   kfree(qe_ic);
-   return;
+   pr_err("Failed to map QE_IC low IRQ\n");
+   ret = -ENOMEM;
+   goto err_domain_remove;
}
 
/* default priority scheme is grouped. If spread mode is*/
@@ -457,13 +468,24 @@ void __init qe_ic_init(struct device_node *node, unsigned int flags,
qe_ic_write(qe_ic->regs, QEIC_CICR, temp);
 
irq_set_handler_data(qe_ic->virq_low, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_low, low_handler);
+   irq_set_chained_handler(qe_ic->virq_low, qe_ic_cascade_low_mpic);
 
if (qe_ic->virq_high != NO_IRQ &&
qe_ic->virq_high != qe_ic->virq_low) {
irq_set_handler_data(qe_ic->virq_high, qe_ic);
-   irq_set_chained_handler(qe_ic->virq_high, high_handler);
+   irq_set_chained_handler(qe_ic->virq_high,
+   qe_ic_cascade_high_mpic);
}
+   of_node_put(node);
+   return 0;
+
+err_domain_remove:
+   irq_domain_remove(qe_ic->irqhost);
+err_free_qe_ic:
+   kfree(qe_ic);
+err_put_node:
+   of_node_put(node);
+   return ret;
 }
 
 void qe_ic_set_highest_priority(unsigned int virq, int high)
@@ -570,39 +592,26 @@ static struct device device_qe_ic = {
.bus = &qe_ic_subsys,
 };
 
-static int __init init_qe_ic_sysfs(void)
+static int __init init_qe_ic(void)
 {
-   int rc;
+   int ret;
 
-   printk(KERN_DEBUG "Registering qe_ic with sysfs...\n");
+   ret = qe_ic_init(0);
+   if (ret)
+   return ret;
 
-   rc = subsys_system_register(&qe_ic_subsys, NULL);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys class\n");
+   ret = subsys_system_register(&qe_ic_subsys, NULL);
+   if (ret) {
+   pr_err("Failed registering qe_ic sys class\n");
return -ENODEV;
}
-   rc = device_register(&device_qe_ic);
-   if (rc) {
-   printk(KERN_ERR "Failed registering qe_ic sys device\n");
+   ret = device_register(&device_qe_ic);
+   if (ret) {
+   pr_err("Failed registering qe_ic sys device\n");
return -ENODEV;
}
-   return 0;
-}
-
-static int __init qeic_of_init(void)
-{
-   struct device_node *np;
 
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = 

Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 18:36:09 +1000
Michael Ellerman  wrote:

> "Aneesh Kumar K.V"  writes:
> 
> > We want to use the static key based feature check in set_pte_at.
> > Since we call radix__map_kernel_page early in boot before jump
> > label is initialized we can't call set_pte_at there. Add
> > radix__set_pte for the same.  
> 
> Although this is an OK solution to this problem, I think it
> highlights a bigger problem, which is that we're still doing the
> feature patching too late.
> 
> If we can move the feature patching prior to MMU init, then all (or
> more of) these problems with pre vs post patching go away.
> 
> I'll see if I can come up with something tomorrow.

Agreed, that would be much nicer if you can make it work.

Thanks,
Nick
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [v4] powerpc: Export thread_struct.used_vr/used_vsr to user space

2016-07-25 Thread Simon Guo

On Thu, Jul 21, 2016 at 08:57:29PM +1000, Michael Ellerman wrote:
> Can one of you send a properly formatted and signed-off patch.

I will work on that.

Thanks,
Simon

Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init

2016-07-25 Thread Michael Ellerman
"Aneesh Kumar K.V"  writes:

> We want to use the static key based feature check in set_pte_at. Since
> we call radix__map_kernel_page early in boot before jump label is
> initialized we can't call set_pte_at there. Add radix__set_pte for the
> same.

Although this is an OK solution to this problem, I think it highlights a
bigger problem, which is that we're still doing the feature patching too
late.

If we can move the feature patching prior to MMU init, then all (or more
of) these problems with pre vs post patching go away.

I'll see if I can come up with something tomorrow.

cheers

Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init

2016-07-25 Thread Michael Ellerman
Nicholas Piggin  writes:

> On Sat, 23 Jul 2016 14:42:36 +0530
> "Aneesh Kumar K.V"  wrote:
>> @@ -102,7 +123,7 @@ int radix__map_kernel_page(unsigned long ea,
>> unsigned long pa, }
>>  
>>  set_the_pte:
>> -set_pte_at(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
>> flags));
>> +radix__set_pte(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
>> flags)); smp_wmb();
>
> What we have in existing code is set_pte_at() function that adds
> the _PAGE_PTE bit, then calls __set_pte_at(), which calls radix or hash
> version of __set_pte_at().
>
> Now we also have radix__set_pte(), which has the function of the
> set_pte_at(), which is starting to confuse the naming convention.
> The new function is a radix-only set_pte_at(), rather than the
> radix implementation that gets called via set_pte().
>
> set_pte_at_radix()? That kind of sucks too, though. It might be better
> if the radix/hash variants were called __radix__set_pte_at(), and this
> new function was called radix__set_pte_at().

I think Aneesh originally used set_pte_at_r() or maybe rset_pte_at()?

It was my idea to use radix__ and hash__ as prefixes for all the
radix/hash functions.

That was 1) to make it clear that it's not part of the name as such, ie.
it's a prefix, and 2) because it's ugly as hell and hopefully that would
motivate us to consolidate as many of them as possible.

I balked at adding __radix__set_pte_at(), and just went with
radix__set_pte_at(). But it does complicate things now.

In fact I think we need to rethink this whole series, and not actually
do it this way at all, meaning this naming problem will go away.

cheers

Re: [PATCH v2] irqchip/qeic: move qeic driver from drivers/soc/fsl/qe

2016-07-25 Thread Scott Wood
On Mon, 2016-07-25 at 06:15 +, Qiang Zhao wrote:
> On Thu, Jul 07, 2016 at 10:25PM , Jason Cooper  wrote:
> > 
> > -Original Message-
> > From: Jason Cooper [mailto:ja...@lakedaemon.net]
> > Sent: Thursday, July 07, 2016 10:25 PM
> > To: Qiang Zhao 
> > Cc: o...@buserror.net; t...@linutronix.de; marc.zyng...@arm.com; linuxppc-
> > d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Xiaobo Xie
> > 
> > Subject: Re: [PATCH v2] irqchip/qeic: move qeic driver from
> > drivers/soc/fsl/qe
> > 
> > Hi Zhao Qiang,
> > 
> > On Thu, Jul 07, 2016 at 09:23:55AM +0800, Zhao Qiang wrote:
> > > 
> > > The driver stays the same.
> > > 
> > > Signed-off-by: Zhao Qiang 
> > > ---
> > > Changes for v2:
> > >   - modify the subject and commit msg
> > > 
> > >  drivers/irqchip/Makefile| 1 +
> > >  drivers/{soc/fsl/qe => irqchip}/qe_ic.c | 0  drivers/{soc/fsl/qe =>
> > > irqchip}/qe_ic.h | 0
> > >  drivers/soc/fsl/qe/Makefile | 2 +-
> > >  4 files changed, 2 insertions(+), 1 deletion(-)  rename
> > > drivers/{soc/fsl/qe => irqchip}/qe_ic.c (100%)  rename
> > > drivers/{soc/fsl/qe => irqchip}/qe_ic.h (100%)
> > Please merge the include file into the C file and rename to follow the
> > naming
> > convention in drivers/irqchip/.  e.g. irq-qeic.c or irq-qe_ic.c.
> > 
> > Once you have that, please resend the entire series with this as the first
> > patch.
> Sorry, I have no idea about "Include file", could you explain which file you
> meant?


qe_ic.h

If nothing else is going to include that, then the contents can go directly
into qe_ic.c.

-Scott


Re: [PATCH for-4.8 V2 00/10] Use jump label for cpu/mmu_has_feature

2016-07-25 Thread Nicholas Piggin
On Mon, 25 Jul 2016 11:55:50 +0530
"Aneesh Kumar K.V"  wrote:

> Nicholas Piggin  writes:
> 
> > On Sat, 23 Jul 2016 14:42:33 +0530
> > "Aneesh Kumar K.V"  wrote:
> >  
> >> Changes from V1:
> >> * Update "powerpc/mm: Convert early cpu/mmu feature check to use
> >> the new helpers" based on resend code changes in this area.
> >> 
> >> We now do feature fixup early and hence we can reduce the usage of
> >>  __cpu/__mmu_has_feature.  
> >
> > Is there a particular reason for for-4.8?
> >
> > I've only just started following this development so it might be
> > obvious, but if you could add some small justifications for why
> > a patch or series is done, it would be of great help to me.  
> 
> The goal is to reduce the impact of radix series on existing MMU
> function. With radix series, we do
> 
> if (radix_enabled())
> radix_function()
> else
> hash_function()
> 
> We did try to reduce the impact in most code path like linux page
> table accessors by moving linux pte bits around to match the
> radix/hardware requirements. But we still have other code paths where
> we do the above conditional.
> 
> Now for-4.8 is mainly because, I was trying to make sure 4.8 release
> will have a good performing radix/hash implementation which distros
> can base their kernel on. This series was posted to external list
> multiple times and I didn't receive many objections to the series.
> Hence I was thinking it to be a good idea to get it upstream by 4.8.

Thanks, I was just curious. I don't have an objection.

It would be a bigger change, but it might be nice to do alternate
patching for some of these, so we could even avoid the branch for
the radix case in some of the critical functions. That's something
for later though.

Thanks,
Nick

Re: [PATCH for-4.8 V2 08/10] powerpc: use the jump label for cpu_has_feature

2016-07-25 Thread Nicholas Piggin
On Sat, 23 Jul 2016 14:42:41 +0530
"Aneesh Kumar K.V"  wrote:

> From: Kevin Hao 
> 
> The cpu features are fixed once the probe of cpu features are done.
> And the function cpu_has_feature() does be used in some hot path.
> The checking of the cpu features for each time of invoking of
> cpu_has_feature() seems suboptimal. This tries to reduce this
> overhead of this check by using jump label.
> 
> The generated assemble code of the following c program:
>   if (cpu_has_feature(CPU_FTR_XXX))
>   xxx()
> 
> Before:
>   lis r9,-16230
>   lwz r9,12324(r9)
>   lwz r9,12(r9)
>   andi.   r10,r9,512
>   beqlr-
> 
> After:
>   nop if CPU_FTR_XXX is enabled
>   b xxx   if CPU_FTR_XXX is not enabled
> 
> Signed-off-by: Kevin Hao 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/cpufeatures.h | 21 +
>  arch/powerpc/include/asm/cputable.h|  8 
>  arch/powerpc/kernel/cputable.c | 20 
>  arch/powerpc/lib/feature-fixups.c  |  1 +
>  4 files changed, 50 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/cpufeatures.h
> b/arch/powerpc/include/asm/cpufeatures.h index
> bfa6cb8f5629..4a4a0b898463 100644 ---
> a/arch/powerpc/include/asm/cpufeatures.h +++
> b/arch/powerpc/include/asm/cpufeatures.h @@ -13,10 +13,31 @@ static
> inline bool __cpu_has_feature(unsigned long feature)
> return !!(CPU_FTRS_POSSIBLE & cur_cpu_spec->cpu_features & feature); }
>  
> +#ifdef CONFIG_JUMP_LABEL
> +#include 
> +
> +extern struct static_key_true cpu_feat_keys[MAX_CPU_FEATURES];
> +
> +static __always_inline bool cpu_has_feature(unsigned long feature)
> +{
> + int i;
> +
> + if (CPU_FTRS_ALWAYS & feature)
> + return true;
> +
> + if (!(CPU_FTRS_POSSIBLE & feature))
> + return false;
> +
> + i = __builtin_ctzl(feature);
> + return static_branch_likely(&cpu_feat_keys[i]);
> +}

Is feature ever not-constant, or could it ever be, I wonder? We could
do a build time check to ensure it is always constant?

Or alternatively, make non-constant cases skip the first two tests?

Thanks,
Nick

Re: [PATCH for-4.8 V2 00/10] Use jump label for cpu/mmu_has_feature

2016-07-25 Thread Aneesh Kumar K.V
Nicholas Piggin  writes:

> On Sat, 23 Jul 2016 14:42:33 +0530
> "Aneesh Kumar K.V"  wrote:
>
>> Changes from V1:
>> * Update "powerpc/mm: Convert early cpu/mmu feature check to use the
>> new helpers" based on resend code changes in this area.
>> 
>> We now do feature fixup early and hence we can reduce the usage of
>>  __cpu/__mmu_has_feature.
>
> Is there a particular reason for for-4.8?
>
> I've only just started following this development so it might be
> obvious, but if you could add some small justifications for why
> a patch or series is done, it would be of great help to me.

The goal is to reduce the impact of radix series on existing MMU
function. With radix series, we do

if (radix_enabled())
radix_function()
else
hash_function()

We did try to reduce the impact in most code path like linux page table
accessors by moving linux pte bits around to match the radix/hardware
requirements. But we still have other code paths where we do the above
conditional.

Now for-4.8 is mainly because, I was trying to make sure 4.8 release
will have a good performing radix/hash implementation which distros can
base their kernel on. This series was posted to external list multiple
times and I didn't receive many objections to the series. Hence I was
thinking it to be a good idea to get it upstream by 4.8.
 
-aneesh


Re: [PATCH for-4.8 V2 03/10] powerpc/mm/radix: Add radix_set_pte to use in early init

2016-07-25 Thread Nicholas Piggin
On Sat, 23 Jul 2016 14:42:36 +0530
"Aneesh Kumar K.V"  wrote:

> We want to use the static key based feature check in set_pte_at. Since
> we call radix__map_kernel_page early in boot before jump label is
> initialized we can't call set_pte_at there. Add radix__set_pte for the
> same.
> 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/mm/pgtable-radix.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/pgtable-radix.c
> b/arch/powerpc/mm/pgtable-radix.c index 003ff48a11b6..6d2eb76b508e
> 100644 --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -39,6 +39,27 @@ static __ref void *early_alloc_pgtable(unsigned
> long size) 
>   return pt;
>  }
> +/*
> + * set_pte stores a linux PTE into the linux page table.
> + */
> +static void radix__set_pte(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep,
> +pte_t pte)
> +{
> + /*
> +  * When handling numa faults, we already have the pte marked
> +  * _PAGE_PRESENT, but we can be sure that it is not in hpte.
> +  * Hence we can use set_pte_at for them.
> +  */
> + VM_WARN_ON(pte_present(*ptep) && !pte_protnone(*ptep));
> +
> + /*
> +  * Add the pte bit when trying to set a pte
> +  */
> + pte = __pte(pte_val(pte) | _PAGE_PTE);
> +
> + /* Perform the setting of the PTE */
> + radix__set_pte_at(mm, addr, ptep, pte, 0);
> +}
>  
>  int radix__map_kernel_page(unsigned long ea, unsigned long pa,
> pgprot_t flags,
> @@ -102,7 +123,7 @@ int radix__map_kernel_page(unsigned long ea,
> unsigned long pa, }
>  
>  set_the_pte:
> - set_pte_at(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
> flags));
> + radix__set_pte(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
> flags)); smp_wmb();

What we have in existing code is set_pte_at() function that adds
the _PAGE_PTE bit, then calls __set_pte_at(), which calls radix or hash
version of __set_pte_at().

Now we also have radix__set_pte(), which has the function of the
set_pte_at(), which is starting to confuse the naming convention.
The new function is a radix-only set_pte_at(), rather than the
radix implementation that gets called via set_pte().

set_pte_at_radix()? That kind of sucks too, though. It might be better
if the radix/hash variants were called __radix__set_pte_at(), and this
new function was called radix__set_pte_at().

Thanks,
Nick

Re: [PATCH 1/3] powerpc/mm: Fix build break due when PPC_NATIVE=n

2016-07-25 Thread Michael Ellerman
Stephen Rothwell  writes:

> Hi Michael,
>
> On Mon, 25 Jul 2016 12:57:49 +1000 Michael Ellerman  
> wrote:
>>
>> The recent commit to rework the hash MMU setup broke the build when
>> CONFIG_PPC_NATIVE=n. Fix it by providing a fallback implementation of
>> hpte_init_native().
>
> Alternatively, you could make the call site dependent on
> IS_ENABLED(CONFIG_PPC_NATIVE) and not need the fallback.
>
> so:
>
>   else if (IS_ENABLED(CONFIG_PPC_NATIVE))
>   hpte_init_native();
>
> in arch/powerpc/mm/hash_utils_64.c and let the compiler elide the call.

That would mean we might fall through and not assign any ops, so I think
it's preferable to have a fallback that explicitly panics().

cheers