Re: [1/3] powerpc/pmac: Fix DT refcount imbalance in pmac_pic_probe_oldstyle

2015-01-30 Thread Geert Uytterhoeven
Hi Michael,

On Fri, Jan 30, 2015 at 5:09 AM, Michael Ellerman wrote:
> On Wed, 2015-01-14 at 13:51:57 UTC, Geert Uytterhoeven wrote:
>> of_find_node_by_name() calls of_node_put() on its "from" parameter,
>> which must not be done on "master", as it's still in use, and will be
>> released manually later.  This may cause a zero kref refcount.
>> Use of_get_child_by_name() instead to fix this.
>
> But of_find_node_by_name() searches *all* nodes, not just the children of the
> parameter.

That's correct. However, I guess the second mac-io will just be a direct child.

> So this is a logic change AFAICS, and I have no idea what machines we'd need
> to test on to check it.

Originally it comes from arch/ppc/platforms/pmac_pic.c, added in 2002 in
full-history-linux commit 5ea3254844ae344a
("Import arch/ppc and include/asm-ppc changes from linuxppc_2_5 tree").

I've also checked my linuxppc mail archives from 1997-2002, but couldn't find
the actual patch and a description.

So I don't know on which machines it's needed.

> So I think an of_node_get(master) would be safer and also fix the refcounting.

If no one can confirm the above, that may indeed be the best solution.
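
To make the refcount argument above concrete, here is a minimal user-space
sketch in C. It is a toy model, not kernel code: struct node, node_get(),
node_put(), find_node_by_name() and get_child_by_name() are hypothetical
stand-ins for struct device_node, of_node_get(), of_node_put(),
of_find_node_by_name() and of_get_child_by_name(), and the model assumes a
non-NULL "from" and a simple linked list of all nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct device_node: a name, a refcount and links. */
struct node {
    const char *name;
    int refcount;
    struct node *child;    /* first direct child */
    struct node *sibling;  /* next sibling under the same parent */
    struct node *allnext;  /* next node in the global list of all nodes */
};

static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void node_put(struct node *n)
{
    if (n)
        n->refcount--;
}

/* Models of_find_node_by_name(): scans ALL nodes after "from" and drops
 * one reference on "from" before returning. */
static struct node *find_node_by_name(struct node *from, const char *name)
{
    struct node *n;

    assert(from != NULL);  /* simplification of this toy model */
    for (n = from->allnext; n; n = n->allnext)
        if (strcmp(n->name, name) == 0)
            break;
    node_get(n);
    node_put(from);
    return n;
}

/* Models of_get_child_by_name(): looks only at direct children and
 * leaves the parent's refcount untouched. */
static struct node *get_child_by_name(struct node *parent, const char *name)
{
    struct node *n;

    assert(parent != NULL);
    for (n = parent->child; n; n = n->sibling)
        if (strcmp(n->name, name) == 0)
            break;
    return node_get(n);
}
```

In this model, calling find_node_by_name(master, ...) directly drops the
reference that is still needed for "master", while either
find_node_by_name(node_get(master), ...) or get_child_by_name(master, ...)
leaves it balanced. Only the first of those two keeps the "search all nodes"
behaviour.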

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)

2015-01-30 Thread Gabriel Paubert
On Fri, Jan 30, 2015 at 05:37:29AM +, Markus Stockhausen wrote:
> > From: Scott Wood [scottw...@freescale.com]
> > Sent: Friday, January 30, 2015 01:49
> > To: Markus Stockhausen
> > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> > 
> > On Wed, 2015-01-28 at 05:00 +, Markus Stockhausen wrote:
> > > > > From: Scott Wood [scottw...@freescale.com]
> > > > > Sent: Wednesday, January 28, 2015 05:21
> > > > > To: Markus Stockhausen
> > > > > Cc: Michael Ellerman; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > > > > Subject: Re: SPE & Interrupt context (was how to make use of SPE instructions)
> > > > >
> > > > > Hi Scott,
> > > > >
> > > > > > thanks for your helpful feedback. As you might have seen I sent a first
> > > > > > patch for the sha256 kernel module that takes care of preemption.
> > > > > >
> > > > > > Herbert Xu noticed that my module won't run for IPsec as all
> > > > > work will be done from interrupt context. Do you have a tip how I can
> > > > > mitigate the check I implemented:
> > > > >
> > > > > static bool spe_usable(void)
> > > > > {
> > > > >   return !in_interrupt();
> > > > > }
> > > > >
> > > > > Intel guys have something like that
> > > > >
> > > > > bool irq_fpu_usable(void)
> > > > > {
> > > > >   return !in_interrupt() ||
> > > > > interrupted_user_mode() ||
> > > > > interrupted_kernel_fpu_idle();
> > > > > }
> > > > >
> > > > > But I have no idea how to transfer it to the PPC/SPE case.
> > > >
> > > > I'm not sure what sort of tip you're looking for, other than
> > > > implementing it myself. :-)
> > >
> > > Hi Scott,
> > >
> > > maybe I did not explain it correctly. interrupted_kernel_fpu_idle()
> > > is x86 specific. The same applies to interrupted_user_mode().
> > > I'm just searching for a similar feature in the PPC/SPE world.
> > 
> > There isn't one.
> > 
> > > I can see that enable_kernel_spe() does something with the
> > > MSR_SPE flag, but I have no idea  how to determine if I'm allowed
> > > to enable SPE although I'm inside an interrupt context.
> > 
> > As with x86, you'd want to check whether the kernel interrupted
> > userspace.  I don't know what x86 is doing with TS, but on PPC you might
> > check whether the interrupted thread had MSR_FP enabled.
> > 
> > > I'm asking because from the previous posts I conclude that
> > > running SPE instructions inside an interrupt might be critical.
> > > Because of registers not being saved?
> > 
> > Yes.  Currently callers of enable_kernel_spe() only need to disable
> > preemption, not interrupts.
> > 
> > > Or can I just save the register contents myself and interrupt
> > > context is no longer a showstopper?
> > 
> > If you only need a small number of registers that might be reasonable,
> > but if you need a bunch then you don't want to save them when you don't
> > have to.
> > 
> > Another option is to change enable_kernel_spe() to require interrupts to
> > be disabled.
> 
> Phew, that is going deeper than I expected. 
> 
> I'm a newbie in the topic of interrupts and FPU/SPE registers. Nevertheless
> enforcing enable_kernel_spe() to only be available outside of interrupt
> context sounds too restrictive for me. Also checking for thread/CPU flags 
> of an interrupted process is nothing I can or want to implement. There
> might be the risk that I'm starting something that will be too complex
> for me.
> 
> BUT! Given the fact that SPE registers are only extended GPRs and my
> algorithm needs just 10 of them I can live with the following design.
> 
> - I must already save several non-volatile registers. Putting the 64 bit values
> into them would require me to save their contents with evstdd instead of
> stw. Of course stack alignment to 8 bytes required. So only a few alignment
> instructions needed additionally during initialization.

On most PPC ABIs the stack is guaranteed to be aligned to a 16 byte
boundary. In some it may be only 8, but I can't remember any with only
4 byte alignment.

I checked my 32 bit kernel images with:

objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u

and the stack seems to always be 16 byte aligned.
For 64 bit, use stdu instead of stwu.

I've also found a few stwux/stdux which are hopefully known
to be harmless.

> 
> - During function cleanup I will restore the registers the same way.
> 
> - In case I interrupted myself, I might have saved sensitive data of another
> thread on my stack. So I will zero that area after I restored the registers.
> That needs an additional 10 instructions. In contrast to ~2000 instructions
> for one sha256 round that should be negligible.
> 
> This little overhead will save me lots of trouble at other locations:
> 
> - I can avoid checking for an interrupt context.
> 
> - I don't need a fallback to the generic implementation. 
> 
> Thinking about it more and more I think performance will stay the same.
> C

AW: AW: SPE & Interrupt context (was how to make use of SPE instructions)

2015-01-30 Thread Markus Stockhausen
> From: Gabriel Paubert [paub...@iram.es]
> Sent: Friday, January 30, 2015 09:49
> To: Markus Stockhausen
> Cc: Scott Wood; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
>
> > ...
> > - I must already save several non-volatile registers. Putting the 64 bit values
> > into them would require me to save their contents with evstdd instead of
> > stw. Of course stack alignment to 8 bytes required. So only a few alignment
> > instructions needed additionally during initialization.
> 
> On most PPC ABI the stack is guaranteed to be aligned to a 16 byte
> boundary. In some it may be only 8, but I can't remember any 4 byte
> only alignment.
> 
> I checked my 32 bit kernel images with:
> 
> objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> 
> and the stack seems to always be 16 byte aligned.
> For 64 bit, use stdu instead of stwu.
> 
> I've also found a few stwux/stdux which are hopefully known
> to be harmless.
>
> Gabriel

A helpful annotation. But now I'm unsure about function usage. SPE seems to be
32bit only and I would use their evxxx instructions. Do you think the following
sequence will be the right way? 

_GLOBAL(ppc_spe_sha256_transform)
  stwu    r1,-128(r1);    /* create stack frame             */
  stw     r24,8(r1);      /* save normal registers          */
  stw     r25,12(r1);
  evstdw  r14,16(r1);     /* We must save non volatile      */
  evstdw  r15,24(r1);     /* registers. Take the chance     */
  evstdw  r16,32(r1);     /* and save the SPE part too      */
  ...
  lwz     r24,8(r1);      /* restore normal registers       */
  lwz     r25,12(r1);
  evldw   r14,16(r1);     /* restore non-v. + SPE registers */
  evldw   r15,24(r1);
  evldw   r16,32(r1);
  addi    r1,r1,128;      /* cleanup stack frame            */

Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1) 
plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?

Markus

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497



[PATCH] i2c/mpc: Fix ISR return value

2015-01-30 Thread Amit Tomar
ISR should not return IRQ_HANDLED for not handling anything. 
This patch fixes the return value of ISR for the same case.


Signed-off-by: Amit Singh Tomar 
---
drivers/i2c/busses/i2c-mpc.c |    3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 0edf630..7a3136f 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -95,8 +95,9 @@ static irqreturn_t mpc_i2c_isr(int irq, void *dev_id)
 		i2c->interrupt = readb(i2c->base + MPC_I2C_SR);
 		writeb(0, i2c->base + MPC_I2C_SR);
 		wake_up(&i2c->queue);
+		return IRQ_HANDLED;
 	}
-	return IRQ_HANDLED;
+	return IRQ_NONE;
 }

/* Sometimes 9th clock pulse isn't generated, and slave doesn't release
--
1.7.9.5
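
The behaviour the patch aims for can be sketched in user-space C as follows.
This is only an illustration under stated assumptions: struct fake_i2c,
fake_i2c_isr() and the SR_MIF bit value are hypothetical, and the enum stands
in for the kernel's irqreturn_t values. The point is that the handler reports
IRQ_HANDLED only when the device actually had an interrupt pending, which
matters on shared interrupt lines.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel's irqreturn_t values. */
enum irq_ret { IRQ_NONE = 0, IRQ_HANDLED = 1 };

#define SR_MIF 0x02  /* hypothetical "interrupt pending" status bit */

struct fake_i2c {
    uint8_t status_reg;  /* models the device status register */
    uint8_t interrupt;   /* status latched for the waiting thread */
    int wakeups;         /* counts how often a waiter was woken */
};

static enum irq_ret fake_i2c_isr(struct fake_i2c *i2c)
{
    if (i2c->status_reg & SR_MIF) {
        i2c->interrupt = i2c->status_reg;  /* latch the status */
        i2c->status_reg = 0;               /* acknowledge the interrupt */
        i2c->wakeups++;                    /* stands in for wake_up() */
        return IRQ_HANDLED;
    }
    /* Nothing pending on this device: let other handlers have a look. */
    return IRQ_NONE;
}
```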
 

Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)

2015-01-30 Thread Gabriel Paubert
On Fri, Jan 30, 2015 at 09:39:41AM +, Markus Stockhausen wrote:
> > From: Gabriel Paubert [paub...@iram.es]
> > Sent: Friday, January 30, 2015 09:49
> > To: Markus Stockhausen
> > Cc: Scott Wood; linuxppc-dev@lists.ozlabs.org; Herbert Xu
> > Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
> >
> > > ...
> > > - I must already save several non-volatile registers. Putting the 64 bit values
> > > into them would require me to save their contents with evstdd instead of
> > > stw. Of course stack alignment to 8 bytes required. So only a few alignment
> > > instructions needed additionally during initialization.
> > 
> > On most PPC ABI the stack is guaranteed to be aligned to a 16 byte
> > boundary. In some it may be only 8, but I can't remember any 4 byte
> > only alignment.
> > 
> > I checked my 32 bit kernel images with:
> > 
> > objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> > 
> > and the stack seems to always be 16 byte aligned.
> > For 64 bit, use stdu instead of stwu.
> > 
> > I've also found a few stwux/stdux which are hopefully known
> > to be harmless.
> >
> > Gabriel
> 
> A helpful annotation. But now I'm unsure about function usage. SPE seems to be
> 32bit only and I would use their evxxx instructions. Do you think the following
> sequence will be the right way?
> 
> _GLOBAL(ppc_spe_sha256_transform)
>   stwu    r1,-128(r1);    /* create stack frame             */
>   stw     r24,8(r1);      /* save normal registers          */
>   stw     r25,12(r1);
>   evstdw  r14,16(r1);     /* We must save non volatile      */
>   evstdw  r15,24(r1);     /* registers. Take the chance     */
>   evstdw  r16,32(r1);     /* and save the SPE part too      */
>   ...
>   lwz     r24,8(r1);      /* restore normal registers       */
>   lwz     r25,12(r1);
>   evldw   r14,16(r1);     /* restore non-v. + SPE registers */
>   evldw   r15,24(r1);
>   evldw   r16,32(r1);
>   addi    r1,r1,128;      /* cleanup stack frame            */
> 

Yes. But there is also probably a status/control register somewhere that
you might need to save/restore, unless it is never used and/or affected by the
instructions you use.

> Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1)
> plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?
> 

From what I understand INT_FRAME_SIZE is for interrupt entry code. This
is not the case for your code, which is a standard function except for
the fact that it clobbers the upper 32 bits of some registers by using
SPE instructions. Therefore INT_FRAME_SIZE is overkill. I also believe that
you can save the registers as you suggest, no need to split it into
the high and low part.

By the way, I wonder where the SAVE_EVR/REST_EVR macros are used. I only
see the definitions, no use in a 3.18 source tree.

Gabriel

Re: [PATCH v1 0/3] SHA256 for PPC/SPE

2015-01-30 Thread Conor O'Gorman

On 24/01/15 21:10, Markus Stockhausen wrote:
> [PATCH v1 0/3] SHA256 for PPC/SPE
>
> The following patches add support for SIMD accelerated SHA256
> calculation on PPC processors with SPE instruction set. The

Nice boost.

Many of the SoCs with e500 core have crypto hardware accelerators. How 
does this compare?


Thanks,
Conor



AW: [PATCH v1 0/3] SHA256 for PPC/SPE

2015-01-30 Thread Markus Stockhausen
> From: Conor O'Gorman [i...@conorogorman.net]
> Sent: Friday, January 30, 2015 13:02
> To: Markus Stockhausen; linux-cry...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH v1 0/3] SHA256 for PPC/SPE
> 
> On 24/01/15 21:10, Markus Stockhausen wrote:
> > [PATCH v1 0/3] SHA256 for PPC/SPE
> >
> > The following patches add support for SIMD accelerated SHA256
> > calculation on PPC processors with SPE instruction set. The
> 
> Nice boost.
> 
> Many of the SoCs with e500 core have crypto hardware accelerators. How
> does this compare?

May sound stupid but I failed to activate the device in my TP-Link WDR4900.
The whole story here: https://community.freescale.com/message/475816

Markus


Re: [PATCH] i2c/mpc: Fix ISR return value

2015-01-30 Thread Danielle Costantino
I have been using the driver with this modification for the past 6 months
and it has been stable in an industrial environment. I had made a few other
changes that also improve reliability (using ppc in_8 and out_8 and eieio
barriers to ensure in-order execution). This lets you remove the unneeded
double read of the status register. I also added a more robust recovery
function to handle forcing bus mastership and clearing the arbitration-lost
interrupt that is generated. Currently this can cause the ISR to trigger
and produce superfluous interrupts. I have not posted this patch because of
the extensive changes.

I will ack this patch.

On Fri, Jan 30, 2015 at 2:24 AM, Amit Tomar wrote:

> ISR should not return IRQ_HANDLED for not handling anything.
> This patch fixes the return value of ISR for the same case.
>
>
> Signed-off-by: Amit Singh Tomar 
> ---
> drivers/i2c/busses/i2c-mpc.c |3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
> index 0edf630..7a3136f 100644
> --- a/drivers/i2c/busses/i2c-mpc.c
> +++ b/drivers/i2c/busses/i2c-mpc.c
> @@ -95,8 +95,9 @@ static irqreturn_t mpc_i2c_isr(int irq, void *dev_id)
> i2c->interrupt = readb(i2c->base + MPC_I2C_SR);
> writeb(0, i2c->base + MPC_I2C_SR);
> wake_up(&i2c->queue);
> +   return IRQ_HANDLED;
> }
> -  return IRQ_HANDLED;
> +  return IRQ_NONE;
> }
>
> /* Sometimes 9th clock pulse isn't generated, and slave doesn't release
> --
> 1.7.9.5
>



-- 
- Danielle Costantino

[PATCH] powerpc/mm: bail out early when flushing TLB page

2015-01-30 Thread Arseny Solokha
MMU_NO_CONTEXT is conditionally defined as 0 or (unsigned int)-1. However,
in __flush_tlb_page() a corresponding variable is only tested for open
coded 0, which can cause NULL pointer dereference if `mm' argument was
legitimately passed as such.

Bail out early in case the first argument is NULL, thus eliminate confusion
between different values of MMU_NO_CONTEXT and avoid disabling and then
re-enabling preemption unnecessarily.

Signed-off-by: Arseny Solokha 
---
 arch/powerpc/mm/tlb_nohash.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index f38ea4d..ab0616b 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -284,8 +284,11 @@ void __flush_tlb_page(struct mm_struct *mm, unsigned long vmaddr,
 	struct cpumask *cpu_mask;
 	unsigned int pid;
 
+	if (unlikely(!mm))
+		return;
+
 	preempt_disable();
-	pid = mm ? mm->context.id : 0;
+	pid = mm->context.id;
 	if (unlikely(pid == MMU_NO_CONTEXT))
 		goto bail;
 	cpu_mask = mm_cpumask(mm);
-- 
2.2.2
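
The control flow after the patch can be modelled in plain C as below. This is
a hedged user-space sketch, not the kernel function: struct mm_model and
flush_page_model() are hypothetical names, MMU_NO_CONTEXT is fixed to the
(unsigned int)-1 variant mentioned in the description, and reading cpu_mask
stands in for mm_cpumask(mm). With that value, the old "pid = 0 for a NULL
mm" fallback would no longer compare equal to MMU_NO_CONTEXT, so the early
NULL check is what keeps the later dereference safe.

```c
#include <assert.h>
#include <stddef.h>

#define MMU_NO_CONTEXT ((unsigned int)-1)  /* one of the two possible values */

struct mm_model {
    unsigned int context_id;  /* models mm->context.id */
    int cpu_mask;             /* models the data behind mm_cpumask(mm) */
};

/* Returns 1 if a flush would be issued, 0 if the function bailed out. */
static int flush_page_model(const struct mm_model *mm)
{
    unsigned int pid;

    if (mm == NULL)  /* bail out early, before touching mm at all */
        return 0;

    pid = mm->context_id;
    if (pid == MMU_NO_CONTEXT)  /* no context allocated, nothing to flush */
        return 0;

    (void)mm->cpu_mask;  /* safe: mm is known to be non-NULL here */
    return 1;
}
```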


Re: [v5, 5/6] powerpc/mpc85xx: Add FSL QorIQ DPAA BMan support to device tree(s)

2015-01-30 Thread Emil Medve
Hello Scott,


On 01/29/2015 11:03 PM, Scott Wood wrote:
> On Mon, Dec 08, 2014 at 04:29:20AM -0600, Emil Medve wrote:
>> From: Kumar Gala 
>>
>> Change-Id: If643fa5ba0a903aef8f5056a2c90ebecc995b760
>> Signed-off-by: Kumar Gala 
>> Signed-off-by: Geoff Thorpe 
>> Signed-off-by: Hai-Ying Wang 
>> Signed-off-by: Chunhe Lan 
>> Signed-off-by: Poonam Aggrwal 
>> [Emil Medve: Sync with the upstream binding]
>> Signed-off-by: Emil Medve 
> 
> Doesn't apply cleanly
> 
>> @@ -408,6 +415,8 @@ crypto: crypto@30 {
>>  fsl,iommu-parent = <&pamu1>;
>>  };
>>  
>> +/include/ "qoriq-bman1.dtsi"
>> +
>>  /include/ "qoriq-fman-0.dtsi"
>>  /include/ "qoriq-fman-0-1g-0.dtsi"
>>  /include/ "qoriq-fman-0-1g-1.dtsi"
> 
> What tree did you base these patches on?  There's no fman in the upstream
> device trees yet (just a binding).

They were based on this patch series:
http://patchwork.ozlabs.org/patch/370866. Will re-send


Cheers,

[PATCH v2 0/3] SHA256 for PPC/SPE

2015-01-30 Thread Markus Stockhausen
[PATCH v2 0/3] SHA256 for PPC/SPE

The following patches add support for SIMD accelerated SHA256 
calculation on PPC processors with SPE instruction set. The 
implementation takes care of the following constraints:

- independent of processor endianness
- save SPE registers for interrupt context compatibility
- disable preemption only for short intervals

Performance numbers from insmod tcrypt sec=3 mode=304 taken
on e500v2 800 MHz (TP Link WDR4900)

data    per     generic     this patch  speedup  cycles
length  update  bytes/sec   bytes/sec   factor   per byte
------  ------  ----------  ----------  -------  --------
    16      16   5,558,336   8,348,272    x1.50     95.82
    64      16  10,730,602  14,972,789    x1.39     53.43
    64      64  12,841,621  19,268,885    x1.50     41.52
   256      16  16,223,317  21,295,957    x1.31     37.57
   256      64  21,135,957  30,941,696    x1.46     25.86
   256     256  22,664,448  35,765,845    x1.57     22.37
  1024      16  18,608,128  23,893,674    x1.28     33.48
  1024     256  27,427,840  43,427,498    x1.58     18.42
  1024    1024  28,064,768  45,659,136    x1.62     17.52
  2048      16  19,054,592  24,425,130    x1.28     32.75
  2048     256  28,435,797  45,087,402    x1.58     17.74
  2048    1024  29,091,157  47,395,498    x1.62     16.88
  2048    2048  29,225,642  47,756,629    x1.63     16.75
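
The derived columns appear to follow directly from the measured throughput
and the 800 MHz clock. A small sketch of that arithmetic, assuming the speedup
is the ratio of the two throughputs and cycles per byte is the clock rate
divided by the patched throughput (the helper names are hypothetical):

```c
#include <assert.h>

/* Cycles spent per input byte at a given core clock and throughput. */
static double cycles_per_byte(double clock_hz, double bytes_per_sec)
{
    assert(bytes_per_sec > 0.0);
    return clock_hz / bytes_per_sec;
}

/* Speedup of the patched code relative to the generic implementation. */
static double speedup(double generic_bps, double patched_bps)
{
    assert(generic_bps > 0.0);
    return patched_bps / generic_bps;
}
```

For the last row, cycles_per_byte(800e6, 47756629.0) and
speedup(29225642.0, 47756629.0) should land close to the listed 16.75 and
x1.63.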


[PATCH v2 1/3] SHA256 for PPC/SPE - assembler

2015-01-30 Thread Markus Stockhausen
[PATCH v2 1/3] SHA256 for PPC/SPE - assembler

This is the assembler code for the SHA256 implementation with
the SIMD SPE instruction set. Although this is only a 32 bit
architecture, the GPRs are extended to 64 bit, each holding two
32 bit values. With the enhanced instruction set we can
operate on them in parallel. That helps reduce the time
to calculate W16-W64. To increase performance even more,
the assembler function can compute hashes for more than
one 64 byte input block. That saves a lot of register
saving/restoring.

The state of the used SPE registers is preserved via the 
stack so we can run from interrupt context. There might 
be the case that we interrupt ourselves and push sensitive 
data from another context onto our stack. Clear this area
in the stack afterwards to avoid information leakage.

The code is endian independent.

v2 changes
- fix tabs/spaces
- save/restore SPE registers

Signed-off-by: Markus Stockhausen 

diff --git a/arch/powerpc/crypto/sha256-spe-asm.S b/arch/powerpc/crypto/sha256-spe-asm.S
new file mode 100644
index 000..a334af7
--- /dev/null
+++ b/arch/powerpc/crypto/sha256-spe-asm.S
@@ -0,0 +1,323 @@
+/*
+ * Fast SHA-256 implementation for SPE instruction set (PPC)
+ *
+ * This code makes use of the SPE SIMD instruction set as defined in
+ * http://cache.freescale.com/files/32bit/doc/ref_manual/SPEPIM.pdf
+ * Implementation is based on optimization guide notes from
+ * http://cache.freescale.com/files/32bit/doc/app_note/AN2665.pdf
+ *
+ * Copyright (c) 2015 Markus Stockhausen 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include 
+#include 
+
+#define rHP	r3	/* pointer to hash values in memory		*/
+#define rKP	r24	/* pointer to round constants			*/
+#define rWP	r4	/* pointer to input data			*/
+
+#define rH0	r5	/* 8 32 bit hash values in 8 registers		*/
+#define rH1	r6
+#define rH2	r7
+#define rH3	r8
+#define rH4	r9
+#define rH5	r10
+#define rH6	r11
+#define rH7	r12
+
+#define rW0	r14	/* 64 bit registers. 16 words in 8 registers	*/
+#define rW1	r15
+#define rW2	r16
+#define rW3	r17
+#define rW4	r18
+#define rW5	r19
+#define rW6	r20
+#define rW7	r21
+
+#define rT0	r22	/* 64 bit temporaries				*/
+#define rT1	r23
+#define rT2	r0	/* 32 bit temporaries				*/
+#define rT3	r25
+
+#define CMP_KN_LOOP
+#define CMP_KC_LOOP \
+   cmpwi   rT1,0;
+
+#define INITIALIZE \
+   stwu    r1,-128(r1);    /* create stack frame   */ \
+   evstdw  r14,8(r1);  /* We must save non volatile*/ \
+   evstdw  r15,16(r1); /* registers. Take the chance   */ \
+   evstdw  r16,24(r1); /* and save the SPE part too*/ \
+   evstdw  r17,32(r1);\
+   evstdw  r18,40(r1);\
+   evstdw  r19,48(r1);\
+   evstdw  r20,56(r1);\
+   evstdw  r21,64(r1);\
+   evstdw  r22,72(r1);\
+   evstdw  r23,80(r1);\
+   stw r24,88(r1); /* save normal registers*/ \
+   stw r25,92(r1);
+
+
+#define FINALIZE \
+   evldw   r14,8(r1);  /* restore SPE registers*/ \
+   evldw   r15,16(r1);\
+   evldw   r16,24(r1);\
+   evldw   r17,32(r1);\
+   evldw   r18,40(r1);\
+   evldw   r19,48(r1);\
+   evldw   r20,56(r1);\
+   evldw   r21,64(r1);\
+   evldw   r22,72(r1);\
+   evldw   r23,80(r1);\
+   lwz r24,88(r1); /* restore normal registers */ \
+   lwz r25,92(r1);\
+   xor r0,r0,r0;  \
+   stw r0,8(r1);   /* Delete sensitive data*/ \
+   stw r0,16(r1);  /* that we might have pushed*/ \
+   stw r0,24(r1);  /* from other context that runs */ \
+   

[PATCH v2 2/3] SHA256 for PPC/SPE - glue

2015-01-30 Thread Markus Stockhausen
[PATCH v2 2/3] SHA256 for PPC/SPE - glue

Glue code for crypto infrastructure. Call the assembler
code where required. Disable preemption during calculation
and enable SPE instructions in the kernel prior to the
call. Avoid disabling preemption for too long.

Take a little care about small input data. Kick out early
for input chunks < 64 bytes and replace memset for context
cleanup with simple loop. 

v2 changes
- use MODULE_ALIAS_CRYPTO
- use memzero_explicit()
- additional aliases
- do not check for interrupt context
- priority 300, as we are assembler & CPU optimized

Signed-off-by: Markus Stockhausen 

diff --git a/arch/powerpc/crypto/sha256_spe_glue.c b/arch/powerpc/crypto/sha256_spe_glue.c
new file mode 100644
index 000..f4a616f
--- /dev/null
+++ b/arch/powerpc/crypto/sha256_spe_glue.c
@@ -0,0 +1,275 @@
+/*
+ * Glue code for SHA-256 implementation for SPE instructions (PPC)
+ *
+ * Based on generic implementation. The assembler module takes care 
+ * about the SPE registers so it can run from interrupt context.
+ *
+ * Copyright (c) 2015 Markus Stockhausen 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * MAX_BYTES defines the number of bytes that are allowed to be processed
+ * between preempt_disable() and preempt_enable(). SHA256 takes ~2,000
+ * operations per 64 bytes. e500 cores can issue two arithmetic instructions
+ * per clock cycle using one 32/64 bit unit (SU1) and one 32 bit unit (SU2).
+ * Thus 1KB of input data will need an estimated maximum of 18,000 cycles.
+ * Headroom for cache misses included. Even with the low end model clocked
+ * at 667 MHz this equals to a critical time window of less than 27us.
+ *
+ */
+#define MAX_BYTES 1024
+
+extern void ppc_spe_sha256_transform(u32 *state, const u8 *src, u32 blocks);
+
+static void spe_begin(void)
+{
+   /* We just start SPE operations and will save SPE registers later. */
+   preempt_disable();
+   enable_kernel_spe();
+}
+
+static void spe_end(void)
+{
+   /* reenable preemption */
+   preempt_enable();
+}
+
+static inline void ppc_sha256_clear_context(struct sha256_state *sctx)
+{
+   int count = sizeof(struct sha256_state) >> 2;
+   u32 *ptr = (u32 *)sctx;
+
+   /* make sure we can clear the fast way */
+   BUILD_BUG_ON(sizeof(struct sha256_state) % 4);
+   do { *ptr++ = 0; } while (--count);
+}
+
+static int ppc_spe_sha256_init(struct shash_desc *desc)
+{
+   struct sha256_state *sctx = shash_desc_ctx(desc);
+
+   sctx->state[0] = SHA256_H0;
+   sctx->state[1] = SHA256_H1;
+   sctx->state[2] = SHA256_H2;
+   sctx->state[3] = SHA256_H3;
+   sctx->state[4] = SHA256_H4;
+   sctx->state[5] = SHA256_H5;
+   sctx->state[6] = SHA256_H6;
+   sctx->state[7] = SHA256_H7;
+   sctx->count = 0;
+
+   return 0;
+}
+
+static int ppc_spe_sha224_init(struct shash_desc *desc)
+{
+   struct sha256_state *sctx = shash_desc_ctx(desc);
+
+   sctx->state[0] = SHA224_H0;
+   sctx->state[1] = SHA224_H1;
+   sctx->state[2] = SHA224_H2;
+   sctx->state[3] = SHA224_H3;
+   sctx->state[4] = SHA224_H4;
+   sctx->state[5] = SHA224_H5;
+   sctx->state[6] = SHA224_H6;
+   sctx->state[7] = SHA224_H7;
+   sctx->count = 0;
+
+   return 0;
+}
+
+static int ppc_spe_sha256_update(struct shash_desc *desc, const u8 *data,
+   unsigned int len)
+{
+   struct sha256_state *sctx = shash_desc_ctx(desc);
+   const unsigned int offset = sctx->count & 0x3f;
+   const unsigned int avail = 64 - offset;
+   unsigned int bytes;
+   const u8 *src = data;
+
+   if (avail > len) {
+   sctx->count += len;
+   memcpy((char *)sctx->buf + offset, src, len);
+   return 0;
+   }
+
+   sctx->count += len;
+
+   if (offset) {
+   memcpy((char *)sctx->buf + offset, src, avail);
+
+   spe_begin();
+   ppc_spe_sha256_transform(sctx->state, (const u8 *)sctx->buf, 1);
+   spe_end();
+
+   len -= avail;
+   src += avail;
+   }
+
+   while (len > 63) {
+   /* cut input data into smaller blocks */
+   bytes = (len > MAX_BYTES) ? MAX_BYTES : len;
+   bytes = bytes & ~0x3f;
+
+   spe_begin();
+   ppc_spe_sha256_transform(sctx->state, src, bytes >> 6);
+   spe_end();
+
+   src += bytes;
+   len -= bytes;
+   }
+
+   memcpy((char *)sctx->buf, src, len);
+   return 0;
+}
+
+static int ppc_spe_sha256_final(struct shash_desc *desc, u8 *out)
+{
+   struct sha256_s
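
For readers following the glue code: the update path above appears to split the input so that each SPE section (the code between spe_begin() and spe_end(), which runs with preemption disabled) handles at most MAX_BYTES bytes, rounded down to whole 64-byte blocks. The following is a minimal userspace sketch of that arithmetic only. count_chunks() and BLOCK_SIZE are hypothetical names chosen for illustration, not part of the patch, and the sketch models only the while loop that runs after any partial leading block has been consumed.

```c
#include <stddef.h>

#define MAX_BYTES 1024   /* same limit as in the patch */
#define BLOCK_SIZE 64    /* SHA-256 block size in bytes */

/*
 * Hypothetical helper: walk `len` bytes the way the while loop in
 * ppc_spe_sha256_update() does, and report how many SPE sections
 * (chunks) are needed and how many tail bytes stay in the buffer.
 */
static size_t count_chunks(size_t len, size_t *tail)
{
	size_t chunks = 0;

	while (len >= BLOCK_SIZE) {
		/* cap the chunk, then round down to whole blocks */
		size_t bytes = (len > MAX_BYTES) ? MAX_BYTES : len;

		bytes &= ~(size_t)(BLOCK_SIZE - 1);
		len -= bytes;
		chunks++;
	}
	*tail = len;	/* fewer than 64 bytes are left for the next call */
	return chunks;
}
```

For len = 2500 this yields three chunks (1024, 1024 and 448 bytes) with 4 bytes left over for the context buffer.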

[PATCH v2 3/3] SHA256 for PPC/SPE - kernel config

2015-01-30 Thread Markus Stockhausen
Integrate the module into the kernel config tree.

Signed-off-by: Markus Stockhausen 

diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index 2926fb9..a07e763 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -5,5 +5,7 @@
 #
 
 obj-$(CONFIG_CRYPTO_SHA1_PPC) += sha1-powerpc.o
+obj-$(CONFIG_CRYPTO_SHA256_PPC_SPE) += sha256-ppc-spe.o
 
 sha1-powerpc-y := sha1-powerpc-asm.o sha1.o
+sha256-ppc-spe-y := sha256-spe-asm.o sha256_spe_glue.o
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 87bbc9c..86d35be 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -601,6 +601,15 @@ config CRYPTO_SHA256
  This code also includes SHA-224, a 224 bit hash with 112 bits
  of security against collision attacks.
 
+config CRYPTO_SHA256_PPC_SPE
+   tristate "SHA224 and SHA256 digest algorithm (PPC SPE)"
+   depends on PPC && SPE
+   select CRYPTO_SHA256
+   select CRYPTO_HASH
+   help
+ SHA224 and SHA256 secure hash standard (DFIPS 180-2)
+ implemented using powerpc SPE SIMD instruction set.
+
 config CRYPTO_SHA256_SPARC64
tristate "SHA224 and SHA256 digest algorithm (SPARC64)"
depends on SPARC64


Re: [PATCH v7 0/4] Add support for parametrized events

2015-01-30 Thread Arnaldo Carvalho de Melo
On Thu, Jan 29, 2015 at 03:28:43PM +1100, Michael Ellerman wrote:
> On Mon, 2015-01-26 at 17:43 -0800, Sukadev Bhattiprolu wrote:
> > Description of "event parameters" from the documentation patch:
> > 
> > Cody P Schafer (6):
> >   perf: provide sysfs_show for struct perf_pmu_events_attr
> >   perf: add PMU_EVENT_ATTR_STRING() helper
> >   powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
> >   powerpc/perf/{hv-gpci, hv-common}: generate requests with counters
> > annotated
> >   powerpc/perf/hv-gpci: add the remaining gpci requests
> >   powerpc/perf/hv-24x7: Document sysfs event description entries
> > 
> > Sukadev Bhattiprolu (1):
> >   perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper
> 
> 
> Hi Sukadev,
> 
> I realise Cody wrote most of these and you are just getting them merged, but
> they still need to be Signed-off-by you. Most of them aren't.
> 
> So please resend with them all signed off by you.
> 
> While you're at it, please drop all the CC lines, and move the changelog
> annotations below the --- line so they are dropped when I apply them.
> 
> Also add Jiri's ack to the first two patches.

I'm OK with taking Jiri's Ack for these and merging them. Suka, may I add your
Signed-off-by to those? (question for the record...).

- Arnaldo
 
> You can probably trim the CC list when you repost, I think everyone's seen
> this series enough times.
> 
> cheers
> 

[PATCH v4 1/3] powerpc/nvram: move generic code for nvram and pstore

2015-01-30 Thread Hari Bathini
With minor checks, we can move most of the code for nvram
under pseries to a common place to be re-used by other
powerpc platforms like powernv. This patch moves such
common code to arch/powerpc/kernel/nvram_64.c file.

Signed-off-by: Hari Bathini 
---
 arch/powerpc/include/asm/nvram.h   |   50 ++
 arch/powerpc/include/asm/rtas.h|4 
 arch/powerpc/kernel/nvram_64.c |  656 
 arch/powerpc/platforms/pseries/nvram.c |  665 
 4 files changed, 714 insertions(+), 661 deletions(-)

diff --git a/arch/powerpc/include/asm/nvram.h b/arch/powerpc/include/asm/nvram.h
index b0fe0fe..09a518b 100644
--- a/arch/powerpc/include/asm/nvram.h
+++ b/arch/powerpc/include/asm/nvram.h
@@ -9,12 +9,43 @@
 #ifndef _ASM_POWERPC_NVRAM_H
 #define _ASM_POWERPC_NVRAM_H
 
-
+#include 
 #include 
 #include 
 #include 
 
+/*
+ * Set oops header version to distinguish between old and new format header.
+ * lnx,oops-log partition max size is 4000, header version > 4000 will
+ * help in identifying new header.
+ */
+#define OOPS_HDR_VERSION 5000
+
+struct err_log_info {
+   __be32 error_type;
+   __be32 seq_num;
+};
+
+struct nvram_os_partition {
+   const char *name;
+   int req_size;   /* desired size, in bytes */
+   int min_size;   /* minimum acceptable size (0 means req_size) */
+   long size;  /* size of data portion (excluding err_log_info) */
+   long index; /* offset of data portion of partition */
+   bool os_partition; /* partition initialized by OS, not FW */
+};
+
+struct oops_log_info {
+   __be16 version;
+   __be16 report_length;
+   __be64 timestamp;
+} __attribute__((packed));
+
+extern struct nvram_os_partition oops_log_partition;
+
 #ifdef CONFIG_PPC_PSERIES
+extern struct nvram_os_partition rtas_log_partition;
+
 extern int nvram_write_error_log(char * buff, int length,
 unsigned int err_type, unsigned int err_seq);
 extern int nvram_read_error_log(char * buff, int length,
@@ -50,6 +81,23 @@ extern void  pmac_xpram_write(int xpaddr, u8 data);
 /* Synchronize NVRAM */
 extern void nvram_sync(void);
 
+/* Initialize NVRAM OS partition */
+extern int __init nvram_init_os_partition(struct nvram_os_partition *part);
+
+/* Initialize NVRAM oops partition */
+extern void __init nvram_init_oops_partition(int rtas_partition_exists);
+
+/* Read a NVRAM partition */
+extern int nvram_read_partition(struct nvram_os_partition *part, char *buff,
+   int length, unsigned int *err_type,
+   unsigned int *error_log_cnt);
+
+/* Write to NVRAM OS partition */
+extern int nvram_write_os_partition(struct nvram_os_partition *part,
+   char *buff, int length,
+   unsigned int err_type,
+   unsigned int error_log_cnt);
+
 /* Determine NVRAM size */
 extern ssize_t nvram_get_size(void);
 
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index b390f55..123d7ff 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -343,8 +343,12 @@ extern int early_init_dt_scan_rtas(unsigned long node,
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 
 #ifdef CONFIG_PPC_PSERIES
+extern unsigned long last_rtas_event;
+extern int clobbering_unread_rtas_event(void);
 extern int pseries_devicetree_update(s32 scope);
 extern void post_mobility_fixup(void);
+#else
+static inline int clobbering_unread_rtas_event(void) { return 0; }
 #endif
 
 #ifdef CONFIG_PPC_RTAS_DAEMON
diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 34f7c9b..42e5c6a 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -26,6 +26,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +57,659 @@ struct nvram_partition {
 
 static LIST_HEAD(nvram_partitions);
 
+#ifdef CONFIG_PPC_PSERIES
+struct nvram_os_partition rtas_log_partition = {
+   .name = "ibm,rtas-log",
+   .req_size = 2079,
+   .min_size = 1055,
+   .index = -1,
+   .os_partition = true
+};
+#endif
+
+struct nvram_os_partition oops_log_partition = {
+   .name = "lnx,oops-log",
+   .req_size = 4000,
+   .min_size = 2000,
+   .index = -1,
+   .os_partition = true
+};
+
+static const char *nvram_os_partitions[] = {
+#ifdef CONFIG_PPC_PSERIES
+   "ibm,rtas-log",
+#endif
+   "lnx,oops-log",
+   NULL
+};
+
+static void oops_to_nvram(struct kmsg_dumper *dumper,
+ enum kmsg_dump_reason reason);
+
+static struct kmsg_dumper nvram_kmsg_dumper = {
+   .dump = oops_to_nvram
+};
+
+/*
+ * For capturing and compressing an oops or panic report...
+
+ * big_oops_buf[] holds the uncompressed text we're capturing.
+ *
+ * oops_buf[] holds the comp
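
The OOPS_HDR_VERSION comment in the nvram.h hunk above explains that a version value above 4000 marks the new header format. A minimal userspace sketch of how a reader could tell the two formats apart follows. It assumes that an old-format record begins with a 16-bit big-endian length, which cannot exceed the 4000-byte partition size; that assumption, and the names read_be16(), oops_hdr_is_new_format() and OOPS_LOG_MAX_SIZE, are illustrative and not taken from the patch.

```c
#include <stdint.h>

#define OOPS_HDR_VERSION 5000   /* value from the patch */
#define OOPS_LOG_MAX_SIZE 4000  /* lnx,oops-log req_size from the patch */

/* The scheme only works because no valid length can equal the version. */
_Static_assert(OOPS_HDR_VERSION > OOPS_LOG_MAX_SIZE,
	       "version must not collide with a valid old-format length");

/* Read a big-endian 16-bit value regardless of host byte order. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical check: returns 1 if the buffer starts with a new-format
 * header, 0 if the first field looks like an old-format length.
 */
static int oops_hdr_is_new_format(const uint8_t *buf)
{
	return read_be16(buf) == OOPS_HDR_VERSION;
}
```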

[PATCH v4 0/3] powerpc/pstore: Add pstore support for nvram partitions

2015-01-30 Thread Hari Bathini
This patch series adds pstore support on the powernv platform to
read different nvram partitions and write compressed data to the
oops-log nvram partition. As the pseries platform already has
pstore support, this series moves most of the common code
for the pseries and powernv platforms to a common file. The patches
were tested successfully on both pseries and powernv platforms, and
also on a kernel compiled with both CONFIG_PPC_PSERIES=y and
CONFIG_PPC_POWERNV=y.

Changes from v3:
1. Updated the changelog
2. Resolved compile issues with !CONFIG_PPC_PSERIES

---

Hari Bathini (3):
  powerpc/nvram: move generic code for nvram and pstore
  pstore: Add pstore type id for PPC64 opal nvram partition
  pstore: add pstore support on powernv


 arch/powerpc/include/asm/nvram.h|   50 ++
 arch/powerpc/include/asm/rtas.h |4 
 arch/powerpc/kernel/nvram_64.c  |  677 +++
 arch/powerpc/platforms/powernv/opal-nvram.c |   10 
 arch/powerpc/platforms/pseries/nvram.c  |  665 ---
 fs/pstore/inode.c   |3 
 include/linux/pstore.h  |1 
 7 files changed, 749 insertions(+), 661 deletions(-)

--
- Hari

[PATCH v4 2/3] pstore: Add pstore type id for PPC64 opal nvram partition

2015-01-30 Thread Hari Bathini
This patch adds a new PPC64 partition type to be used for the
opal-specific nvram partition. A new partition type is needed as none
of the existing types matches this partition.

Signed-off-by: Hari Bathini 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Kees Cook 
Cc: Tony Luck 
---
 fs/pstore/inode.c  |3 +++
 include/linux/pstore.h |1 +
 2 files changed, 4 insertions(+)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 5041660..8e0c009 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -359,6 +359,9 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
case PSTORE_TYPE_PPC_COMMON:
sprintf(name, "powerpc-common-%s-%lld", psname, id);
break;
+   case PSTORE_TYPE_PPC_OPAL:
+   sprintf(name, "powerpc-opal-%s-%lld", psname, id);
+   break;
case PSTORE_TYPE_UNKNOWN:
sprintf(name, "unknown-%s-%lld", psname, id);
break;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ece0c6b..af44980 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -39,6 +39,7 @@ enum pstore_type_id {
PSTORE_TYPE_PPC_RTAS= 4,
PSTORE_TYPE_PPC_OF  = 5,
PSTORE_TYPE_PPC_COMMON  = 6,
+   PSTORE_TYPE_PPC_OPAL= 7,
PSTORE_TYPE_UNKNOWN = 255
 };
 

[PATCH v4 3/3] pstore: add pstore support on powernv

2015-01-30 Thread Hari Bathini
This patch extends pstore, a generic interface to platform-dependent
persistent storage, to support the powernv platform, so that certain
useful information can be captured during the system's dying moments.
Such support is already in place for the pseries platform. This patch
re-uses most of that code.

It is common practice to compile kernels with both CONFIG_PPC_PSERIES=y
and CONFIG_PPC_POWERNV=y. The code in the nvram_init_oops_partition()
routine still works as intended in that case, as the caller is
platform-specific code that passes the appropriate value for the
"rtas_partition_exists" parameter. Everywhere else in this patchset,
the CONFIG_PPC_PSERIES and CONFIG_PPC_POWERNV flags are used only to
reduce the kernel size when the flag is not set; they have no impact
on the logic.

Signed-off-by: Hari Bathini 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Kees Cook 
Cc: Tony Luck 
---
 arch/powerpc/kernel/nvram_64.c  |   25 +++--
 arch/powerpc/platforms/powernv/opal-nvram.c |   10 ++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 42e5c6a..293da88 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -127,6 +127,14 @@ static size_t oops_data_sz;
 static struct z_stream_s stream;
 
 #ifdef CONFIG_PSTORE
+#ifdef CONFIG_PPC_POWERNV
+static struct nvram_os_partition skiboot_partition = {
+   .name = "ibm,skiboot",
+   .index = -1,
+   .os_partition = false
+};
+#endif
+
 #ifdef CONFIG_PPC_PSERIES
 static struct nvram_os_partition of_config_partition = {
.name = "of-config",
@@ -477,6 +485,16 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
time->tv_nsec = 0;
break;
 #endif
+#ifdef CONFIG_PPC_POWERNV
+   case PSTORE_TYPE_PPC_OPAL:
+   sig = NVRAM_SIG_FW;
+   part = &skiboot_partition;
+   *type = PSTORE_TYPE_PPC_OPAL;
+   *id = PSTORE_TYPE_PPC_OPAL;
+   time->tv_sec = 0;
+   time->tv_nsec = 0;
+   break;
+#endif
default:
return 0;
}
@@ -552,8 +570,11 @@ static int nvram_pstore_init(void)
 {
int rc = 0;
 
-   nvram_type_ids[2] = PSTORE_TYPE_PPC_RTAS;
-   nvram_type_ids[3] = PSTORE_TYPE_PPC_OF;
+   if (machine_is(pseries)) {
+   nvram_type_ids[2] = PSTORE_TYPE_PPC_RTAS;
+   nvram_type_ids[3] = PSTORE_TYPE_PPC_OF;
+   } else
+   nvram_type_ids[2] = PSTORE_TYPE_PPC_OPAL;
 
nvram_pstore_info.buf = oops_data;
nvram_pstore_info.bufsize = oops_data_sz;
diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c
index f9896fd..9db4398 100644
--- a/arch/powerpc/platforms/powernv/opal-nvram.c
+++ b/arch/powerpc/platforms/powernv/opal-nvram.c
@@ -16,6 +16,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 static unsigned int nvram_size;
@@ -62,6 +63,15 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index)
return count;
 }
 
+static int __init opal_nvram_init_log_partitions(void)
+{
+   /* Scan nvram for partitions */
+   nvram_scan_partitions();
+   nvram_init_oops_partition(0);
+   return 0;
+}
+machine_arch_initcall(powernv, opal_nvram_init_log_partitions);
+
 void __init opal_nvram_init(void)
 {
struct device_node *np;

Re: [PATCH v4 1/3] powerpc/nvram: move generic code for nvram and pstore

2015-01-30 Thread Arnd Bergmann
On Friday 30 January 2015 20:44:00 Hari Bathini wrote:
> With minor checks, we can move most of the code for nvram
> under pseries to a common place to be re-used by other
> powerpc platforms like powernv. This patch moves such
> common code to arch/powerpc/kernel/nvram_64.c file.
> 
> Signed-off-by: Hari Bathini 

Can you make this y2038-safe in the process, possibly as a
follow-up patch?

> +extern unsigned long last_rtas_event;

time64_t

> + }
> + oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
> + oops_hdr->report_length = cpu_to_be16(zipped_len);
> + oops_hdr->timestamp = cpu_to_be64(get_seconds());
> + return 0;

ktime_get_real_seconds()

> +static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
> + int *count, struct timespec *time, char **buf,
> + bool *compressed, struct pstore_info *psi)

This has to remain timespec for now but can later be changed to timespec64
when the API gets changed.

> + oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
> + oops_hdr->report_length = cpu_to_be16(text_len);
> + oops_hdr->timestamp = cpu_to_be64(get_seconds());

ktime_get_real_seconds()

Arnd
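
For context on the y2038 remarks above: a signed 32-bit seconds counter overflows after 2147483647 seconds since the epoch (19 January 2038), whereas the 64-bit timestamp field in the oops header has plenty of room. The userspace sketch below illustrates both points. The helper names fits_in_time32(), put_be64() and get_be64() are hypothetical, and the byte-wise big-endian store is only a stand-in for what cpu_to_be64() achieves in the kernel.

```c
#include <stdint.h>

/* Returns 1 if `secs` survives a round trip through a signed 32-bit field. */
static int fits_in_time32(int64_t secs)
{
	return secs >= INT32_MIN && secs <= INT32_MAX;
}

/* Store a 64-bit seconds value big-endian, as a __be64 header field would. */
static void put_be64(uint8_t out[8], uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		out[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Load the value back, independent of host byte order. */
static uint64_t get_be64(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | in[i];
	return v;
}
```

The on-disk header is therefore already wide enough; the suggested changes concern the in-kernel types and accessors that feed it.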

Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)

2015-01-30 Thread Scott Wood
On Fri, 2015-01-30 at 11:41 +0100, Gabriel Paubert wrote:
> By the way, I wonder where the SAVE_EVR/REST_EVR macros are used. I only
> see the definitions, no use in a 3.18 source tree.

SAVE_EVR is used by SAVE_2EVRS, which is used by SAVE_4EVRS, etc.

The 32EVRS version is used in load_up_spe() and kvm_save_guest_spe().

-Scott
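
The doubling pattern described above can be illustrated with a plain C analogue. This is only a sketch: the real macros emit assembly rather than C, and the SAVE_ONE()/SAVE_2()/.../SAVE_32() names and the buffer layout below are hypothetical. It shows why a grep for direct uses of the single-register macro finds nothing, even though it is expanded 32 times.

```c
#include <stdint.h>

/* Hypothetical C analogue: "save" register n by recording it in a buffer. */
#define SAVE_ONE(n, buf)  ((buf)[(n)] = (uint32_t)(n) + 1)
#define SAVE_2(n, buf)    do { SAVE_ONE(n, buf); SAVE_ONE((n) + 1, buf); } while (0)
#define SAVE_4(n, buf)    do { SAVE_2(n, buf);   SAVE_2((n) + 2, buf);   } while (0)
#define SAVE_8(n, buf)    do { SAVE_4(n, buf);   SAVE_4((n) + 4, buf);   } while (0)
#define SAVE_16(n, buf)   do { SAVE_8(n, buf);   SAVE_8((n) + 8, buf);   } while (0)
#define SAVE_32(n, buf)   do { SAVE_16(n, buf);  SAVE_16((n) + 16, buf); } while (0)

/* Only the widest macro is called directly, yet SAVE_ONE runs 32 times. */
static void save_all(uint32_t buf[32])
{
	SAVE_32(0, buf);
}
```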


[PATCH v7 0/4] Add support for parametrized events

2015-01-30 Thread Sukadev Bhattiprolu
Description of "event parameters" from the documentation patch:

Event parameters are a basic way for partial events to be specified in
sysfs with per-event names given to the fields that need to be filled in
when using a particular event.

It is intended for supporting cases where the single 'cpu' parameter is
insufficient. For example, POWER 8 has events for physical
sockets/cores/cpus that are accessible from within virtual machines. To
keep using the single 'cpu' parameter we'd need to perform a mapping
between Linux's cpus and the physical machine's cpus (in this case
Linux is running under a hypervisor). This isn't possible because
bindings between our cpus and physical cpus may not be fixed, and we
probably won't have a "cpu" on each physical cpu.

Description of the sysfs contents when events are parameterized (copied from an
included patch):

Examples:

domain=0x1,offset=0x8,core=?

In the case of the last example, a value replacing "?" would need
to be provided by the user selecting the particular event. This is
referred to as "event parameterization".

Notes on how perf-list displays parameterized events

PARAMETERIZED EVENTS


Some pmu events listed by 'perf list' will be displayed with '=?' in
them. For example:

  hv_24x7/HPM_THREAD_NAP_CCYC__PHYS_CORE,core=?/

This means that when provided as an event, a value for ? must also
be supplied. For example:

  perf stat  -e \
'hv_24x7/HPM_THREAD_NAP_CCYC__PHYS_CORE,core=2/' ...
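
To make the "=?" convention concrete, here is a small userspace sketch of how a tool could detect that an event string still has unfilled parameters. It is not the perf tool's actual parser; event_needs_params() is a hypothetical helper, and treating ',' and '/' as term terminators is an assumption based on the examples above.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the event string still contains a
 * parameter whose value is the "?" placeholder, e.g. "core=?".
 */
static int event_needs_params(const char *event)
{
	const char *p = event;

	while ((p = strstr(p, "=?")) != NULL) {
		/* "=?" must end the term: followed by ',', '/' or the end */
		char next = p[2];

		if (next == '\0' || next == ',' || next == '/')
			return 1;
		p += 2;
	}
	return 0;
}
```

With this check, "domain=0x1,offset=0x8,core=?" is reported as needing a value, while the same string with "core=2" is not.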


Cody P Schafer (6):
  perf: provide sysfs_show for struct perf_pmu_events_attr
  perf: add PMU_EVENT_ATTR_STRING() helper
  powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
  powerpc/perf/{hv-gpci, hv-common}: generate requests with counters
annotated
  powerpc/perf/hv-gpci: add the remaining gpci requests
  powerpc/perf/hv-24x7: Document sysfs event description entries

Sukadev Bhattiprolu (1):
  perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper

 .../testing/sysfs-bus-event_source-devices-hv_24x7 |  22 +
 arch/powerpc/perf/hv-24x7-catalog.h|  25 +
 arch/powerpc/perf/hv-24x7-domains.h|  28 +
 arch/powerpc/perf/hv-24x7.c| 793 -
 arch/powerpc/perf/hv-24x7.h|  12 +-
 arch/powerpc/perf/hv-common.c  |  10 +-
 arch/powerpc/perf/hv-common.h  |  10 +
 arch/powerpc/perf/hv-gpci-requests.h   | 261 +++
 arch/powerpc/perf/hv-gpci.c|  23 +
 arch/powerpc/perf/hv-gpci.h|  37 +-
 arch/powerpc/perf/req-gen/_begin.h |  13 +
 arch/powerpc/perf/req-gen/_clear.h |   5 +
 arch/powerpc/perf/req-gen/_end.h   |   4 +
 arch/powerpc/perf/req-gen/_request-begin.h |  15 +
 arch/powerpc/perf/req-gen/_request-end.h   |   8 +
 arch/powerpc/perf/req-gen/perf.h   | 155 
 include/linux/perf_event.h |  10 +
 kernel/events/core.c   |  12 +
 18 files changed, 1396 insertions(+), 47 deletions(-)
 create mode 100644 arch/powerpc/perf/hv-24x7-domains.h
 create mode 100644 arch/powerpc/perf/hv-gpci-requests.h
 create mode 100644 arch/powerpc/perf/req-gen/_begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_clear.h
 create mode 100644 arch/powerpc/perf/req-gen/_end.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-end.h
 create mode 100644 arch/powerpc/perf/req-gen/perf.h

-- 
1.8.3.1

[PATCH v7 2/7] perf: add PMU_EVENT_ATTR_STRING() helper

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

Helper for constructing static struct perf_pmu_events_attr instances.

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---
Changelog[v7]:
[Jiri Olsa] Initialize 'id' field in perf_pmu_events_attr also.

 include/linux/perf_event.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 58f59bd..1d36314 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -906,6 +906,13 @@ static struct perf_pmu_events_attr _var = { \
.id   =  _id,   \
 };
 
+#define PMU_EVENT_ATTR_STRING(_name, _var, _str)   \
+static struct perf_pmu_events_attr _var = {\
+   .attr   = __ATTR(_name, 0444, perf_event_sysfs_show, NULL), \
+   .id = 0,\
+   .event_str  = _str, \
+};
+
 #define PMU_FORMAT_ATTR(_name, _format) \
 static ssize_t \
 _name##_show(struct device *dev,   \
-- 
1.8.3.1

[PATCH v7 1/7] perf: provide sysfs_show for struct perf_pmu_events_attr

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

(struct perf_pmu_events_attr) is defined in include/linux/perf_event.h,
but the only "show" for it is in x86 and contains x86 specific stuff.

Make a generic one for those of us who are just using the event_str.

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
Acked-by: Jiri Olsa 
---
Changelog[v7]: [Jiri Olsa] Add a check of pmu_events->str, for similarity
with the RAPL use of sysfs_show().

 include/linux/perf_event.h |  3 +++
 kernel/events/core.c   | 12 
 2 files changed, 15 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 486e84c..58f59bd 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -897,6 +897,9 @@ struct perf_pmu_events_attr {
const char *event_str;
 };
 
+ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page);
+
 #define PMU_EVENT_ATTR(_name, _var, _id, _show) \
 static struct perf_pmu_events_attr _var = {\
.attr = __ATTR(_name, 0444, _show, NULL),   \
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4c1ee7f..934687f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8276,6 +8276,18 @@ void __init perf_event_init(void)
 != 1024);
 }
 
+ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page)
+{
+   struct perf_pmu_events_attr *pmu_attr =
+   container_of(attr, struct perf_pmu_events_attr, attr);
+
+   if (pmu_attr->event_str)
+   return sprintf(page, "%s\n", pmu_attr->event_str);
+
+   return 0;
+}
+
 static int __init perf_event_sysfs_init(void)
 {
struct pmu *pmu;
-- 
1.8.3.1
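
The generic show routine above relies on the usual container_of() idiom: sysfs hands back a pointer to the embedded attribute, and the routine recovers the enclosing perf_pmu_events_attr from it. A minimal userspace sketch of that idiom follows. The struct definitions are cut-down stand-ins rather than the kernel's, the container_of() here is a simplified version without type checking, and event_show() is a hypothetical name that drops the unused device argument.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct device_attribute {
	const char *name;
};

struct perf_pmu_events_attr {
	struct device_attribute attr;
	unsigned long long id;
	const char *event_str;
};

/* Simplified container_of: recover the outer struct from a member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mirrors the shape of perf_event_sysfs_show() from the patch. */
static int event_show(struct device_attribute *attr, char *page)
{
	struct perf_pmu_events_attr *pmu_attr =
		container_of(attr, struct perf_pmu_events_attr, attr);

	if (pmu_attr->event_str)
		return sprintf(page, "%s\n", pmu_attr->event_str);

	return 0;
}
```

Because only event_str is consulted, any PMU that describes its events as strings can share the one routine, which is the point of the patch.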

[PATCH v7 4/7] powerpc/perf/hv-24x7: parse catalog and populate sysfs with events

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

Retrieves and parses the 24x7 catalog on POWER systems that supply it
(right now, only POWER 8). Events are exposed via sysfs in the standard
fashion, and are all parameterized.

$ cd /sys/bus/event_source/devices/hv_24x7/events

$ cat HPM_CS_FROM_L4_LDATA__PHYS_CORE
domain=0x2,offset=0xd58,core=?,lpar=0x0

$ cat HPM_TLBIE__VCPU_HOME_CHIP
domain=0x4,offset=0x358,vcpu=?,lpar=?

where the user is required to specify values for the fields with '?' (like
core, vcpu, lpar above) when specifying the event with the perf tool.

The catalog is (at the moment) only parsed on boot. It needs re-parsing
when some hypervisor events occur. At that point we'll also need to
prevent old events from continuing to function (counter that is passed
in via spare space in the config values?).

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v7]
[Michael Ellerman] Fix a minor merge conflict.

Changelog[v6]
[Jiri Olsa, Sukadev Bhattiprolu] Replace 'starting_index' with
what it really means for the event i.e "core" or "vcpu" and use
a single '?' to identify "required" parameters.

Changelog[v5]
[Jiri Olsa, Peter Zijlstra] Prefix required parameters with '$'
to make it easy for user to recognize.

Changelog[v4]
[Sukadev Bhattiprolu] Use PHYS and VCPU in place of PHYSICAL and
VIRTUAL_PROCESSOR to shorten the names of the domains and hence,
events;

Changelog[v2]
[Joe Perches, David Laight] Use beNN_to_cpu() instead of guessing
the size from type.
Use kmem_cache_free() to free page allocated with kmem_cache_alloc().

 arch/powerpc/perf/hv-24x7-catalog.h |  25 ++
 arch/powerpc/perf/hv-24x7-domains.h |  28 ++
 arch/powerpc/perf/hv-24x7.c | 793 +++-
 arch/powerpc/perf/hv-24x7.h |  12 +-
 4 files changed, 841 insertions(+), 17 deletions(-)
 create mode 100644 arch/powerpc/perf/hv-24x7-domains.h

diff --git a/arch/powerpc/perf/hv-24x7-catalog.h b/arch/powerpc/perf/hv-24x7-catalog.h
index 21b19dd..69e2e1f 100644
--- a/arch/powerpc/perf/hv-24x7-catalog.h
+++ b/arch/powerpc/perf/hv-24x7-catalog.h
@@ -30,4 +30,29 @@ struct hv_24x7_catalog_page_0 {
__u8 reserved6[2];
 } __packed;
 
+struct hv_24x7_event_data {
+   __be16 length; /* in bytes, must be a multiple of 16 */
+   __u8 reserved1[2];
+   __u8 domain; /* Chip = 1, Core = 2 */
+   __u8 reserved2[1];
+   __be16 event_group_record_offs; /* in bytes, must be 8 byte aligned */
+   __be16 event_group_record_len; /* in bytes */
+
+   /* in bytes, offset from event_group_record */
+   __be16 event_counter_offs;
+
+   /* verified_state, unverified_state, caveat_state, broken_state, ... */
+   __be32 flags;
+
+   __be16 primary_group_ix;
+   __be16 group_count;
+   __be16 event_name_len;
+   __u8 remainder[];
+   /* __u8 event_name[event_name_len - 2]; */
+   /* __be16 event_description_len; */
+   /* __u8 event_desc[event_description_len - 2]; */
+   /* __be16 detailed_desc_len; */
+   /* __u8 detailed_desc[detailed_desc_len - 2]; */
+} __packed;
+
 #endif
diff --git a/arch/powerpc/perf/hv-24x7-domains.h b/arch/powerpc/perf/hv-24x7-domains.h
new file mode 100644
index 000..49c1efd
--- /dev/null
+++ b/arch/powerpc/perf/hv-24x7-domains.h
@@ -0,0 +1,28 @@
+
+/*
+ * DOMAIN(name, num, index_kind, is_physical)
+ *
+ * @name:  An all caps token, suitable for use in generating an enum
+ * member and appending to an event name in sysfs.
+ *
+ * @num:   The number corresponding to the domain as given in
+ * documentation. We assume the catalog domain and the hcall
+ * domain have the same numbering (so far they do), but this
+ * may need to be changed in the future.
+ *
+ * @index_kind: A stringifiable token describing the meaning of the index
+ * within the given domain. Must fit the parsing rules of the
+ * perf sysfs api.
+ *
+ * @is_physical: True if the domain is physical, false otherwise (if virtual).
+ *
+ * Note: The terms PHYS_CHIP, PHYS_CORE, VCPU correspond to physical chip,
+ *  physical core and virtual processor in 24x7 Counters specifications.
+ */
+
+DOMAIN(PHYS_CHIP, 0x01, chip, true)
+DOMAIN(PHYS_CORE, 0x02, core, true)
+DOMAIN(VCPU_HOME_CORE, 0x03, vcpu, false)
+DOMAIN(VCPU_HOME_CHIP, 0x04, vcpu, false)
+DOMAIN(VCPU_HOME_NODE, 0x05, vcpu, false)
+DOMAIN(VCPU_REMOTE_NODE, 0x06, vcpu, false)
diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index f162d0b..9445a82 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -13,16 +13,66 @@
 #define pr_fmt(fmt) "hv-24x7: " fmt
 
 #include 
+#include 
 #include 
 #include 
+#include 
+
 #include 
 #include 
 #include 
+#include 
 
 #include "hv-24x7.h"
 #include "hv-24x7-catalog.h"
 #include "hv-
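
The DOMAIN(name, num, index_kind, is_physical) entries in hv-24x7-domains.h are written so that the same list can be expanded several times with different definitions of DOMAIN(). A minimal single-file sketch of that technique follows. Since the part of the patch that consumes the header is not shown here, the consumers below are assumptions: DOMAIN_LIST() stands in for including the header, and AS_ENUM, AS_CASE, AS_PHYS, domain_index_kind() and domain_is_physical() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-in for including hv-24x7-domains.h: the same entries, wrapped in
 * a list macro so that this sketch stays in one file.
 */
#define DOMAIN_LIST(DOMAIN) \
	DOMAIN(PHYS_CHIP, 0x01, chip, true) \
	DOMAIN(PHYS_CORE, 0x02, core, true) \
	DOMAIN(VCPU_HOME_CORE, 0x03, vcpu, false) \
	DOMAIN(VCPU_HOME_CHIP, 0x04, vcpu, false) \
	DOMAIN(VCPU_HOME_NODE, 0x05, vcpu, false) \
	DOMAIN(VCPU_REMOTE_NODE, 0x06, vcpu, false)

/* First expansion: one enum member per domain. */
#define AS_ENUM(name, num, kind, phys) DOMAIN_##name = num,
enum domain_id { DOMAIN_LIST(AS_ENUM) };

/* Second expansion: map a domain number to its index-kind string. */
#define AS_CASE(name, num, kind, phys) case DOMAIN_##name: return #kind;
static const char *domain_index_kind(unsigned int domain)
{
	switch (domain) {
	DOMAIN_LIST(AS_CASE)
	default:
		return NULL;
	}
}

/* Third expansion: report whether the domain is physical. */
#define AS_PHYS(name, num, kind, phys) case DOMAIN_##name: return phys;
static bool domain_is_physical(unsigned int domain)
{
	switch (domain) {
	DOMAIN_LIST(AS_PHYS)
	default:
		return false;
	}
}
```

Keeping the table in one place means a new domain needs a single added line, and every expansion picks it up.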

[PATCH v7 3/7] perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper

2015-01-30 Thread Sukadev Bhattiprolu
Define a lite version of the EVENT_DEFINE_RANGE_FORMAT() that avoids
defining helper functions for the bit-field ranges.

Signed-off-by: Sukadev Bhattiprolu 
---
 arch/powerpc/perf/hv-common.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/perf/hv-common.h b/arch/powerpc/perf/hv-common.h
index 5d79cec..349aaba 100644
--- a/arch/powerpc/perf/hv-common.h
+++ b/arch/powerpc/perf/hv-common.h
@@ -20,6 +20,16 @@ unsigned long hv_perf_caps_get(struct hv_perf_caps *caps);
 PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);  \
 EVENT_DEFINE_RANGE(name, attr_var, bit_start, bit_end)
 
+/*
+ * The EVENT_DEFINE_RANGE_FORMAT() macro above includes helper functions
+ * for the fields (eg: event_get_starting_index()). For some fields we
+ * need the bit-range definition, but not the helper functions. Define a
+ * lite version of the above macro without the helpers and silence
+ * compiler warnings about unused static functions.
+ */
+#define EVENT_DEFINE_RANGE_FORMAT_LITE(name, attr_var, bit_start, bit_end) \
+PMU_FORMAT_ATTR(name, #attr_var ":" #bit_start "-" #bit_end);
+
 #define EVENT_DEFINE_RANGE(name, attr_var, bit_start, bit_end) \
 static u64 event_get_##name##_max(void) \
 {  \
-- 
1.8.3.1
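
The format string handed to PMU_FORMAT_ATTR() in the macro above is built entirely by the preprocessor: each argument is stringified with '#' and the adjacent string literals are then concatenated by the compiler. A small sketch of just that step follows. RANGE_FORMAT_STR() and starting_index_format() are hypothetical names, and the field "config" with bits 0-15 is an invented example rather than a range taken from the patch.

```c
/* Hypothetical stand-in for the string that PMU_FORMAT_ATTR() receives. */
#define RANGE_FORMAT_STR(attr_var, bit_start, bit_end) \
	#attr_var ":" #bit_start "-" #bit_end

static const char *starting_index_format(void)
{
	/* expands to "config" ":" "0" "-" "15", joined at compile time */
	return RANGE_FORMAT_STR(config, 0, 15);
}
```

The resulting string, "config:0-15", has the "field:first-last" shape that sysfs format entries use to describe a bit range.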

[PATCH v7 7/7] powerpc/perf/hv-24x7: Document sysfs event description entries

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v6]
Update Contact info to Linux on Power Developer list

 .../testing/sysfs-bus-event_source-devices-hv_24x7 | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
index 32f3f5f..f893337 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -21,3 +21,25 @@ Contact: Linux on PowerPC Developer List 

 Description:
Exposes the "version" field of the 24x7 catalog. This is also
extractable from the provided binary "catalog" sysfs entry.
+
+What:  /sys/bus/event_source/devices/hv_24x7/event_descs/
+Date:  February 2014
+Contact:   Linux on PowerPC Developer List 
+Description:
+   Provides the description of a particular event as provided by
+   the firmware. If firmware does not provide a description, no
+   file will be created.
+
+   Note that the event-name lacks the domain suffix appended for
+   events in the events/ dir.
+
+What:  /sys/bus/event_source/devices/hv_24x7/event_long_descs/
+Date:  February 2014
+Contact:   Linux on PowerPC Developer List 
+Description:
+   Provides the "long" description of a particular event as
+   provided by the firmware. If firmware does not provide a
+   description, no file will be created.
+
+   Note that the event-name lacks the domain suffix appended for
+   events in the events/ dir.
-- 
1.8.3.1


[PATCH v7 6/7] powerpc/perf/hv-gpci: add the remaining gpci requests

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

Add the remaining gpci requests that contain counters suitable for use
by perf. Omit those that don't contain any counters (but note their
omission).

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v6]

[Jiri Olsa, Sukadev Bhattiprolu] Replace 'starting_index' with what
it really means for the event. E.g., if starting_index refers to a
partition_id for an event, allow the user to specify a value for
partition_id rather than 'starting_index'. Also use a =? to indicate
required parameters, e.g. partition_id=?

 arch/powerpc/perf/hv-gpci-requests.h | 187 ++-
 1 file changed, 186 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-gpci-requests.h b/arch/powerpc/perf/hv-gpci-requests.h
index 8008544..acd1764 100644
--- a/arch/powerpc/perf/hv-gpci-requests.h
+++ b/arch/powerpc/perf/hv-gpci-requests.h
@@ -20,7 +20,9 @@
  *
  * - starting_index_kind is one of the following, depending on the event:
  *
- *   chip_id: hardware chip id or -1 for current hw chip
+ *   hw_chip_id: hardware chip id or -1 for current hw chip
+ *   partition_id
+ *   sibling_part_id,
  *   phys_processor_idx:
  *   0x: or -1, which means it is irrelevant for the event
  *
@@ -63,6 +65,33 @@ REQUEST(__count(0,   8,  processor_time_in_timebase_cycles)
 )
 #include I(REQUEST_END)
 
+#define REQUEST_NAME entitled_capped_uncapped_donated_idle_timebase_by_partition
+#define REQUEST_NUM 0x20
+#define REQUEST_IDX_KIND "sibling_part_id=?"
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 8,  partition_id)
+   __count(0x8,8,  entitled_cycles)
+   __count(0x10,   8,  consumed_capped_cycles)
+   __count(0x18,   8,  consumed_uncapped_cycles)
+   __count(0x20,   8,  cycles_donated)
+   __count(0x28,   8,  purr_idle_cycles)
+)
+#include I(REQUEST_END)
+
+/*
+ * Not available for counter_info_version >= 0x8, use
+ * run_instruction_cycles_by_partition(0x100) instead.
+ */
+#define REQUEST_NAME run_instructions_run_cycles_by_partition
+#define REQUEST_NUM 0x30
+#define REQUEST_IDX_KIND "sibling_part_id=?"
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 8,  partition_id)
+   __count(0x8,8,  instructions_completed)
+   __count(0x10,   8,  cycles)
+)
+#include I(REQUEST_END)
+
 #define REQUEST_NAME system_performance_capabilities
 #define REQUEST_NUM 0x40
 #define REQUEST_IDX_KIND "starting_index=0x"
@@ -73,4 +102,160 @@ REQUEST(__field(0, 1,  perf_collect_privileged)
 )
 #include I(REQUEST_END)
 
+#define REQUEST_NAME processor_bus_utilization_abc_links
+#define REQUEST_NUM 0x50
+#define REQUEST_IDX_KIND "hw_chip_id=?"
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  total_link_cycles)
+   __count(0x18,   8,  idle_cycles_for_a_link)
+   __count(0x20,   8,  idle_cycles_for_b_link)
+   __count(0x28,   8,  idle_cycles_for_c_link)
+   __array(0x30,   0x20,   reserved2)
+)
+#include I(REQUEST_END)
+
+#define REQUEST_NAME processor_bus_utilization_wxyz_links
+#define REQUEST_NUM 0x60
+#define REQUEST_IDX_KIND "hw_chip_id=?"
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  total_link_cycles)
+   __count(0x18,   8,  idle_cycles_for_w_link)
+   __count(0x20,   8,  idle_cycles_for_x_link)
+   __count(0x28,   8,  idle_cycles_for_y_link)
+   __count(0x30,   8,  idle_cycles_for_z_link)
+   __array(0x38,   0x28,   reserved2)
+)
+#include I(REQUEST_END)
+
+#define REQUEST_NAME processor_bus_utilization_gx_links
+#define REQUEST_NUM 0x70
+#define REQUEST_IDX_KIND "hw_chip_id=?"
+#include I(REQUEST_BEGIN)
+REQUEST(__field(0, 4,  hw_chip_id)
+   __array(0x4,0xC,reserved1)
+   __count(0x10,   8,  gx0_in_address_cycles)
+   __count(0x18,   8,  gx0_in_data_cycles)
+   __count(0x20,   8,  gx0_in_retries)
+   __count(0x28,   8,  gx0_in_bus_cycles)
+   __count(0x30,   8,  gx0_in_cycles_total)
+   __count(0x38,   8,  gx0_out_address_cycles)
+   __count(0x40,   8,  gx0_out_data_cycles)
+   __count(0x48,   8,  gx0_out_retries)
+   __count(0x50,   8,  gx0_out_bus_cycles)
+   __count(0x58,   8,  gx0_out_cycles_total)
+   __count(0x60,   8,  gx1_in_address_cycles)
+   __count(0x68,   8,  gx1_in_data_cycles)
+   __count(0x70,   8,  gx1_in_retries)
+   __count(0x78,   8,  gx1_in_bus_cycles)
+   __count(0x80,   8,  gx1_in_cycles_total)
+   __count(0x88,   8,  gx1_out_address_cycles)
+   __count(0x90,   8,  gx1_out_data_cycles)
+   __count(0x98,   8,  gx1_out_retries)
+   __count(0xA0,   8,  gx1_out_bus_cycles)
+   __c

[PATCH v7 5/7] powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated

2015-01-30 Thread Sukadev Bhattiprolu
From: Cody P Schafer 

This adds (in req-gen/) a framework for defining gpci counter requests.
It uses macro magic similar to ftrace.

Also convert the existing hv-gpci request structures and enum values to
use the new framework (and adjust old users of the structs and enum
values to cope with changes in naming).

In exchange for this macro disaster, we get autogenerated event listing
for GPCI in sysfs, build time field offset checking, and zero
duplication of information about GPCI requests.
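
To make the "macro magic" concrete, here is a minimal standalone sketch of the
general technique: one macro list is expanded several times to produce a
struct, build-time offset checks, and a table of counters. Every name in it
(DEMO_REQUEST, struct demo_request, demo_events, the AS_* helpers) is
hypothetical and chosen for illustration; it does not reproduce the req-gen/
headers, which work by re-including a request header rather than by passing
macros as arguments.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical request description. This list is the single source of
 * truth: F(offset, bytes, name) is a plain field, C(offset, bytes, name)
 * is a counter that should be exposed.
 */
#define DEMO_REQUEST(F, C)		\
	F(0x0,  4, chip_id)		\
	F(0x4,  4, reserved)		\
	C(0x8,  8, total_cycles)	\
	C(0x10, 8, idle_cycles)

/* Pass 1: generate the packed structure, one byte array per entry. */
#define AS_MEMBER(off, bytes, name)	uint8_t name[bytes];
struct demo_request {
	DEMO_REQUEST(AS_MEMBER, AS_MEMBER)
} __attribute__((packed));

/* Pass 2: fail the build if a member is not at its documented offset. */
#define AS_CHECK(off, bytes, name)					\
	_Static_assert(offsetof(struct demo_request, name) == (off),	\
		       "offset mismatch for " #name);
DEMO_REQUEST(AS_CHECK, AS_CHECK)

/* Pass 3: list only the counters, the way an event table would. */
struct demo_event {
	const char *name;
	size_t offset;
	size_t bytes;
};
#define AS_NOTHING(off, bytes, name)
#define AS_EVENT(off, bytes, name)	{ #name, (off), (bytes) },
static const struct demo_event demo_events[] = {
	DEMO_REQUEST(AS_NOTHING, AS_EVENT)
};
#define DEMO_EVENT_COUNT (sizeof(demo_events) / sizeof(demo_events[0]))
```

Changing an offset in DEMO_REQUEST without moving the corresponding entry
breaks the build in pass 2, which is the kind of check the description above
refers to.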

Signed-off-by: Cody P Schafer 
Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v6]
Replace 'starting_index' with what it really means for the
event. E.g., if starting_index refers to a partition_id for an
event, allow the user to specify a value for partition_id rather
than 'starting_index'. Also use a =? to indicate required
parameters, e.g. partition_id=?

 arch/powerpc/perf/hv-common.c  |  10 +-
 arch/powerpc/perf/hv-gpci-requests.h   |  76 ++
 arch/powerpc/perf/hv-gpci.c|  23 +
 arch/powerpc/perf/hv-gpci.h|  37 +++
 arch/powerpc/perf/req-gen/_begin.h |  13 +++
 arch/powerpc/perf/req-gen/_clear.h |   5 +
 arch/powerpc/perf/req-gen/_end.h   |   4 +
 arch/powerpc/perf/req-gen/_request-begin.h |  15 +++
 arch/powerpc/perf/req-gen/_request-end.h   |   8 ++
 arch/powerpc/perf/req-gen/perf.h   | 155 +
 10 files changed, 316 insertions(+), 30 deletions(-)
 create mode 100644 arch/powerpc/perf/hv-gpci-requests.h
 create mode 100644 arch/powerpc/perf/req-gen/_begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_clear.h
 create mode 100644 arch/powerpc/perf/req-gen/_end.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-begin.h
 create mode 100644 arch/powerpc/perf/req-gen/_request-end.h
 create mode 100644 arch/powerpc/perf/req-gen/perf.h

diff --git a/arch/powerpc/perf/hv-common.c b/arch/powerpc/perf/hv-common.c
index 47e02b3..7dce8f10 100644
--- a/arch/powerpc/perf/hv-common.c
+++ b/arch/powerpc/perf/hv-common.c
@@ -9,13 +9,13 @@ unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
unsigned long r;
struct p {
struct hv_get_perf_counter_info_params params;
-   struct cv_system_performance_capabilities caps;
+   struct hv_gpci_system_performance_capabilities caps;
} __packed __aligned(sizeof(uint64_t));
 
struct p arg = {
.params = {
.counter_request = cpu_to_be32(
-   CIR_SYSTEM_PERFORMANCE_CAPABILITIES),
+   HV_GPCI_system_performance_capabilities),
.starting_index = cpu_to_be32(-1),
.counter_info_version_in = 0,
}
@@ -31,9 +31,9 @@ unsigned long hv_perf_caps_get(struct hv_perf_caps *caps)
 
caps->version = arg.params.counter_info_version_out;
caps->collect_privileged = !!arg.caps.perf_collect_privileged;
-   caps->ga = !!(arg.caps.capability_mask & CV_CM_GA);
-   caps->expanded = !!(arg.caps.capability_mask & CV_CM_EXPANDED);
-   caps->lab = !!(arg.caps.capability_mask & CV_CM_LAB);
+   caps->ga = !!(arg.caps.capability_mask & HV_GPCI_CM_GA);
+   caps->expanded = !!(arg.caps.capability_mask & HV_GPCI_CM_EXPANDED);
+   caps->lab = !!(arg.caps.capability_mask & HV_GPCI_CM_LAB);
 
return r;
 }
diff --git a/arch/powerpc/perf/hv-gpci-requests.h b/arch/powerpc/perf/hv-gpci-requests.h
new file mode 100644
index 000..8008544
--- /dev/null
+++ b/arch/powerpc/perf/hv-gpci-requests.h
@@ -0,0 +1,76 @@
+
+#include "req-gen/_begin.h"
+
+/*
+ * Based on the document "getPerfCountInfo v1.07"
+ */
+
+/*
+ * #define REQUEST_NAME counter_request_name
+ * #define REQUEST_NUM r_num
+ * #define REQUEST_IDX_KIND starting_index_kind
+ * #include I(REQUEST_BEGIN)
+ * REQUEST(
+ * __field(...)
+ * __field(...)
+ * __array(...)
+ * __count(...)
+ * )
+ * #include I(REQUEST_END)
+ *
+ * - starting_index_kind is one of the following, depending on the event:
+ *
+ *   chip_id: hardware chip id or -1 for current hw chip
+ *   phys_processor_idx:
+ *   0x: or -1, which means it is irrelevant for the event
+ *
+ * __count(offset, bytes, name):
+ * a counter that should be exposed via perf
+ * __field(offset, bytes, name)
+ * a normal field
+ * __array(offset, bytes, name)
+ * an array of bytes
+ *
+ *
+ * @bytes for __count, and __field _must_ be a numeral token
+ * in decimal, not an expression and not in hex.
+ *
+ *
+ * TODO:
+ * - expose secondary index (if any counter ever uses it, only 0xA0
+ *   appears to use it right now, and it doesn't have any counters)
+ * - embed versioning info
+ * - include counter descriptions
+ */
+#define REQUEST_NAME dispatch_timebase_by_processor
+#define REQUEST_NUM 0x10
+#define REQUEST_ID

Re: [PATCH] powerpc/mm: bail out early when flushing TLB page

2015-01-30 Thread Scott Wood
On Fri, 2015-01-30 at 19:08 +0700, Arseny Solokha wrote:
> MMU_NO_CONTEXT is conditionally defined as 0 or (unsigned int)-1.

For nohash it is specifically -1.

>  However, in __flush_tlb_page() a corresponding variable is only tested
> for open coded 0, which can cause NULL pointer dereference if `mm'
> argument was legitimately passed as such.
> 
> Bail out early in case the first argument is NULL, thus eliminate confusion
> between different values of MMU_NO_CONTEXT and avoid disabling and then
> re-enabling preemption unnecessarily.

How did you notice this?  Did you see an oops, or was it code
inspection?  I'm wondering what codepath gets here with mm == NULL.

-Scott



Re: [PATCH V11 14/17] powerpc/powernv: Shift VF resource with an offset

2015-01-30 Thread Bjorn Helgaas
On Thu, Jan 15, 2015 at 10:28:04AM +0800, Wei Yang wrote:
> On PowerNV platform, resource position in M64 implies the PE# the resource
> belongs to. In some particular case, adjustment of a resource is necessary
> to locate it to a correct position in M64.
> 
> This patch introduces a function to shift the 'real' PF IOV BAR address
> according to an offset.
> 
> Signed-off-by: Wei Yang 
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c |   31 +
>  1 file changed, 31 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 8bad2b0..62bb2eb 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -749,6 +750,36 @@ static unsigned int pnv_ioda_dma_weight(struct pci_dev *dev)
>   return 10;
>  }
>  
> +#ifdef CONFIG_PCI_IOV
> +static void pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
> +{
> + struct pci_dn *pdn = pci_get_pdn(dev);
> + int i;
> + struct resource *res;
> + resource_size_t size;
> +
> + if (!dev->is_physfn)
> + return;
> +
> + for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
> + res = &dev->resource[i];
> + if (!res->flags || !res->parent)
> + continue;
> +
> + if (!pnv_pci_is_mem_pref_64(res->flags))
> + continue;
> +
> + dev_info(&dev->dev, " Shifting VF BAR %pR to\n", res);
> + size = pci_iov_resource_size(dev, i);
> + res->start += size*offset;

It seems like you should adjust res->end, too.  Am I missing something?

And I'm not sure it's safe to move the resource here, because if we move it
outside the bounds of the parent, we'll corrupt the resource tree.  Maybe
we're safe for some reason here, but it requires more analysis than I've
done to prove it.
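
As a rough illustration of both points, here is a standalone sketch that moves
start and end together and refuses a shift that would leave the parent window.
It is a model only: struct demo_res and demo_res_shift() are hypothetical
stand-ins, not struct resource or any PCI core helper, and the offset is
assumed to be non-negative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource: an inclusive [start, end] range. */
struct demo_res {
	uint64_t start;
	uint64_t end;
};

/*
 * Shift @res by @offset units of @unit_size, moving start and end
 * together so the size is preserved. If the result would overflow or
 * fall outside @parent, leave @res untouched and return false.
 */
static bool demo_res_shift(struct demo_res *res, const struct demo_res *parent,
			   uint64_t unit_size, unsigned int offset)
{
	uint64_t delta, new_start, new_end;

	/* Reject a multiplication that would wrap. */
	if (offset != 0 && unit_size > UINT64_MAX / offset)
		return false;
	delta = unit_size * offset;

	new_start = res->start + delta;
	new_end = res->end + delta;

	/* Reject an addition that would wrap. */
	if (new_start < res->start || new_end < new_start)
		return false;

	/* Reject anything that leaves the parent window. */
	if (new_start < parent->start || new_end > parent->end)
		return false;

	res->start = new_start;
	res->end = new_end;
	return true;
}
```

With a parent of [0x1000, 0x8fff] and a resource of [0x1000, 0x1fff], a shift
of 2 units of 0x1000 succeeds, while a further shift of 6 units is refused and
the resource keeps its previous bounds.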

> +
> + dev_info(&dev->dev, " %pR\n", res);
> + pci_update_resource(dev, i);
> + }
> + pdn->max_vfs -= offset;
> +}
> +#endif /* CONFIG_PCI_IOV */
> +
>  #if 0
>  static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  {
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH V4] tick/hotplug: Handover time related duties before cpu offline

2015-01-30 Thread Preeti U Murthy
These duties include do_timer to update jiffies and broadcast wakeups on those
platforms which do not have an external device to handle wakeup of cpus from
deep idle states. The handover of these duties is not robust against a cpu
offline operation today.

The do_timer duty is handed over in the CPU_DYING phase today to one of the
online cpus. This relies on the fact that *all* cpus participate in the
stop_machine phase. But if this design is to change in the future, i.e. if all
cpus are not required to participate in stop_machine, the freshly nominated
do_timer cpu could be idle at the time of handover. In that case, unless it's
interrupted, it will not wake up to update jiffies and timekeeping will hang.

With regard to broadcast wakeups, today if the cpu handling broadcast of
wakeups goes offline, the job of broadcasting is handed over to another cpu in
the CPU_DEAD phase. The CPU_DEAD notifiers are run only after the offline cpu
sets its state as CPU_DEAD. Meanwhile, the kthread doing the offline is
scheduled out while waiting for this transition by queuing a timer. This is
fatal because if the cpu on which this kthread was running has no other work
queued on it, it can re-enter deep idle state, since it sees that a broadcast
cpu still exists. However, the broadcast wakeup will never come since the cpu
which was handling it is offline, and the cpu on which the kthread doing the
hotplug operation was running never wakes up to see this because it's in deep
idle state.

Fix these issues by handing over the do_timer and broadcast wakeup duties just
before the offline cpu kills itself, to the cpu performing the hotplug
operation. Since the cpu performing the hotplug operation is up and running,
it becomes aware of the handover of do_timer duty and queues the broadcast
timer upon itself so as to seamlessly continue both these operations.
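
The intended ownership change can be modelled in a few lines. This is a sketch
under stated assumptions, not the patch itself: struct demo_tick_state and
demo_handover() are hypothetical names, and the model ignores locking and the
re-arming of the broadcast timer.

```c
/* Hypothetical model: which cpu currently owns each duty. */
struct demo_tick_state {
	int do_timer_cpu;	/* cpu that updates jiffies */
	int broadcast_cpu;	/* cpu that sends broadcast wakeups */
};

/*
 * Move any duty still owned by @dying_cpu to @hotplug_cpu, the cpu
 * driving the offline operation, which is known to be awake.
 * Returns the number of duties that changed owner.
 */
static int demo_handover(struct demo_tick_state *st, int dying_cpu,
			 int hotplug_cpu)
{
	int moved = 0;

	if (st->do_timer_cpu == dying_cpu) {
		st->do_timer_cpu = hotplug_cpu;
		moved++;
	}
	if (st->broadcast_cpu == dying_cpu) {
		st->broadcast_cpu = hotplug_cpu;
		moved++;
	}
	return moved;
}
```

The point of choosing the hotplug cpu as the target is that it cannot be in a
deep idle state at the time of the call, so neither duty is ever left with a
sleeping owner.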

It fixes the bug reported here:
http://linuxppc.10917.n7.nabble.com/offlining-cpus-breakage-td88619.html

Signed-off-by: Preeti U Murthy 
---
Changes from V3: https://lkml.org/lkml/2015/1/20/236
1. Move handover of broadcast duty away from CPU_DYING phase to just before
the cpu kills itself.
2. Club the handover of timekeeping duty along with broadcast duty to make
timekeeping robust against hotplug.

 include/linux/tick.h |2 ++
 kernel/cpu.c |4 
 kernel/time/clockevents.c|4 
 kernel/time/tick-broadcast.c |4 +---
 kernel/time/tick-common.c|   12 +---
 kernel/time/tick-internal.h  |3 ++-
 6 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index eda850c..6634be5 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -78,6 +78,7 @@ struct tick_sched {
 extern void __init tick_init(void);
 extern int tick_is_oneshot_available(void);
 extern struct tick_device *tick_get_device(int cpu);
+extern void tick_handover_tk(int hcpu);
 
 # ifdef CONFIG_HIGH_RES_TIMERS
 extern int tick_init_highres(void);
@@ -120,6 +121,7 @@ static inline int tick_oneshot_mode_active(void) { return 0; }
 #else /* CONFIG_GENERIC_CLOCKEVENTS */
 static inline void tick_init(void) { }
 static inline void tick_cancel_sched_timer(int cpu) { }
+static inline void tick_handover_tk(int hcpu) {}
 static inline void tick_clock_notify(void) { }
 static inline int tick_check_oneshot_change(int allow_nohz) { return 0; }
 static inline void tick_irq_enter(void) { }
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5d22023..329ec59 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "smpboot.h"
@@ -421,6 +422,9 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
while (!idle_cpu(cpu))
cpu_relax();
 
+   /* Handover timekeeping and broadcast duties to the current cpu */
+   tick_handover_tk(cpu);
+
/* This actually kills the CPU. */
__cpu_die(cpu);
 
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 5544990..2b10b13 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -566,10 +566,6 @@ int clockevents_notify(unsigned long reason, void *arg)
ret = tick_broadcast_oneshot_control(reason);
break;
 
-   case CLOCK_EVT_NOTIFY_CPU_DYING:
-   tick_handover_do_timer(arg);
-   break;
-
case CLOCK_EVT_NOTIFY_SUSPEND:
tick_suspend();
tick_suspend_broadcast();
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 066f0ec..4f59ede 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -669,7 +669,7 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
 }
 
-static void broadcast_move_bc(int deadcpu)
+void tick_handover_broadcast(int deadcpu)
 {
struct clock_event_device *bc = tick_broadc

Re: [PATCH] powerpc/mm: bail out early when flushing TLB page

2015-01-30 Thread Arseny Solokha
> On Fri, 2015-01-30 at 19:08 +0700, Arseny Solokha wrote:
>> MMU_NO_CONTEXT is conditionally defined as 0 or (unsigned int)-1.
>
> For nohash it is specifically -1.

>>  However, in __flush_tlb_page() a corresponding variable is only tested
>> for open coded 0, which can cause NULL pointer dereference if `mm'
>> argument was legitimately passed as such.
>>
>> Bail out early in case the first argument is NULL, thus eliminate confusion
>> between different values of MMU_NO_CONTEXT and avoid disabling and then
>> re-enabling preemption unnecessarily.
>
> How did you notice this?  Did you see an oops, or was it code
> inspection?  I'm wondering what codepath gets here with mm == NULL.

Just code inspection. It didn't seem right at first glance.
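
A minimal model of the concern, assuming the existing check maps a NULL mm to
a context id of 0 and then compares it against MMU_NO_CONTEXT. The body of
__flush_tlb_page() is not shown in this thread, so that shape is an assumption,
and struct demo_mm, DEMO_MMU_NO_CONTEXT and both helpers are hypothetical.

```c
#include <stddef.h>

/* Assumed sentinel for the nohash case, i.e. (unsigned int)-1. */
#define DEMO_MMU_NO_CONTEXT ((unsigned int)-1)

/* Hypothetical stand-in for the mm structure. */
struct demo_mm {
	unsigned int context_id;
};

/*
 * Model of the assumed current behaviour: a NULL mm becomes id 0, which
 * differs from the sentinel, so the function would go on to use mm.
 * Returns 1 if the flush path would proceed, 0 if it would bail out.
 */
static int demo_would_proceed_old(const struct demo_mm *mm)
{
	unsigned int id = mm ? mm->context_id : 0;

	return id != DEMO_MMU_NO_CONTEXT;
}

/* Model of the proposed behaviour: bail out before looking at the id. */
static int demo_would_proceed_new(const struct demo_mm *mm)
{
	if (!mm)
		return 0;
	return mm->context_id != DEMO_MMU_NO_CONTEXT;
}
```

In this model the old check lets a NULL mm through whenever the sentinel is
not 0, while the early return handles it regardless of how the sentinel is
defined.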

Arsény


>
> -Scott