Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Fri, 05 Aug 2016 21:16:00 +0200
Arnd Bergmann  wrote:

> On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > > 
> > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > index 0ec807d69f18..7a3ad269fa23 100644
> > > --- a/include/asm-generic/vmlinux.lds.h
> > > +++ b/include/asm-generic/vmlinux.lds.h
> > > @@ -433,7 +433,7 @@
> > >   * during second ld run in second ld pass when generating System.map */
> > >  #define TEXT_TEXT\
> > >   ALIGN_FUNCTION();   \
> > > - *(.text.hot .text .text.fixup .text.unlikely)   \
> > > + *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
> > >   *(.ref.text)\
> > >   MEM_KEEP(init.text) \
> > >   MEM_KEEP(exit.text) \
> > > 
> > > 
> > > It also got much faster again, the link time for an allyesconfig
> > > kernel is now 18 minutes instead of 10 hours, but it's still
> > > much worse than the 2 minutes I had earlier or the four minutes
> > > with the previous patch.  
> > 
> > Are you using the patches I just sent?  
> 
> Not yet, I was still busy with the older version, and trying to
> figure out exactly what went wrong in ld.bfd. FWIW, I first tried
> to see if the hash tables were just too small, but as it turned
> out that was not the problem. When I tried to change the default
> hash table sizes, making them bigger only made things slower.
> 
> I also found the --hash-size=xxx option, which has a significant
> impact on runtime speed. Interestingly again, using sizes less
> than the default made things faster in practice. If we can
> work out the optimum size for the kernel build, that might
> shave a few minutes off the total build time.
> 
> > Either way, you also need
> > to do the same for data and bss sections as you are using
> > -fdata-sections too.  
> 
> Right.
> 
> > I've found virtually no build time regression on powerpc or x86
> > when those are taken care of properly (x86 numbers I sent are typo,
> > it's not 5m20, it's 5m02).  
> 
> Interesting. I wonder if it's got something to do with the
> generation of the branch trampolines on ARM, as we have a lot
> of them on an allyesconfig.

Powerpc generates quite a few branch trampolines as well, so
I'm not sure if that would be the issue. Can you get a profile
of the link?

Are you linking with archives? Do your input archives have a
symbol index built?


> Is the 5m20 the total build time for the kernel, the time for
> rebuilding after a trivial change, or the time to call 'ld.bfd'
> once?

5m02 was the total time for x86 defconfig. With the powerpc
allyesconfig build, the final link:

$ time ld -EL -m elf64lppc -pie --emit-relocs --build-id --gc-sections -X \
    -o vmlinux -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive built-in.o \
    .tmp_kallsyms2.o

real    0m15.556s
user    0m13.288s
sys     0m2.240s

$ ls -lh vmlinux
-rwxrwxr-x 1 npiggin npiggin 279M Aug  6 14:02 vmlinux

Without -pie --emit-relocs it's 11.8s and 150M but I'm using
emit-relocs for a post-link step.
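
For a rough sense of what -pie --emit-relocs costs in this one measurement, the figures above can be turned into ratios. This is only a minimal sketch of the arithmetic; overhead_ratio() and print_emit_relocs_cost() are illustrative helper names, not anything from the kernel build.

```c
#include <stdio.h>

/* Ratio of a measurement taken with an option enabled to the same
 * measurement taken without it. */
double overhead_ratio(double with_opt, double without_opt)
{
    return with_opt / without_opt;
}

/* Print the relative cost of -pie --emit-relocs, using the numbers
 * quoted in this mail (wall-clock seconds and vmlinux size in MB). */
void print_emit_relocs_cost(void)
{
    const double real_with = 15.556, real_without = 11.8;
    const double size_with = 279.0, size_without = 150.0;

    printf("link time: %.2fx, vmlinux size: %.2fx\n",
           overhead_ratio(real_with, real_without),
           overhead_ratio(size_with, size_without));
}
```

Both numbers come from a single run on one machine, so the ratios are indicative only.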


> Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
> works and is really fast, or it crashes, depending on the
> configuration. I also don't think it supports big-endian ARM
> (which is what allyesconfig ends up using).

ld.bfd on both. Gold crashed on powerpc and I didn't try it on x86.

Thanks,
Nick


Re: [PATCH 1/5] kbuild: allow architectures to use thin archives instead of ld -r

2016-08-05 Thread kbuild test robot
Hi Stephen,

[auto build test ERROR on kbuild/for-next]
[also build test ERROR on v4.7]
[cannot apply to linus/master linux/master next-20160805]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/kbuild-changes-thin-archives-gc-sections/20160805-202258
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git for-next
config: um-allnoconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=um 

All errors (new ones prefixed by >>):

   init/built-in.o:(.bss+0x4): multiple definition of `reset_devices'
   init/built-in.o:(.bss+0x4): first defined here
   init/built-in.o: In function `prepare_namespace':
   (.init.text+0x988): multiple definition of `prepare_namespace'
   init/built-in.o:(.init.text+0x988): first defined here
   init/built-in.o:(.init.data+0x1070): multiple definition of `late_time_init'
   init/built-in.o:(.init.data+0x1070): first defined here
   init/built-in.o: In function `load_default_modules':
   (.init.text+0x2f8): multiple definition of `load_default_modules'
   init/built-in.o:(.init.text+0x2f8): first defined here
   init/built-in.o:(.bss+0x10): multiple definition of `system_state'
   init/built-in.o:(.bss+0x10): first defined here
   init/built-in.o: In function `mount_root':
   (.init.text+0x987): multiple definition of `mount_root'
   init/built-in.o:(.init.text+0x987): first defined here
   init/built-in.o: In function `mount_block_root':
   (.init.text+0x836): multiple definition of `mount_block_root'
   init/built-in.o:(.init.text+0x836): first defined here
   init/built-in.o:(.bss+0x14): multiple definition of `early_boot_irqs_disabled'
   init/built-in.o:(.bss+0x14): first defined here
   init/built-in.o:(.data+0x0): multiple definition of `loops_per_jiffy'
   init/built-in.o:(.data+0x0): first defined here
   init/built-in.o:(.bss+0xc): multiple definition of `saved_command_line'
   init/built-in.o:(.bss+0xc): first defined here
   init/built-in.o: In function `calibrate_delay':
   (.text+0x2e0): multiple definition of `calibrate_delay'
   init/built-in.o:(.text+0x2e0): first defined here
   init/built-in.o: In function `do_one_initcall':
   (.init.text+0x57a): multiple definition of `do_one_initcall'
   init/built-in.o:(.init.text+0x57a): first defined here
   init/built-in.o:(.rodata+0x20): multiple definition of `linux_proc_banner'
   init/built-in.o:(.rodata+0x20): first defined here
   init/built-in.o:(.init.data+0x20e4): multiple definition of `rd_doload'
   init/built-in.o:(.init.data+0x20e4): first defined here
   init/built-in.o:(.data+0x620): multiple definition of `init_task'
   init/built-in.o:(.data+0x620): first defined here
   init/built-in.o:(.data+0x460): multiple definition of `init_uts_ns'
   init/built-in.o:(.data+0x460): first defined here
   init/built-in.o:(.data+0x20): multiple definition of `envp_init'
   init/built-in.o:(.data+0x20): first defined here
   init/built-in.o: In function `name_to_dev_t':
   (.text+0x70): multiple definition of `name_to_dev_t'
   init/built-in.o:(.text+0x70): first defined here
   init/built-in.o:(.data+0x618): multiple definition of `root_mountflags'
   init/built-in.o:(.data+0x618): first defined here
   init/built-in.o:(.bss+0x8): multiple definition of `static_key_initialized'
   init/built-in.o:(.bss+0x8): first defined here
   init/built-in.o: In function `start_kernel':
   (.init.text+0x351): multiple definition of `start_kernel'
   init/built-in.o:(.init.text+0x351): first defined here
>> init/built-in.o:(.bss+0x30): multiple definition of `Version_263936'
   init/built-in.o:(.bss+0x30): first defined here
   init/built-in.o: In function `parse_early_options':
   (.init.text+0x2f9): multiple definition of `parse_early_options'
   init/built-in.o:(.init.text+0x2f9): first defined here
   init/built-in.o: In function `parse_early_param':
   (.init.text+0x31a): multiple definition of `parse_early_param'
   init/built-in.o:(.init.text+0x31a): first defined here
   init/built-in.o:(.bss+0x40): multiple definition of `preset_lpj'
   init/built-in.o:(.bss+0x40): first defined here
   init/built-in.o:(.data..init_task+0x0): multiple definition of `init_thread_union'
   init/built-in.o:(.data..init_task+0x0): first defined here
   init/built-in.o: In function `init_rootfs':
   (.init.text+0x80a): multiple definition of `init_rootfs'
   init/built-in.o:(.init.text+0x80a): first defined here
   init/built-in.o:(.rodata+0x80): multiple definition of `linux_banner'
   init/built-in.o:(.rodata+0x80): first defined here
   init/built-in.o:(.bss+0x34): multiple definition of `ROOT_

Re: [PATCH] powernv: Load correct TOC pointer while waking up from winkle.

2016-08-05 Thread Mahesh Jagannath Salgaonkar
On 08/06/2016 04:08 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2016-08-05 at 19:13 +0530, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar 
>>
>> The function pnv_restore_hyp_resource() loads the TOC into r2 from
>> the invalid PACA pointer before fixing the r13 value. This does not affect
>> POWER ISA 3.0, but it does have an impact on POWER ISA 2.07 or earlier,
>> leading the CPU to get stuck forever.
> 
> When was this broken ? Should this get backported to stable ?

This was broken by the recent Power9 CPU idle changes (commit bcef83a00)
that went into Linus' master after v4.7. We are fine with v4.7.

-Mahesh.

> 
>>  login: [  471.830433] Processor 120 is stuck.
>>
>>
>> This can be easily reproduced using the following steps:
>> - Turn off SMT
>>  $ ppc64_cpu --smt=off
>> - offline/online any online cpu (Thread 0 of any core which is
>> online)
>>  $ echo 0 > /sys/devices/system/cpu/cpu/online
>>  $ echo 1 > /sys/devices/system/cpu/cpu/online
>>
>> For POWER ISA 2.07 or earlier, the last bit of HSPRG0 is set, indicating
>> that the thread is waking up from winkle. Hence, the last bit of HSPRG0 (r13)
>> needs to be cleared before accessing it as PACA, to avoid loading invalid
>> values from an invalid PACA pointer.
>>
>> Fix this by loading TOC after r13 register is corrected.
>>
>> Signed-off-by: Mahesh Salgaonkar 
>> ---
>>  arch/powerpc/kernel/idle_book3s.S |5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/idle_book3s.S
>> b/arch/powerpc/kernel/idle_book3s.S
>> index 8a56a51..45784ec 100644
>> --- a/arch/powerpc/kernel/idle_book3s.S
>> +++ b/arch/powerpc/kernel/idle_book3s.S
>> @@ -363,8 +363,8 @@ _GLOBAL(power9_idle_stop)
>>   * cr3 - set to gt if waking up with partial/complete hypervisor
>> state loss
>>   */
>>  _GLOBAL(pnv_restore_hyp_resource)
>> -ld  r2,PACATOC(r13);
>>  BEGIN_FTR_SECTION
>> +ld  r2,PACATOC(r13);
>>  /*
>>   * POWER ISA 3. Use PSSCR to determine if we
>>   * are waking up from deep idle state
>> @@ -395,6 +395,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
>>   */
>>  clrldi  r5,r13,63
>>  clrrdi  r13,r13,1
>> +
>> +/* Now that we are sure r13 is corrected, load TOC */
>> +ld  r2,PACATOC(r13);
>>  cmpwi   cr4,r5,1
>>  mtspr   SPRN_HSPRG0,r13
>>  
> 



Re: Internal CompactFlash (CF) card device not recognised after the powerpc-4.8-1 merge

2016-08-05 Thread Darren Stevens
Hello Nicholas

On 06/08/2016, Nicholas Piggin wrote:
> On Fri, 5 Aug 2016 16:38:17 +0200
> Christian Zigotzky  wrote:
>
>> Hi All,
>
>
> Hi Christian,
[...]
> As for your driver support, it would indeed be a good idea to
> get it supported in the upstream kernel. You should post a
> new mail about that. Take a look at these 3 commits:
>
> 61f7162117d4767875825abf2f6ed1eeebbcceed
> 9cd55be4d22376893d2818ce3c0e5706a3d74121
> ca99140a63b7326ee9a38f64c326317f2c63b594
>
> Your patch comes from code based on the second one. The last
> commit removed it, and says that it is not the best way to
> implement it. You could cc this list and some of the people
> involved with those commits and ask for advice about
> getting your driver supported.

Actually, it's almost supported by the base PASemi code anyway: there is a
PCMCIA driver in setup.c, but our hardware has been changed enough to make it
hang the system when used. I should probably take another look and see if I
can fix that. The other option would be to move the above patch to where we
add our rtc platform device. One thing that does confuse me is the interrupt:
it's attached to GPIO 0, so it should be on the mpic int 0 (or maybe 16?), but I
could never get it to work.

Regards
Darren



Re: [PATCH v3 2/2] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.

2016-08-05 Thread Benjamin Herrenschmidt
On Fri, 2016-08-05 at 17:34 +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar 
> 
> The current implementation of MCE early handling modifies CR0/1
> registers
> without saving its old values. Fix this by moving early check for
> powersaving mode to machine_check_handle_early().

CC stable ?

> The power architecture 2.06 or later allows the possibility of
> getting
> machine check while in nap/sleep/winkle. The last bit of HSPRG0 is
> set
> to 1, if thread is woken up from winkle. Hence, clear the last bit of
> HSPRG0 (r13) before MCE handler starts using it as paca pointer.
> 
> Also, the current code always puts the thread into nap state
> irrespective
> of whatever idle state it woke up from. Fix that by looking at
> paca->thread_idle_state and put the thread back into same state where
> it
> came from.
> 
> Reported-by: Paul Mackerras 
> Signed-off-by: Mahesh Salgaonkar 
> Reviewed-by: Shreyas B. Prabhu 
> ---
> Change in v3:
> - Rebase to Linus' master.
> 
> Change in v2:
> - Call IDLE_STATE_ENTER_SEQ(PPC_NAP) instead of
> power7_enter_nap_mode()
>   to be consistent with other part of code.
> ---
>  arch/powerpc/kernel/exceptions-64s.S |   69 --
> 
>  1 file changed, 40 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S
> b/arch/powerpc/kernel/exceptions-64s.S
> index 694def6..a59c9cc 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -144,29 +144,14 @@ machine_check_pSeries_1:
>    * vector
>    */
>   SET_SCRATCH0(r13)   /* save r13 */
> -#ifdef CONFIG_PPC_P7_NAP
> -BEGIN_FTR_SECTION
> - /* Running native on arch 2.06 or later, check if we are
> -  * waking up from nap. We only handle no state loss and
> -  * supervisor state loss. We do -not- handle hypervisor
> -  * state loss at this time.
> + /*
> +  * Running native on arch 2.06 or later, we may wakeup from
> winkle
> +  * inside machine check. If yes, then last bit of HSPGR0
> would be set
> +  * to 1. Hence clear it unconditionally.
>    */
> - mfspr   r13,SPRN_SRR1
> - rlwinm. r13,r13,47-31,30,31
> - OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
> - beq 9f
> -
> - mfspr   r13,SPRN_SRR1
> - rlwinm. r13,r13,47-31,30,31
> - /* waking up from powersave (nap) state */
> - cmpwi   cr1,r13,2
> - /* Total loss of HV state is fatal. let's just stay stuck
> here */
> - OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
> - bgt cr1,.
> -9:
> - OPT_SET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
> -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
> -#endif /* CONFIG_PPC_P7_NAP */
> + GET_PACA(r13)
> + clrrdi  r13,r13,1
> + SET_PACA(r13)
>   EXCEPTION_PROLOG_0(PACA_EXMC)
>  BEGIN_FTR_SECTION
>   b   machine_check_powernv_early
> @@ -1273,25 +1258,51 @@ machine_check_handle_early:
>    * Check if thread was in power saving mode. We come here
> when any
>    * of the following is true:
>    * a. thread wasn't in power saving mode
> -  * b. thread was in power saving mode with no state loss or
> -  *supervisor state loss
> +  * b. thread was in power saving mode with no state loss,
> +  *supervisor state loss or hypervisor state loss.
>    *
> -  * Go back to nap again if (b) is true.
> +  * Go back to nap/sleep/winkle mode again if (b) is true.
>    */
>   rlwinm. r11,r12,47-31,30,31 /* Was it in power
> saving mode? */
>   beq 4f  /* No, it wasn;t */
>   /* Thread was in power saving mode. Go back to nap again. */
>   cmpwi   r11,2
> - bne 3f
> - /* Supervisor state loss */
> + blt 3f
> + /* Supervisor/Hypervisor state loss */
>   li  r0,1
>   stb r0,PACA_NAPSTATELOST(r13)
>  3:   bl  machine_check_queue_event
>   MACHINE_CHECK_HANDLER_WINDUP
>   GET_PACA(r13)
>   ld  r1,PACAR1(r13)
> - li  r3,PNV_THREAD_NAP
> - b   pnv_enter_arch207_idle_mode
> + /*
> +  * Check what idle state this CPU was in and go back to same
> mode
> +  * again.
> +  */
> + lbz r3,PACA_THREAD_IDLE_STATE(r13)
> + cmpwi   r3,PNV_THREAD_NAP
> + bgt 10f
> + IDLE_STATE_ENTER_SEQ(PPC_NAP)
> + /* No return */
> +10:
> + cmpwi   r3,PNV_THREAD_SLEEP
> + bgt 2f
> + IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
> + /* No return */
> +
> +2:
> + /*
> +  * Go back to winkle. Please note that this thread was woken
> up in
> +  * machine check from winkle and have not restored the per-
> subcore
> +  * state. Hence before going back to winkle, set last bit of
> HSPGR0
> +  * to 1. This will make sure that if this thread gets woken
> up
> +  * again at reset vector 0x100 then it will get chance to
> restore
> +  * the subcore state.
> +  */
> + ori r13,r13,1
> + SET_PACA(r13)
> + IDL
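
The control flow in the quoted hunk can be restated in C, which may make the review easier. This is a sketch only, not the handler itself: it assumes r12 holds SRR1 at that point (as the "Was it in power saving mode?" comment suggests), the enum values are assumed to mirror the kernel's PNV_THREAD_* constants, and all function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed to mirror PNV_THREAD_RUNNING/NAP/SLEEP/WINKLE. */
enum idle_state {
    THREAD_RUNNING = 0,
    THREAD_NAP     = 1,
    THREAD_SLEEP   = 2,
    THREAD_WINKLE  = 3
};

/* "rlwinm. r11,r12,47-31,30,31": extract SRR1 bits 46:47 (IBM bit
 * numbering), which sit 16 bits above the least significant bit. */
unsigned int srr1_wake_state(uint64_t srr1)
{
    return (unsigned int)((srr1 >> 16) & 0x3);
}

/* "cmpwi r11,2; blt 3f": a wake state of 2 or more is treated as
 * supervisor or hypervisor state loss. */
int state_was_lost(uint64_t srr1)
{
    return srr1_wake_state(srr1) >= 2;
}

/* The cmpwi/bgt ladder on paca->thread_idle_state: pick the idle
 * state to go back to. */
enum idle_state reentry_state(unsigned int thread_idle_state)
{
    if (thread_idle_state <= THREAD_NAP)
        return THREAD_NAP;
    if (thread_idle_state <= THREAD_SLEEP)
        return THREAD_SLEEP;
    return THREAD_WINKLE;
}

/* "ori r13,r13,1": before going back to winkle, set the low bit of
 * HSPRG0 again so the next wakeup restores the per-subcore state. */
uint64_t hsprg0_for_reentry(uint64_t paca, enum idle_state state)
{
    return state == THREAD_WINKLE ? (paca | 1ULL) : paca;
}
```

The old code corresponds to reentry_state() always returning THREAD_NAP, which is the behaviour the commit message describes as wrong.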

Re: [PATCH] powernv: Load correct TOC pointer while waking up from winkle.

2016-08-05 Thread Benjamin Herrenschmidt
On Fri, 2016-08-05 at 19:13 +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar 
> 
> The function pnv_restore_hyp_resource() loads the TOC into r2 from
> the invalid PACA pointer before fixing the r13 value. This does not affect
> POWER ISA 3.0, but it does have an impact on POWER ISA 2.07 or earlier,
> leading the CPU to get stuck forever.

When was this broken ? Should this get backported to stable ?

>   login: [  471.830433] Processor 120 is stuck.
> 
> 
> This can be easily reproduced using the following steps:
> - Turn off SMT
>   $ ppc64_cpu --smt=off
> - offline/online any online cpu (Thread 0 of any core which is
> online)
>   $ echo 0 > /sys/devices/system/cpu/cpu/online
>   $ echo 1 > /sys/devices/system/cpu/cpu/online
> 
> For POWER ISA 2.07 or earlier, the last bit of HSPRG0 is set, indicating
> that the thread is waking up from winkle. Hence, the last bit of HSPRG0 (r13)
> needs to be cleared before accessing it as PACA, to avoid loading invalid
> values from an invalid PACA pointer.
> 
> Fix this by loading TOC after r13 register is corrected.
> 
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  arch/powerpc/kernel/idle_book3s.S |5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/idle_book3s.S
> b/arch/powerpc/kernel/idle_book3s.S
> index 8a56a51..45784ec 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -363,8 +363,8 @@ _GLOBAL(power9_idle_stop)
>   * cr3 - set to gt if waking up with partial/complete hypervisor
> state loss
>   */
>  _GLOBAL(pnv_restore_hyp_resource)
> - ld  r2,PACATOC(r13);
>  BEGIN_FTR_SECTION
> + ld  r2,PACATOC(r13);
>   /*
>    * POWER ISA 3. Use PSSCR to determine if we
>    * are waking up from deep idle state
> @@ -395,6 +395,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
>    */
>   clrldi  r5,r13,63
>   clrrdi  r13,r13,1
> +
> + /* Now that we are sure r13 is corrected, load TOC */
> + ld  r2,PACATOC(r13);
>   cmpwi   cr4,r5,1
>   mtspr   SPRN_HSPRG0,r13
>  
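
For readers less familiar with the two rotate-and-clear mnemonics in the second hunk, here is a minimal C sketch of what they compute. The function names are illustrative only, and the pointer values used with them would be made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* C equivalent of "clrldi r5,r13,63": clear the upper 63 bits, which
 * leaves only the least significant bit (the winkle wakeup flag). */
bool winkle_flag(uint64_t hsprg0)
{
    return (hsprg0 & 1ULL) != 0;
}

/* C equivalent of "clrrdi r13,r13,1": clear the least significant
 * bit, which turns the raw HSPRG0 value back into a PACA pointer. */
uint64_t paca_pointer(uint64_t hsprg0)
{
    return hsprg0 & ~1ULL;
}
```

With the flag set, a load at PACATOC(r13) reads from an address that is one byte past the real PACA field, which is why the TOC load has to come after the clrrdi on ISA 2.07 and earlier.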


Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread Scott Wood
On Fri, 2016-08-05 at 21:20 +, york sun wrote:
> On 08/05/2016 02:09 PM, Scott Wood wrote:
> > 
> > On Fri, 2016-08-05 at 20:29 +, york sun wrote:
> > > 
> > > On 08/04/2016 08:43 PM, Michael Ellerman wrote:
> > > > 
> > > > 
> > > > Does the driver really need to use these routines? They're meant for
> > > > use
> > > > early in boot, before PCI is setup.
> > > > 
> > > > AFAICS this is just a regular driver, so when it's probed the PCI
> > > > devices should have already been scanned. In which case
> > > > pci_get_device()
> > > > could work couldn't it? (I see other edac drivers doing that).
> > > I am trying to fix this but need some help. We are dealing with PCIe
> > > controller here. Does it have a bus number assigned at this point? If
> > > yes, how can I find it? I seem not able to find out where the
> > > platform_data is filled as well. Can someone kindly point it out to me?
> > 
> > The platform data comes from add_err_dev() in
> > arch/powerpc/sysdev/fsl_pci.c.
> > 
> Thanks, Scott.
> 
> When add_err_dev() is called, pci is not scanned, is using 
> early_find_capability() justified?

The edac driver is registered with a normal device-level initcall.  The PCI
scanning appears to happen at the subsys initcall level.
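
To spell out the ordering argument: the kernel runs initcalls level by level at boot, and the subsys level comes before the device level. A small sketch of that ordering follows; the enum and runs_before() are illustrative names, while the numeric order follows the kernel's initcall levels.

```c
#include <stdbool.h>

/* Initcall levels in the order they run at boot. */
enum initcall_level {
    LEVEL_PURE = 0,
    LEVEL_CORE,
    LEVEL_POSTCORE,
    LEVEL_ARCH,
    LEVEL_SUBSYS,
    LEVEL_FS,
    LEVEL_DEVICE,
    LEVEL_LATE
};

/* True if code registered at level 'a' runs before code at level 'b'. */
bool runs_before(enum initcall_level a, enum initcall_level b)
{
    return a < b;
}
```

If the PCI scan really is done from a subsys initcall, it would be complete by the time a driver registered at device level is probed; that still needs checking against the actual platform code.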

-Scott



Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread york sun
On 08/05/2016 02:09 PM, Scott Wood wrote:
> On Fri, 2016-08-05 at 20:29 +, york sun wrote:
>> On 08/04/2016 08:43 PM, Michael Ellerman wrote:
>>>
>>> Does the driver really need to use these routines? They're meant for use
>>> early in boot, before PCI is setup.
>>>
>>> AFAICS this is just a regular driver, so when it's probed the PCI
>>> devices should have already been scanned. In which case pci_get_device()
>>> could work couldn't it? (I see other edac drivers doing that).
>> I am trying to fix this but need some help. We are dealing with PCIe
>> controller here. Does it have a bus number assigned at this point? If
>> yes, how can I find it? I seem not able to find out where the
>> platform_data is filled as well. Can someone kindly point it out to me?
>
>
> The platform data comes from add_err_dev() in arch/powerpc/sysdev/fsl_pci.c.
>

Thanks, Scott.

When add_err_dev() is called, pci is not scanned, is using 
early_find_capability() justified?

York


Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread Scott Wood
On Fri, 2016-08-05 at 20:29 +, york sun wrote:
> On 08/04/2016 08:43 PM, Michael Ellerman wrote:
> > 
> > Does the driver really need to use these routines? They're meant for use
> > early in boot, before PCI is setup.
> > 
> > AFAICS this is just a regular driver, so when it's probed the PCI
> > devices should have already been scanned. In which case pci_get_device()
> > could work couldn't it? (I see other edac drivers doing that).
> I am trying to fix this but need some help. We are dealing with PCIe 
> controller here. Does it have a bus number assigned at this point? If 
> yes, how can I find it? I seem not able to find out where the 
> platform_data is filled as well. Can someone kindly point it out to me?


The platform data comes from add_err_dev() in arch/powerpc/sysdev/fsl_pci.c.

-Scott



Re: [PATCH v2 3/3] kexec: extend kexec_file_load system call

2016-08-05 Thread Thiago Jung Bauermann
Hi,

Am Dienstag, 26 Juli 2016, 21:24:29 schrieb Thiago Jung Bauermann:
> Notes:
> This is a new version of the last patch in this series which adds
> a function where each architecture can verify if the DTB is safe
> to load:
> 
> int __weak arch_kexec_verify_buffer(enum kexec_file_type type,
> const void *buf,
> unsigned long size)
> {
> return -EINVAL;
> }
> 
> I will then provide an implementation in my powerpc patch series
> which checks that the DTB only contains nodes and properties from a
> whitelist. arch_kexec_kernel_image_load will copy these properties
> to the device tree blob the kernel was booted with (and perform
> other changes such as setting /chosen/bootargs, of course).

Is this approach ok? If so, I'll post a patch next week adding an 
arch_kexec_verify_buffer hook for powerpc to enforce the whitelist, and also 
a new version of the patches implementing kexec_file_load for powerpc on top 
of this series.
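
To make the whitelist idea concrete, here is a rough userspace sketch of the check. It is not the planned powerpc implementation: the list contents, prop_is_allowed() and verify_props() are hypothetical, and a real hook would walk the flattened device tree rather than take an array of path strings.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical whitelist; the real one would come with the powerpc
 * patch series. */
static const char *const allowed_props[] = {
    "/chosen/bootargs",
    "/chosen/linux,initrd-start",
    "/chosen/linux,initrd-end",
};

/* True if the given property path is on the whitelist. */
bool prop_is_allowed(const char *path)
{
    size_t i;

    for (i = 0; i < sizeof(allowed_props) / sizeof(allowed_props[0]); i++)
        if (strcmp(path, allowed_props[i]) == 0)
            return true;
    return false;
}

/* Return 0 if every property path is whitelisted, -EINVAL otherwise,
 * matching the default return value of the weak hook. */
int verify_props(const char *const *paths, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (!prop_is_allowed(paths[i]))
            return -EINVAL;
    return 0;
}
```

Rejecting the whole buffer on the first unknown entry keeps the default-deny behaviour of the weak function.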

Eric, does this address your concerns?

-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center



Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread york sun
On 08/04/2016 08:43 PM, Michael Ellerman wrote:
> York Sun  writes:
>
>> Two symbols are missing if mpc85xx_edac driver is compiled as module.
>>
>> Signed-off-by: York Sun 
>>
>> ---
>> Change log
>>   v3: Change subject tag
>>   v2: no change
>>
>>  arch/powerpc/kernel/pci-common.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> index 0f7a60f..86bc484 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -226,6 +226,7 @@ struct pci_controller* pci_find_hose_for_OF_device(struct device_node* node)
>>  }
>>  return NULL;
>>  }
>> +EXPORT_SYMBOL(pci_find_hose_for_OF_device);
>>
>>  /*
>>   * Reads the interrupt pin to determine if interrupt is use by card.
>> @@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, int bus, int devfn,
>>  {
>>  return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
>>  }
>> +EXPORT_SYMBOL(early_find_capability);
>
> Does the driver really need to use these routines? They're meant for use
> early in boot, before PCI is setup.
>
> AFAICS this is just a regular driver, so when it's probed the PCI
> devices should have already been scanned. In which case pci_get_device()
> could work couldn't it? (I see other edac drivers doing that).

I am trying to fix this but need some help. We are dealing with PCIe 
controller here. Does it have a bus number assigned at this point? If 
yes, how can I find it? I seem not able to find out where the 
platform_data is filled as well. Can someone kindly point it out to me?

York



Re: [patch] powerpc/fsl_rio: fix a missing error code

2016-08-05 Thread Dan Carpenter
On Thu, Aug 04, 2016 at 01:16:00PM -0700, Andrew Morton wrote:
> On Thu, 4 Aug 2016 08:35:25 +0300 Dan Carpenter  
> wrote:
> 
> > We should set the error code here.  Otherwise static checkers complain.
> > 
> 
> hm.
> 
> > --- a/arch/powerpc/sysdev/fsl_rio.c
> > +++ b/arch/powerpc/sysdev/fsl_rio.c
> > @@ -491,6 +491,7 @@ int fsl_rio_setup(struct platform_device *dev)
> > rmu_node = of_parse_phandle(dev->dev.of_node, "fsl,srio-rmu-handle", 0);
> > if (!rmu_node) {
> > dev_err(&dev->dev, "No valid fsl,srio-rmu-handle property\n");
> > +   rc = -ENOENT;
> > goto err_rmu;
> > }
> > rc = of_address_to_resource(rmu_node, 0, &rmu_regs);
> 
> afaict the function will return 0 in this case, which is a flat out
> bug.  But why do static checkers complain?  The code will return a
> suitably initialized value?
> 
> IOW, please always quote the checker/compiler output when fixing a bug!

Coccinelle has a check for missing error codes and I'm working on one
for Smatch as well.
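
The pattern those checks look for can be reduced to a few lines. The sketch below is a standalone illustration, not the fsl_rio code: setup_buggy() and setup_fixed() are made-up names, and the pointer argument stands in for the result of of_parse_phandle().

```c
#include <errno.h>
#include <stddef.h>

/* Buggy shape: rc still holds 0 from its initialisation when the
 * lookup fails, so the error path reports success. */
int setup_buggy(const void *rmu_node)
{
    int rc = 0;

    if (!rmu_node)
        goto err;
    /* ... rest of the setup would go here ... */
err:
    return rc;
}

/* Fixed shape: set the error code before jumping to the error path. */
int setup_fixed(const void *rmu_node)
{
    int rc = 0;

    if (!rmu_node) {
        rc = -ENOENT;
        goto err;
    }
    /* ... rest of the setup would go here ... */
err:
    return rc;
}
```

A goto to an error label with the return variable still zero is the thing a missing-error-code check would flag.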

regards,
dan carpenter



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > 
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index 0ec807d69f18..7a3ad269fa23 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -433,7 +433,7 @@
> >   * during second ld run in second ld pass when generating System.map */
> >  #define TEXT_TEXT\
> >   ALIGN_FUNCTION();   \
> > - *(.text.hot .text .text.fixup .text.unlikely)   \
> > + *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
> >   *(.ref.text)\
> >   MEM_KEEP(init.text) \
> >   MEM_KEEP(exit.text) \
> > 
> > 
> > It also got much faster again, the link time for an allyesconfig
> > kernel is now 18 minutes instead of 10 hours, but it's still
> > much worse than the 2 minutes I had earlier or the four minutes
> > with the previous patch.
> 
> Are you using the patches I just sent?

Not yet, I was still busy with the older version, and trying to
figure out exactly what went wrong in ld.bfd. FWIW, I first tried
to see if the hash tables were just too small, but as it turned
out that was not the problem. When I tried to change the default
hash table sizes, making them bigger only made things slower.

I also found the --hash-size=xxx option, which has a significant
impact on runtime speed. Interestingly again, using sizes less
than the default made things faster in practice. If we can
work out the optimum size for the kernel build, that might
shave a few minutes off the total build time.

> Either way, you also need
> to do the same for data and bss sections as you are using
> -fdata-sections too.

Right.
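
To illustrate why the wildcards are needed once -ffunction-sections and -fdata-sections are in use (each function or object then lands in its own .text.<name>, .data.<name> or .bss.<name> input section), here is a small sketch of the matching. It assumes POSIX fnmatch() with no flags is a fair stand-in for ld's wildcard matching, and section_matches() is an illustrative name.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* True if an input section name matches a linker-script style
 * wildcard pattern. Assumption: fnmatch() without flags behaves
 * closely enough to ld's own glob matching for this purpose. */
bool section_matches(const char *pattern, const char *section)
{
    return fnmatch(pattern, section, 0) == 0;
}
```

Under that assumption, a plain ".text" pattern does not pick up ".text.start_kernel", while ".text.*" does; the same holds for ".data.*" and ".bss.*". Note that ".text.*" also covers names such as ".text.fixup" and ".text.unlikely", and ".data.*" also covers ".data..init_task", so the order and placement of the patterns in the linker script matter.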

> I've found virtually no build time regression on powerpc or x86
> when those are taken care of properly (x86 numbers I sent are typo,
> it's not 5m20, it's 5m02).

Interesting. I wonder if it's got something to do with the
generation of the branch trampolines on ARM, as we have a lot
of them on an allyesconfig.

Is the 5m20 the total build time for the kernel, the time for
rebuilding after a trivial change, or the time to call 'ld.bfd'
once?

Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
works and is really fast, or it crashes, depending on the
configuration. I also don't think it supports big-endian ARM
(which is what allyesconfig ends up using).

Arnd


Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread york sun
On 08/04/2016 04:36 PM, Andrew Donnellan wrote:
> On 05/08/16 08:58, York Sun wrote:
>> Two symbols are missing if mpc85xx_edac driver is compiled as module.
>>
>> Signed-off-by: York Sun 
>
> Good catch! One comment below.
>
> Reviewed-by: Andrew Donnellan 
>
>>  /*
>>   * Reads the interrupt pin to determine if interrupt is use by card.
>> @@ -1585,6 +1586,7 @@ int early_find_capability(struct pci_controller *hose, int bus, int devfn,
>>  {
>>  return pci_bus_find_capability(fake_pci_bus(hose, bus), devfn, cap);
>>  }
>> +EXPORT_SYMBOL(early_find_capability);
>
> It'd be nicer for this to be renamed as "pci_early_find_capability" or
> something like that with a "namespace", I think.
>

I will rename it if I respin this patch for any reason. Otherwise, I 
will send out another patch to rename it after merging.

York



Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size

2016-08-05 Thread Paul Clarke

Only nits from me...(see below)

On 08/05/2016 01:30 PM, Sukadev Bhattiprolu wrote:

Here is an updated patch to fix the build when CONFIG_PPC_PSERIES=n.
---
From d4f77a6ca7b6ea83f6588e7d541cc70bf001ae85 Mon Sep 17 00:00:00 2001
From: root 
Date: Thu, 4 Aug 2016 23:13:37 -0400
Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size

When booting a very large system with a large initrd we run out of space
for the flattened device tree (FDT). To fix this we must increase the
space allocated for the RMA region.

The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
the size there will apply to all systems, small and large, so we want to
increase the RMA region only when necessary.

When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
to the new RMA size (512MB) and issue a client-architecture-support (CAS)
call to the firmware. This will initiate a system reboot. Upon reboot we
notice the new property and update the RMA size accordingly.

Fix suggested by Michael Ellerman.

Signed-off-by: Sukadev Bhattiprolu 
---

[v2]:   - Add a comment in code regarding 'fixup_nr_cores_done'
- Fix build break when CONFIG_PPC_PSERIES=n
---
 arch/powerpc/kernel/prom_init.c | 96 -
 1 file changed, 95 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index f612a99..cbd5387 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -679,6 +679,7 @@ unsigned char ibm_architecture_vec[] = {
W(0x),  /* virt_base */
W(0x),  /* virt_size */
W(0x),  /* load_base */
+#define IBM_ARCH_VEC_MIN_RMA_OFFSET108
W(256), /* 256MB min RMA */
W(0x),  /* full client load */
0,  /* min RMA percentage of total RAM */
@@ -867,6 +868,14 @@ static void fixup_nr_cores(void)
 {
u32 cores;
unsigned char *ptcores;
+   static bool fixup_nr_cores_done = false;
+
+   /*
+* If this is a second CAS call in the same boot sequence, (see
+* increase_rma_size()), we don't need to do the fixup again.
+*/
+   if (fixup_nr_cores_done)
+   return;

/* We need to tell the FW about the number of cores we support.
 *
@@ -898,6 +907,41 @@ static void fixup_nr_cores(void)
ptcores[1] = (cores >> 16) & 0xff;
ptcores[2] = (cores >> 8) & 0xff;
ptcores[3] = cores & 0xff;
+   fixup_nr_cores_done = true;
+   }
+}
+
+static void __init fixup_rma_size(void)
+{
+   int rc;
+   u64 size;
+   unsigned char *min_rmap;
+   phandle optnode;
+   char str[64];
+
+   optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
+   if (!PHANDLE_VALID(optnode))
+   prom_panic("Cannot find /options");
+
+   /*
+* If a prior boot specified a new RMA size, use that size in
+* ibm_architecture_vec[]. See also increase_rma_size().
+*/
+   size = 0ULL;
+   memset(str, 0, sizeof(str));
+   rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
+   if (rc <= 0)
+   return;
+
+   size = prom_strtoul(str, NULL);
+   min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET];
+
+   if (size) {
+   prom_printf("Using RMA size %lu from ibm,new-rma-size.\n", size);
+   min_rmap[0] = (size >> 24) & 0xff;
+   min_rmap[1] = (size >> 16) & 0xff;
+   min_rmap[2] = (size >> 8) & 0xff;
+   min_rmap[3] = size & 0xff;
}
 }

@@ -911,6 +955,8 @@ static void __init prom_send_capabilities(void)

fixup_nr_cores();

+   fixup_rma_size();
+
/* try calling the ibm,client-architecture-support method */
prom_printf("Calling ibm,client-architecture-support...");
if (call_prom_ret("call-method", 3, 2, &ret,
@@ -946,6 +992,52 @@ static void __init prom_send_capabilities(void)
}
 #endif /* __BIG_ENDIAN__ */
 }
+
+static void __init increase_rma_size(void)
+{
+   int rc;
+   u64 size;
+   char str[64];
+   phandle optnode;
+
+   optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
+   if (!PHANDLE_VALID(optnode))
+   prom_panic("Cannot find /options");
+
+   /*
+* If we already increased the RMA size, return.
+*/
+   size = 0ULL;
+   memset(str, 0, sizeof(str));
+   rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
+
+   size = prom_strtoul(str, NULL);
+   if (size == 512ULL) {


Is this preferred over strncmp?  Using a string also helps with my suggestion 
below...
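
To make the question concrete, here is a small userspace sketch of the two ways to test the property value. It assumes the property holds a NUL-terminated decimal string such as "512"; both helper names are hypothetical, and the standard strtoul() stands in for prom_strtoul().

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse the property text and compare numerically. */
static bool rma_is_512_numeric(const char *prop)
{
	/* Accepts any spelling that parses to 512, e.g. "512" or "0512". */
	return strtoul(prop, NULL, 10) == 512UL;
}

/* Hypothetical helper: compare the property text directly. */
static bool rma_is_512_string(const char *prop)
{
	/* sizeof("512") includes the NUL, so "5120" does not match. */
	return strncmp(prop, "512", sizeof("512")) == 0;
}
```

The string form is stricter (only the exact text matches), while the numeric form tolerates alternate spellings of the same value.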


+   prom_printf("RMA size already at %lu.\n", size);
+ 

RE: [PATCH V2 7/7] thermal: qoriq: Add thermal management support

2016-08-05 Thread Hongtao Jia
Hi Eduardo,

If you have any comments please let me know.

Thanks.
-Hongtao. 


> -Original Message-
> From: Linuxppc-dev [mailto:linuxppc-dev-
> bounces+b38951=freescale@lists.ozlabs.org] On Behalf Of Hongtao Jia
> Sent: Tuesday, July 19, 2016 2:54 PM
> To: edubez...@gmail.com; rui.zh...@intel.com; Scott Wood
> 
> Cc: devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> p...@vger.kernel.org
> Subject: RE: [PATCH V2 7/7] thermal: qoriq: Add thermal management support
> 
> Hi Eduardo,
> 
> Any comments on this patch?
> 
> Thanks.
> -Hongtao.
> 
> > -Original Message-
> > From: Jia Hongtao [mailto:hongtao@nxp.com]
> > Sent: Thursday, June 30, 2016 11:09 AM
> > To: edubez...@gmail.com; rui.zh...@intel.com; robh...@kernel.org;
> > ga...@codeaurora.org; Scott Wood ;
> > shawn...@kernel.org
> > Cc: linux...@vger.kernel.org; devicet...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-arm-
> > ker...@lists.infradead.org; Hongtao Jia 
> > Subject: [PATCH V2 7/7] thermal: qoriq: Add thermal management support
> >
> > > This driver adds thermal management support by enabling TMU (Thermal
> > Monitoring Unit) on QorIQ platform.
> >
> > It's based on thermal of framework:
> > - Trip points defined in device tree.
> > - Cpufreq as cooling device registered in qoriq cpufreq driver.
> >
> > Signed-off-by: Jia Hongtao 
> > ---
> > Changes of V2:
> > * Add HAS_IOMEM dependency to fix build error on UM
> >
> >  drivers/thermal/Kconfig |  10 ++
> >  drivers/thermal/Makefile|   1 +
> >  drivers/thermal/qoriq_thermal.c | 328
> > 
> >  3 files changed, 339 insertions(+)
> >  create mode 100644 drivers/thermal/qoriq_thermal.c
> >
> > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig index
> > 2d702ca..56ef30d 100644
> > --- a/drivers/thermal/Kconfig
> > +++ b/drivers/thermal/Kconfig
> > @@ -195,6 +195,16 @@ config IMX_THERMAL
> >   cpufreq is used as the cooling device to throttle CPUs when the
> >   passive trip is crossed.
> >
> > +config QORIQ_THERMAL
> > +   tristate "QorIQ Thermal Monitoring Unit"
> > +   depends on THERMAL_OF
> > +   depends on HAS_IOMEM
> > +   help
> > + Support for Thermal Monitoring Unit (TMU) found on QorIQ platforms.
> > + It supports one critical trip point and one passive trip point. The
> > + cpufreq is used as the cooling device to throttle CPUs when the
> > + passive trip is crossed.
> > +
> >  config SPEAR_THERMAL
> > tristate "SPEAr thermal sensor driver"
> > depends on PLAT_SPEAR || COMPILE_TEST diff --git
> > a/drivers/thermal/Makefile b/drivers/thermal/Makefile index
> > 10b07c1..6662232 100644
> > --- a/drivers/thermal/Makefile
> > +++ b/drivers/thermal/Makefile
> > @@ -37,6 +37,7 @@ obj-$(CONFIG_DB8500_THERMAL)  +=
> db8500_thermal.o
> >  obj-$(CONFIG_ARMADA_THERMAL)   += armada_thermal.o
> >  obj-$(CONFIG_TANGO_THERMAL)+= tango_thermal.o
> >  obj-$(CONFIG_IMX_THERMAL)  += imx_thermal.o
> > +obj-$(CONFIG_QORIQ_THERMAL)+= qoriq_thermal.o
> >  obj-$(CONFIG_DB8500_CPUFREQ_COOLING)   += db8500_cpufreq_cooling.o
> >  obj-$(CONFIG_INTEL_POWERCLAMP) += intel_powerclamp.o
> >  obj-$(CONFIG_X86_PKG_TEMP_THERMAL) += x86_pkg_temp_thermal.o
> > diff --git a/drivers/thermal/qoriq_thermal.c
> > b/drivers/thermal/qoriq_thermal.c new file mode 100644 index
> > 000..644ba52
> > --- /dev/null
> > +++ b/drivers/thermal/qoriq_thermal.c
> > @@ -0,0 +1,328 @@
> > +/*
> > + * Copyright 2016 Freescale Semiconductor, Inc.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > +modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but
> > +WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> > +or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> > +License for
> > + * more details.
> > + *
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "thermal_core.h"
> > +
> > +#define SITES_MAX  16
> > +
> > +/*
> > + * QorIQ TMU Registers
> > + */
> > +struct qoriq_tmu_site_regs {
> > +   u32 tritsr; /* Immediate Temperature Site Register */
> > +   u32 tratsr; /* Average Temperature Site Register */
> > +   u8 res0[0x8];
> > +};
> > +
> > +struct qoriq_tmu_regs {
> > +   u32 tmr;/* Mode Register */
> > +#define TMR_DISABLE0x0
> > +#define TMR_ME 0x8000
> > +#define TMR_ALPF   0x0c00
> > +   u32 tsr;/* Status Register */
> > +   u32 tmtmir; /* Temperature measurement interval Register */
> > +#define TMTMIR_DEFAULT  

Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size

2016-08-05 Thread Sukadev Bhattiprolu
Here is an updated patch to fix the build when CONFIG_PPC_PSERIES=n.
---
From d4f77a6ca7b6ea83f6588e7d541cc70bf001ae85 Mon Sep 17 00:00:00 2001
From: root 
Date: Thu, 4 Aug 2016 23:13:37 -0400
Subject: [PATCH 2/2] powerpc/pseries: Dynamically grow RMA size

When booting a very large system with a large initrd we run out of space
for the flattened device tree (FDT). To fix this we must increase the
space allocated for the RMA region.

The RMA size is hard-coded in the 'ibm_architecture_vec[]' and increasing
the size there will apply to all systems, small and large, so we want to
increase the RMA region only when necessary.

When we run out of room for the FDT, set a new OF property, 'ibm,new-rma-size'
to the new RMA size (512MB) and issue a client-architecture-support (CAS)
call to the firmware. This will initiate a system reboot. Upon reboot we
notice the new property and update the RMA size accordingly.

Fix suggested by Michael Ellerman.

Signed-off-by: Sukadev Bhattiprolu 
---

[v2]:   - Add a comment in code regarding 'fixup_nr_cores_done'
- Fix build break when CONFIG_PPC_PSERIES=n
---
 arch/powerpc/kernel/prom_init.c | 96 -
 1 file changed, 95 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index f612a99..cbd5387 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -679,6 +679,7 @@ unsigned char ibm_architecture_vec[] = {
W(0x),  /* virt_base */
W(0x),  /* virt_size */
W(0x),  /* load_base */
+#define IBM_ARCH_VEC_MIN_RMA_OFFSET108
W(256), /* 256MB min RMA */
W(0x),  /* full client load */
0,  /* min RMA percentage of total RAM */
@@ -867,6 +868,14 @@ static void fixup_nr_cores(void)
 {
u32 cores;
unsigned char *ptcores;
+   static bool fixup_nr_cores_done = false;
+
+   /*
+* If this is a second CAS call in the same boot sequence, (see
+* increase_rma_size()), we don't need to do the fixup again.
+*/
+   if (fixup_nr_cores_done)
+   return;
 
/* We need to tell the FW about the number of cores we support.
 *
@@ -898,6 +907,41 @@ static void fixup_nr_cores(void)
ptcores[1] = (cores >> 16) & 0xff;
ptcores[2] = (cores >> 8) & 0xff;
ptcores[3] = cores & 0xff;
+   fixup_nr_cores_done = true;
+   }
+}
+
+static void __init fixup_rma_size(void)
+{
+   int rc;
+   u64 size;
+   unsigned char *min_rmap;
+   phandle optnode;
+   char str[64];
+
+   optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
+   if (!PHANDLE_VALID(optnode))
+   prom_panic("Cannot find /options");
+
+   /*
+* If a prior boot specified a new RMA size, use that size in
+* ibm_architecture_vec[]. See also increase_rma_size().
+*/
+   size = 0ULL;
+   memset(str, 0, sizeof(str));
+   rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
+   if (rc <= 0)
+   return;
+
+   size = prom_strtoul(str, NULL);
+   min_rmap = &ibm_architecture_vec[IBM_ARCH_VEC_MIN_RMA_OFFSET];
+
+   if (size) {
+   prom_printf("Using RMA size %lu from ibm,new-rma-size.\n", size);
+   min_rmap[0] = (size >> 24) & 0xff;
+   min_rmap[1] = (size >> 16) & 0xff;
+   min_rmap[2] = (size >> 8) & 0xff;
+   min_rmap[3] = size & 0xff;
}
 }
 
@@ -911,6 +955,8 @@ static void __init prom_send_capabilities(void)
 
fixup_nr_cores();
 
+   fixup_rma_size();
+
/* try calling the ibm,client-architecture-support method */
prom_printf("Calling ibm,client-architecture-support...");
if (call_prom_ret("call-method", 3, 2, &ret,
@@ -946,6 +992,52 @@ static void __init prom_send_capabilities(void)
}
 #endif /* __BIG_ENDIAN__ */
 }
+
+static void __init increase_rma_size(void)
+{
+   int rc;
+   u64 size;
+   char str[64];
+   phandle optnode;
+
+   optnode = call_prom("finddevice", 1, 1, ADDR("/options"));
+   if (!PHANDLE_VALID(optnode))
+   prom_panic("Cannot find /options");
+
+   /*
+* If we already increased the RMA size, return.
+*/
+   size = 0ULL;
+   memset(str, 0, sizeof(str));
+   rc = prom_getprop(optnode, "ibm,new-rma-size", &str, sizeof(str));
+
+   size = prom_strtoul(str, NULL);
+   if (size == 512ULL) {
+   prom_printf("RMA size already at %lu.\n", size);
+   return;
+   }
+   /*
+* Otherwise, set the ibm,new-rma-size property and initiate a CAS
+* reboot so the RMA size can take effect. See also ini
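
For readers following the min_rmap[] byte stores in fixup_rma_size() above, here is a minimal userspace sketch of the same big-endian encoding. The helper names put_be32() and get_be32() are made up for illustration, and the value is assumed to be a size in MB, as with the W(256) entry in the vector.

```c
#include <stdint.h>

/* Store a 32-bit value most-significant byte first, like the min_rmap[] writes. */
static void put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (v >> 24) & 0xff;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}

/* Read the value back, independent of the host's byte order. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

With a value of 512 the four bytes come out as 00 00 02 00, whatever the endianness of the machine running the code.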

Re: [PATCH v13 3/6] CPM/QE: use genalloc to manage CPM/QE muram

2016-08-05 Thread Christophe Leroy



On 30/11/2015 at 03:48, Zhao Qiang wrote:

Use genalloc to manage CPM/QE muram instead of rheap.

Signed-off-by: Zhao Qiang 
---
Changes for v9:
- splitted from patch 3/5, modify cpm muram management functions.
Changes for v10:
- modify cpm muram first, then move to qe_common
- modify commit.
Changes for v11:
- factor out the common alloc code
- modify min_alloc_order to zero for cpm_muram_alloc_fixed.
Changes for v12:
- Nil
Changes for v13:
- rebase

 arch/powerpc/include/asm/cpm.h   |   3 +
 arch/powerpc/platforms/Kconfig   |   4 +-
 arch/powerpc/sysdev/cpm_common.c | 126 +++
 lib/genalloc.c   |   2 +-
 4 files changed, 94 insertions(+), 41 deletions(-)



With that patch applied, I get the following Oops on an 8xx (which has a
CPM1).


cpm_muram_init() is called from setup_arch()

It seems that gen_pool_add() tries to kmalloc() memory but the SLAB is 
not available yet.
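
The suspected failure mode can be sketched in userspace C. This is only an illustration under assumptions: pool_create(), pool_add() and allocator_ready are hypothetical stand-ins, not the kernel's genalloc API, and the idea that the pool pointer is NULL because the early allocation failed is a guess based on the faulting data address in the trace (a small offset from NULL).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel's genalloc API. */
struct pool {
	int min_order;
	size_t nchunks;
};

/* Models whether the general-purpose allocator has been initialised yet. */
static int allocator_ready;

static struct pool *pool_create(int min_order)
{
	struct pool *p;

	if (!allocator_ready)
		return NULL;	/* early boot: nothing to allocate from */

	p = malloc(sizeof(*p));
	if (p) {
		p->min_order = min_order;
		p->nchunks = 0;
	}
	return p;
}

static int pool_add(struct pool *p)
{
	if (!p)
		return -1;	/* without this check, p->nchunks would fault near address 0 */

	p->nchunks++;
	return 0;
}
```

If that reading is right, the options are to check the return value of the pool creation or to defer the init until the allocator is up.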


[0.00] Unable to handle kernel paging request for data at address 0x0008
[0.00] Faulting instruction address: 0xc01acce0
[0.00] Oops: Kernel access of bad area, sig: 11 [#1]
[0.00] PREEMPT CMPC885
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14-s3k-dev-g0886ed8-svn #5
[0.00] task: c05183e0 ti: c0536000 task.ti: c0536000
[0.00] NIP: c01acce0 LR: c0011068 CTR: 
[0.00] REGS: c0537e50 TRAP: 0300   Not tainted (4.4.14-s3k-dev-g0886ed8-svn)
[0.00] MSR: 1032   CR: 28044428  XER: 
[0.00] DAR: 0008 DSISR: c000
GPR00: c0011068 c0537f00 c05183e0  9000  0bc0 
GPR08: ff003000 ff00b000 ff003bbf  22044422 100d43a8  07ff94e8
GPR16:  07bb5d70  07ff81f4 07ff81f4 07ff81f4  
GPR24: 07ffb3a0 07fe7628 c055 c7ffa190 c054 ff003bbf  0001

[0.00] NIP [c01acce0] gen_pool_add_virt+0x14/0xdc
[0.00] LR [c0011068] cpm_muram_init+0xd4/0x18c
[0.00] Call Trace:
[0.00] [c0537f00] [0200] 0x200 (unreliable)
[0.00] [c0537f20] [c0011068] cpm_muram_init+0xd4/0x18c
[0.00] [c0537f70] [c0494684] cpm_reset+0xb4/0xc8
[0.00] [c0537f90] [c0494c64] cmpc885_setup_arch+0x10/0x30
[0.00] [c0537fa0] [c0493cd4] setup_arch+0x130/0x168
[0.00] [c0537fb0] [c04906bc] start_kernel+0x88/0x380
[0.00] [c0537ff0] [c0002224] start_here+0x38/0x98
[0.00] Instruction dump:
[0.00] 91430010 91430014 80010014 83e1000c 7c0803a6 38210010 4e800020 7c0802a6
[0.00] 9421ffe0 bf61000c 90010024 7c7e1b78 <80630008> 7c9c2378 7cc31c30 3863001f

[0.00] ---[ end trace dc8fa200cb88537f ]---


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Fri, 05 Aug 2016 18:01:13 +0200
Arnd Bergmann  wrote:

> On Friday, August 5, 2016 10:26:25 PM CEST Nicholas Piggin wrote:
> > On Fri, 05 Aug 2016 12:17:27 +0200
> > Arnd Bergmann  wrote:  
> 
> > > and I also get link errors for the .text.fixup section
> > > for any users of __put_user() in really large kernels:
> > > net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to 
> > > fit: R_ARM_JUMP24 against `.text.batadv_log_read'  
> > 
> > This may be fixed by fixing the linker script to bring in the new
> > sections properly (see new patchset).
> > 
> > If not, then if you can combine the sections rather than have them
> > consecutive in the output, e.g.,:
> > 
> > *(.text .text.fixup)
> > 
> > Rather than
> > 
> > *(.text)
> > *(.text.fixup)
> > 
> > Then the linker has more freedom to rearrange them. I realize it's
> > not that simple with ARM's .text.fixup, but maybe that helps you
> > get it to work.  
> 
> This did the trick:
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index 0ec807d69f18..7a3ad269fa23 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -433,7 +433,7 @@
>   * during second ld run in second ld pass when generating System.map */
>  #define TEXT_TEXT\
>   ALIGN_FUNCTION();   \
> - *(.text.hot .text .text.fixup .text.unlikely)   \
> + *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
>   *(.ref.text)\
>   MEM_KEEP(init.text) \
>   MEM_KEEP(exit.text) \
> 
> 
> It also got much faster again, the link time for an allyesconfig
> kernel is now 18 minutes instead of 10 hours, but it's still
> much worse than the 2 minutes I had earlier or the four minutes
> with the previous patch.

Are you using the patches I just sent? Either way, you also need
to do the same for data and bss sections as you are using
-fdata-sections too.
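
As background for why the `.text.*` pattern (and its data/bss equivalents) matters: with -ffunction-sections GCC emits each function into its own `.text.<name>` section, and -fdata-sections does the same for variables. The sketch below mimics that by hand with section attributes. The IN_SECTION macro and all symbol names are made up for illustration.

```c
/* Place a symbol in a named ELF section; expands to nothing elsewhere. */
#if defined(__ELF__) && (defined(__GNUC__) || defined(__clang__))
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name)
#endif

/* With -ffunction-sections the compiler would pick ".text.add_one" itself. */
IN_SECTION(".text.add_one") int add_one(int x)
{
	return x + 1;
}

/* With -fdata-sections an initialised global lands in ".data.<name>". */
IN_SECTION(".data.call_count") int call_count = 0;

/* Ordinary function that touches both symbols. */
int counted_add_one(int x)
{
	call_count++;
	return add_one(x);
}
```

Compiling this to an object file and listing its sections (for example with `objdump -h`) should show `.text.add_one` and `.data.call_count` as separate input sections, which a linker script pattern such as `*(.text)` alone would not match.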

I've found virtually no build time regression on powerpc or x86
when those are taken care of properly (x86 numbers I sent are typo,
it's not 5m20, it's 5m02).

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Friday, August 5, 2016 10:26:25 PM CEST Nicholas Piggin wrote:
> On Fri, 05 Aug 2016 12:17:27 +0200
> Arnd Bergmann  wrote:

> > and I also get link errors for the .text.fixup section
> > for any users of __put_user() in really large kernels:
> > net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: 
> > R_ARM_JUMP24 against `.text.batadv_log_read'
> 
> This may be fixed by fixing the linker script to bring in the new
> sections properly (see new patchset).
> 
> If not, then if you can combine the sections rather than have them
> consecutive in the output, e.g.,:
> 
> *(.text .text.fixup)
> 
> Rather than
> 
> *(.text)
> *(.text.fixup)
> 
> Then the linker has more freedom to rearrange them. I realize it's
> not that simple with ARM's .text.fixup, but maybe that helps you
> get it to work.

This did the trick:

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 0ec807d69f18..7a3ad269fa23 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -433,7 +433,7 @@
  * during second ld run in second ld pass when generating System.map */
 #define TEXT_TEXT  \
ALIGN_FUNCTION();   \
-   *(.text.hot .text .text.fixup .text.unlikely)   \
+   *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
*(.ref.text)\
MEM_KEEP(init.text) \
MEM_KEEP(exit.text) \


It also got much faster again, the link time for an allyesconfig
kernel is now 18 minutes instead of 10 hours, but it's still
much worse than the 2 minutes I had earlier or the four minutes
with the previous patch.

Arnd


Re: [PATCH 1/7] ima: on soft reboot, restore the measurement list

2016-08-05 Thread Petko Manolov
On 16-08-05 09:34:38, Mimi Zohar wrote:
> Hi Petko,
> 
> Thank you for review!
> 
> On Fri, 2016-08-05 at 11:44 +0300, Petko Manolov wrote:
> > On 16-08-04 08:24:29, Mimi Zohar wrote:
> > > The TPM PCRs are only reset on a hard reboot.  In order to validate a
> > > TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list
> > > of the running kernel must be saved and restored on boot.  This patch
> > > restores the measurement list.
> > > 
> > > Changelog:
> > > - call ima_load_kexec_buffer() (Thiago)
> > > 
> > > Signed-off-by: Mimi Zohar 
> > > ---
> > >  security/integrity/ima/Makefile   |   1 +
> > >  security/integrity/ima/ima.h  |  10 ++
> > >  security/integrity/ima/ima_init.c |   2 +
> > >  security/integrity/ima/ima_kexec.c|  55 +++
> > >  security/integrity/ima/ima_queue.c|  10 ++
> > >  security/integrity/ima/ima_template.c | 171 
> > > ++
> > >  6 files changed, 249 insertions(+)
> > >  create mode 100644 security/integrity/ima/ima_kexec.c
> > > 
> > > diff --git a/security/integrity/ima/Makefile 
> > > b/security/integrity/ima/Makefile
> > > index c34599f..c0ce7b1 100644
> > > --- a/security/integrity/ima/Makefile
> > > +++ b/security/integrity/ima/Makefile
> > > @@ -8,4 +8,5 @@ obj-$(CONFIG_IMA) += ima.o
> > >  ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o 
> > > ima_api.o \
> > >ima_policy.o ima_template.o ima_template_lib.o ima_buffer.o
> > >  ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
> > > +ima-$(CONFIG_KEXEC_FILE) += ima_kexec.o
> > >  obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
> > > diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> > > index b5728da..84e8d36 100644
> > > --- a/security/integrity/ima/ima.h
> > > +++ b/security/integrity/ima/ima.h
> > > @@ -102,6 +102,13 @@ struct ima_queue_entry {
> > >  };
> > >  extern struct list_head ima_measurements;/* list of all 
> > > measurements */
> > >  
> > > +/* Some details preceding the binary serialized measurement list */
> > > +struct ima_kexec_hdr {
> > > + unsigned short version;
> > > + unsigned long buffer_size;
> > > + unsigned long count;
> > > +} __packed;
> > 
> > > Unless there is a real need for this structure to be packed, I suggest
> > > dropping the attribute.  When referenced through a pointer, 32-bit ARM and
> > > MIPS (and likely all other 32-bit RISC CPUs) use rather inefficient byte
> > > loads and stores.
> > 
> > Worse, if, for example, ->count is going to be read/written concurrently 
> > from multiple threads we get torn loads/stores thus losing atomicity of the 
> > access.
> 
> This header is used to prefix the serialized binary measurement list with some
> meta-data about the measurement list being restored. Unfortunately 
> kexec_get_handover_buffer() returns the segment size, not the actual ima 
> measurement list buffer size.  The header info is set using memcpy() once in 
> ima_dump_measurement_list() and then the fields are used in 
> ima_restore_measurement_list() to verify the buffer.

As long as there are no concurrent reads/writes, this should be OK.

> The binary runtime measurement list is packed, so the other two structures - 
> binary_hdr_v1 and binary_data_v1 - must be packed.  Does it make sense for 
> this header not to be packed as well?  Would copying the header fields to 
> local variables before being used solve your concern?

Copying to aligned variables would be necessary only if:

a) some sort of atomicity is needed, and/or
b) speed is a concern;

> Remember this code is used once on the kexec execute and again on reboot.

If we don't need a) _and_ b) then you don't need to bother.
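
For reference, a small userspace sketch of what "copying to aligned variables" looks like. The struct names are illustrative (only the field list comes from the quoted ima_kexec_hdr), and hdr_count() is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* Same fields as the quoted ima_kexec_hdr, in packed and natural layouts. */
struct hdr_packed {
	unsigned short version;
	unsigned long buffer_size;
	unsigned long count;
} __attribute__((packed));

struct hdr_natural {
	unsigned short version;
	unsigned long buffer_size;
	unsigned long count;
};

/* Copy a possibly misaligned field into an aligned local before using it. */
static unsigned long hdr_count(const unsigned char *buf)
{
	unsigned long count;

	memcpy(&count, buf + offsetof(struct hdr_packed, count), sizeof(count));
	return count;
}
```

In the packed layout buffer_size starts right after the two-byte version field, so it is misaligned for its type. The memcpy() into a local gives later code a naturally aligned copy to work with.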


Petko


Re: [pasemi] Internal CompactFlash (CF) card device not recognised after the powerpc-4.8-1 merge

2016-08-05 Thread Nicholas Piggin
On Fri, 5 Aug 2016 16:38:17 +0200
Christian Zigotzky  wrote:

> Hi All,


Hi Christian,

Firstly, thanks for the report. As a suggestion, it can be
better to reduce the CC list unless you have found a specific
commit that causes the problem. powerpc developers read this
list so you can send initial bug reports there.

Have you been able to bisect the exact commit which caused the
regression? If it's possible, that is the fastest way to get a
response to your report.

As for your driver support, it would indeed be a good idea to
get it supported in the upstream kernel. You should post a
new mail about that. Take a look at these 3 commits:

61f7162117d4767875825abf2f6ed1eeebbcceed
9cd55be4d22376893d2818ce3c0e5706a3d74121
ca99140a63b7326ee9a38f64c326317f2c63b594

Your patch comes from code based on the second one. The last
commit removed it, and says that it is not the best way to
implement it. You could cc this list and some of the people
involved with those commits and ask for advice about
getting your driver supported.

Thanks,
Nick


> I have found some information about the electra IDE CF card device: 
> http://linuxppc-dev.ozlabs.narkive.com/kxQRFqGe/patch-pasemi-electra-ide-pata-platform-glue
> 
> I checked the kernel config and the merge but without any success.
> 
> Cheers,
> 
> Christian
> 
> On 05 August 2016 at 1:41 PM, Christian Zigotzky wrote:
> > Hi All,
> >
> > The internal PASEMI CompactFlash (CF) card device doesn't work anymore 
> > after the powerpc-4.8-1 merge. That means the code for the internal CF 
> > card device in the Nemo patch doesn't work after the first PowerPC 
> > merge. The CompactFlash (CF) card slot is wired to the CPU local bus. 
> > It is typically used to hold the Linux kernel. I know it isn't ideal to 
> > rely on our own patch for that, but I think it is a good time to integrate 
> > the PASEMI internal CompactFlash (CF) card device to the official 
> > kernel. What do you think? I am not a programmer so I can't integrate 
> > the source code for the internal CF card device. But maybe you can 
> > take the patch and integrate it.
> >
> > We use the following patch for the kernel 4.7:


[pasemi] Problem with the PA Semi PWRficient Gigabit Ethernet

2016-08-05 Thread Christian Zigotzky

Hi All,

It seems there is a problem with the PA Semi PWRficient Gigabit 
Ethernet. It tries very often to connect to the network but there isn't 
a network cable plugged in. There are two new commits about the PA Semi 
PWRficient Gigabit Ethernet in the Git source code.


- 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6cf285de0231e53057726aea1fb87ab772765cb7
- 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=80721e7fa68f914a21b2685a01399b3aa80ac116


Cheers,

Christian


[pasemi] Internal CompactFlash (CF) card device not recognised after the powerpc-4.8-1 merge

2016-08-05 Thread Christian Zigotzky

Hi All,

I have found some information about the electra IDE CF card device: 
http://linuxppc-dev.ozlabs.narkive.com/kxQRFqGe/patch-pasemi-electra-ide-pata-platform-glue


I checked the kernel config and the merge but without any success.

Cheers,

Christian

On 05 August 2016 at 1:41 PM, Christian Zigotzky wrote:

Hi All,

The internal PASEMI CompactFlash (CF) card device doesn't work anymore 
after the powerpc-4.8-1 merge. That means the code for the internal CF 
card device in the Nemo patch doesn't work after the first PowerPC 
merge. The CompactFlash (CF) card slot is wired to the CPU local bus. 
It is typically used to hold the Linux kernel. I know it isn't ideal to 
rely on our own patch for that, but I think it is a good time to integrate 
the PASEMI internal CompactFlash (CF) card device to the official 
kernel. What do you think? I am not a programmer so I can't integrate 
the source code for the internal CF card device. But maybe you can 
take the patch and integrate it.


We use the following patch for the kernel 4.7:

diff -rupN a/drivers/ata/pata_of_platform.c b/drivers/ata/pata_of_platform.c
--- a/drivers/ata/pata_of_platform.c   2016-08-05 09:58:41.410569036 +0200
+++ b/drivers/ata/pata_of_platform.c   2016-08-05 09:59:54.41424 +0200

@@ -41,14 +41,36 @@ static int pata_of_platform_probe(struct
   return -EINVAL;
}

-   ret = of_address_to_resource(dn, 1, &ctl_res);
-   if (ret) {
-  dev_err(&ofdev->dev, "can't get CTL address from "
- "device tree\n");
-  return -EINVAL;
+   if (of_device_is_compatible(dn, "electra-ide")) {
+  /* Altstatus is really at offset 0x3f6 from the primary window
+   * on electra-ide. Adjust ctl_res and io_res accordingly.
+   */
+  ctl_res = io_res;
+  ctl_res.start = ctl_res.start+0x3f6;
+  io_res.end = ctl_res.start-1;
+
+#ifdef CONFIG_PPC_PASEMI_SB600
+   } else if (of_device_is_compatible(dn, "electra-cf")) {
+   /* Task regs are at 0x800, with alt status @ 0x80e in the primary window
+* on electra-cf. Adjust ctl_res and io_res accordingly.
+*/
+   ctl_res = io_res;
+   io_res.start += 0x800;
+   ctl_res.start = ctl_res.start + 0x80e;
+   io_res.end = ctl_res.start-1;
+#endif
+   } else {
+  ret = of_address_to_resource(dn, 1, &ctl_res);
+  if (ret) {
+ dev_err(&ofdev->dev, "can't get CTL address from "
+"device tree\n");
+ return -EINVAL;
+  }
}

irq_res = platform_get_resource(ofdev, IORESOURCE_IRQ, 0);
+   if (irq_res)
+  irq_res->flags = 0;

prop = of_get_property(dn, "reg-shift", NULL);
if (prop)
@@ -65,6 +87,11 @@ static int pata_of_platform_probe(struct
   dev_info(&ofdev->dev, "pio-mode unspecified, assuming 
PIO0\n");

}

+#ifdef CONFIG_PPC_PASEMI_SB600
+   irq_res = 0; // force irq off (doesn't seem to work)
+#endif
+
+
pio_mask = 1 << pio_mode;
pio_mask |= (1 << pio_mode) - 1;

@@ -74,7 +101,11 @@ static int pata_of_platform_probe(struct

 static struct of_device_id pata_of_platform_match[] = {
{ .compatible = "ata-generic", },
-   { },
+   { .compatible = "electra-ide", },
+#ifdef CONFIG_PPC_PASEMI_SB600
+   { .compatible = "electra-cf",},
+#endif
+   {},
 };
 MODULE_DEVICE_TABLE(of, pata_of_platform_match);

dmesg with the kernel 4.7:

zcat /var/log/dmesg.1.gz | grep -i ata7

[2.939788] ata7: PATA max PIO0 no IRQ, using PIO polling mmio cmd 0xf800 ctl 0xf80e

[3.099186] ata7.00: CFA: SanDisk SDCFB-256, HDX 2.33, max PIO4
[3.099191] ata7.00: 501760 sectors, multi 0: LBA
[3.099199] ata7.00: configured for PIO

The dmesg of the latest Git kernel doesn't have any output of our 
internal CF card device.


Could you please integrate our PASEMI CF card device again?

Thanks,

Christian






Re: [PATCH 3/5] kbuild: add arch specific post-module-link pass

2016-08-05 Thread Nicholas Piggin
On Fri,  5 Aug 2016 22:12:01 +1000
Nicholas Piggin  wrote:

> Add an option for architectures to pass over modules after they are
> linked. powerpc will use this to fix up alternate instruction patch
> relocations.

For that matter, now I think about it, I'd like to have this generic
postmod pass for the vmlinux as well. And it would be  to call into
the arch Makefile rather than just supply a tool.

Currently powerpc deals with it by adding dependencies on its zImage
target, but it would be really nice to be able to fix that while we're
here too. Is that going to work?

Thanks,
Nick


[PATCH] powernv: Load correct TOC pointer while waking up from winkle.

2016-08-05 Thread Mahesh J Salgaonkar
From: Mahesh Salgaonkar 

The function pnv_restore_hyp_resource() loads the TOC into r2 from
an invalid PACA pointer before fixing the r13 value. This does not affect
POWER ISA 3.0, but on POWER ISA 2.07 or earlier it leaves the CPU
stuck forever.

login: [  471.830433] Processor 120 is stuck.


This is easily reproducible using the following steps:
- Turn off SMT
$ ppc64_cpu --smt=off
- offline/online any online cpu (Thread 0 of any core which is online)
$ echo 0 > /sys/devices/system/cpu/cpu/online
$ echo 1 > /sys/devices/system/cpu/cpu/online

For POWER ISA 2.07 or earlier, the last bit of HSPRG0 is set to indicate
that the thread is waking up from winkle. Hence, the last bit of HSPRG0 (r13)
must be cleared before using it as the PACA pointer, to avoid loading invalid
values through an invalid PACA pointer.

Fix this by loading the TOC after the r13 register is corrected.

Signed-off-by: Mahesh Salgaonkar 
---
 arch/powerpc/kernel/idle_book3s.S |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/idle_book3s.S 
b/arch/powerpc/kernel/idle_book3s.S
index 8a56a51..45784ec 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -363,8 +363,8 @@ _GLOBAL(power9_idle_stop)
  * cr3 - set to gt if waking up with partial/complete hypervisor state loss
  */
 _GLOBAL(pnv_restore_hyp_resource)
-   ld  r2,PACATOC(r13);
 BEGIN_FTR_SECTION
+   ld  r2,PACATOC(r13);
/*
 * POWER ISA 3. Use PSSCR to determine if we
 * are waking up from deep idle state
@@ -395,6 +395,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 */
clrldi  r5,r13,63
clrrdi  r13,r13,1
+
+   /* Now that we are sure r13 is corrected, load TOC */
+   ld  r2,PACATOC(r13);
cmpwi   cr4,r5,1
mtspr   SPRN_HSPRG0,r13
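
For readers less familiar with the rotate/clear mnemonics, the two instructions that split the winkle flag from the PACA pointer can be sketched in C as below. The function names are hypothetical, and the bit meaning is taken from the commit message above.

```c
#include <stdint.h>

/* clrldi r5,r13,63: clear the upper 63 bits, leaving only the low flag bit. */
static inline int winkle_flag(uint64_t hsprg0)
{
	return (int)(hsprg0 & UINT64_C(1));
}

/* clrrdi r13,r13,1: clear the low bit, leaving a usable PACA pointer. */
static inline uint64_t paca_pointer(uint64_t hsprg0)
{
	return hsprg0 & ~UINT64_C(1);
}
```

Any load that uses the value as a base address, such as the PACATOC load, only makes sense on the result of the second operation. That is the ordering the patch enforces.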
 



Re: [PATCH 1/7] ima: on soft reboot, restore the measurement list

2016-08-05 Thread Mimi Zohar
Hi Petko,

Thank you for review!

On Fri, 2016-08-05 at 11:44 +0300, Petko Manolov wrote:
> On 16-08-04 08:24:29, Mimi Zohar wrote:
> > The TPM PCRs are only reset on a hard reboot.  In order to validate a
> > TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list
> > of the running kernel must be saved and restored on boot.  This patch
> > restores the measurement list.
> > 
> > Changelog:
> > - call ima_load_kexec_buffer() (Thiago)
> > 
> > Signed-off-by: Mimi Zohar 
> > ---
> >  security/integrity/ima/Makefile   |   1 +
> >  security/integrity/ima/ima.h  |  10 ++
> >  security/integrity/ima/ima_init.c |   2 +
> >  security/integrity/ima/ima_kexec.c|  55 +++
> >  security/integrity/ima/ima_queue.c|  10 ++
> >  security/integrity/ima/ima_template.c | 171 
> > ++
> >  6 files changed, 249 insertions(+)
> >  create mode 100644 security/integrity/ima/ima_kexec.c
> > 
> > diff --git a/security/integrity/ima/Makefile 
> > b/security/integrity/ima/Makefile
> > index c34599f..c0ce7b1 100644
> > --- a/security/integrity/ima/Makefile
> > +++ b/security/integrity/ima/Makefile
> > @@ -8,4 +8,5 @@ obj-$(CONFIG_IMA) += ima.o
> >  ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o 
> > \
> >  ima_policy.o ima_template.o ima_template_lib.o ima_buffer.o
> >  ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
> > +ima-$(CONFIG_KEXEC_FILE) += ima_kexec.o
> >  obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
> > diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> > index b5728da..84e8d36 100644
> > --- a/security/integrity/ima/ima.h
> > +++ b/security/integrity/ima/ima.h
> > @@ -102,6 +102,13 @@ struct ima_queue_entry {
> >  };
> >  extern struct list_head ima_measurements;  /* list of all measurements */
> >  
> > +/* Some details preceding the binary serialized measurement list */
> > +struct ima_kexec_hdr {
> > +   unsigned short version;
> > +   unsigned long buffer_size;
> > +   unsigned long count;
> > +} __packed;
> 
> Unless there is a real need for this structure to be packed, I suggest
> dropping the attribute.  When referenced through a pointer, 32bit ARM and
> MIPS (and likely all other 32bit RISC CPUs) use rather inefficient byte
> loads and stores.
> 
> Worse, if, for example, ->count is going to be read/written concurrently from 
> multiple threads we get torn loads/stores thus losing atomicity of the access.

This header is used to prefix the serialized binary measurement list
with some meta-data about the measurement list being restored.
Unfortunately kexec_get_handover_buffer() returns the segment size, not
the actual ima measurement list buffer size.  The header info is set
using memcpy() once in ima_dump_measurement_list() and then the fields
are used in ima_restore_measurement_list() to verify the buffer.

The binary runtime measurement list is packed, so the other two
structures - binary_hdr_v1 and binary_data_v1 - must be packed.  Does it
make sense for this header not to be packed as well?  Would copying the
header fields to local variables before being used solve your concern?

Remember this code is used once on the kexec execute and again on
reboot.

Mimi
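
A minimal user-space sketch of the "copy the header fields to local
variables" idea raised above. The struct name kexec_hdr_sketch, the
fixed-width field types, and the helper read_hdr() are assumptions made
for illustration, not the code from the patch. It only shows that one
memcpy() out of the serialized buffer lets all later accesses go to an
ordinary local copy rather than through a pointer into packed data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the header under discussion. Fixed-width
 * types are an assumption so the layout is identical on every host.
 */
struct kexec_hdr_sketch {
	uint16_t version;
	uint64_t buffer_size;
	uint64_t count;
} __attribute__((packed));

/*
 * Copy the header out of a (possibly unaligned) byte buffer once and
 * hand the fields back as plain, naturally aligned values.
 * Returns 0 on success, -1 if the buffer is too short.
 */
static int read_hdr(const unsigned char *buf, size_t len,
		    uint16_t *version, uint64_t *size, uint64_t *count)
{
	struct kexec_hdr_sketch hdr;

	if (len < sizeof(hdr))
		return -1;

	memcpy(&hdr, buf, sizeof(hdr));
	*version = hdr.version;
	*size = hdr.buffer_size;
	*count = hdr.count;
	return 0;
}
```

With this shape the packed layout only matters at the serialization
boundary; whether that is enough to address the concurrency concern
depends on how the fields are used afterwards.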



Re: [RFC][PATCH 0/5] kbuild changes, thin archives, --gc-sections

2016-08-05 Thread Nicholas Piggin
On Fri,  5 Aug 2016 22:11:58 +1000
Nicholas Piggin  wrote:

> Hello,
> 
> I have 3 different things in this patchset. All arch specific, but all
> involve kbuild changes, so I'd like to discuss them with kbuild
> maintainers. The goal has been to improve long standing linking
> difficulties with the powerpc kernel.

Here's a 30 second hack of an x86 patch. It seems to build and
boot defconfig in a really quick kvm test.

For x86-64 machine building x86 target, defconfig,

make -j8 vmlinux time:
        orig         thinarc      thinarc+dce
real     4m58.865s    4m59.747s    5m20.028s
user    15m14.428s   15m13.868s   15m17.012s
sys      0m57.296s    0m55.904s    0m58.416s

build output directory size:
        orig    thinarc   thinarc+dce
        317M    257M      285M

vmlinux size:
    text     data      bss       dec   filename
10192338  4360136  1105920  15658394   vmlinux
10186739  4356040  1105920  15648699   vmlinux.thinarc
 9580486  3759880  1011712  14352078   vmlinux.thinarc+dce

Thanks,
Nick


diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0a7b885..845069e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -51,6 +51,8 @@ config X86
select ARCH_WANT_IPC_PARSE_VERSION  if X86_32
select ARCH_WANT_OPTIONAL_GPIOLIB
select BUILDTIME_EXTABLE_SORT
+   select THIN_ARCHIVES
+   select LINKER_DCE
select CLKEVT_I8253
select CLKSRC_I8253 if X86_32
select CLOCKSOURCE_VALIDATE_LAST_CYCLE
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9297a00..7395dd8 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -92,7 +92,7 @@ SECTIONS
.text :  AT(ADDR(.text) - LOAD_OFFSET) {
_text = .;
/* bootstrapping code */
-   HEAD_TEXT
+   KEEP(HEAD_TEXT)
. = ALIGN(8);
_stext = .;
TEXT_TEXT
@@ -321,7 +321,7 @@ SECTIONS
.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
__bss_start = .;
*(.bss..page_aligned)
-   *(.bss)
+   *(.bss .bss.*)
. = ALIGN(PAGE_SIZE);
__bss_stop = .;
}


Re: [PATCH 2/2] powerpc/pseries: Dynamically increase RMA size

2016-08-05 Thread kbuild test robot
Hi Sukadev,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.7 next-20160805]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Sukadev-Bhattiprolu/powerpc-pseries-Use-a-helper-to-fixup-nr_cores/20160805-141813
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-kmeter1_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/prom_init.c: In function 'make_room':
>> arch/powerpc/kernel/prom_init.c:2113:4: error: implicit declaration of function 'increase_rma_size' [-Werror=implicit-function-declaration]
   increase_rma_size();
   ^
   cc1: all warnings being treated as errors

vim +/increase_rma_size +2113 arch/powerpc/kernel/prom_init.c

  2107  prom_debug("Chunk exhausted, claiming more at %x...\n",
  2108 alloc_bottom);
  2109  room = alloc_top - alloc_bottom;
  2110  if (room > DEVTREE_CHUNK_SIZE)
  2111  room = DEVTREE_CHUNK_SIZE;
  2112  if (room < PAGE_SIZE) {
> 2113  increase_rma_size();
  2114  prom_panic("No memory for flatten_device_tree "
  2115 "(no room)\n");
  2116  }

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




[pasemi] Radeon HD graphics card not recognised after the powerpc-4.8-1 commit

2016-08-05 Thread Christian Zigotzky

Michael,

Thanks a lot for the hints! I will use your commands. I am still 
learning Linux. :-)


Cheers,

Christian

On 05 August 2016 at 12:59 PM, Michael Ellerman wrote:

Christian Zigotzky  writes:


Hi Michael,

Thanks a million for your patch! :-)

No worries :)


@All
Keep your fingers crossed!

1. git clone
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a

Normally that would be:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux

And this:


2. patch -p0 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch

Would be:

$ cd linux
$ patch -p1 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch


3. patch -p0 < nemo_4.8-3.patch

4. yes "" | make oldconfig

And that can be done with 'make olddefconfig'.

cheers





RE: [v5.1] ucc_fast: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.

2016-08-05 Thread David Laight
From: Arvind Yadav
> Sent: 04 August 2016 17:53
> IS_ERR_VALUE() assumes that parameter is an unsigned long.
> It cannot be used to check if 'unsigned int' is passed instead.
> Which tends to reflect an error.
> In 64bit architectures sizeof (int) == 4 && sizeof (long) == 8.
> IS_ERR_VALUE(x) is ((x) >= (unsigned long)-4095).
> IS_ERR_VALUE() of 'unsigned int' is always false because the 32bit
> value is zero extended to 64 bits.
> 
> Now Problem In UCC fast protocols -: drivers/soc/fsl/qe/ucc_fast.c
> 
> /* Allocate memory for Tx Virtual Fifo */
> uccf->ucc_fast_tx_virtual_fifo_base_offset =
>   qe_muram_alloc(uf_info->utfs, UCC_FAST_VIRT_FIFO_REGS_ALIGNMENT);
> if (IS_ERR_VALUE(uccf->ucc_fast_tx_virtual_fifo_base_offset)) {
> printk(KERN_ERR "%s: cannot allocate MURAM for TX FIFO\n",
> __func__);
> uccf->ucc_fast_tx_virtual_fifo_base_offset = 0;
> ucc_fast_free(uccf);
> return -ENOMEM;
> }
> 
> /* Allocate memory for Rx Virtual Fifo */
> uccf->ucc_fast_rx_virtual_fifo_base_offset =
>qe_muram_alloc(uf_info->urfs +
>UCC_FAST_RECEIVE_VIRTUAL_FIFO_SIZE_FUDGE_FACTOR,
>UCC_FAST_VIRT_FIFO_REGS_ALIGNMENT);
> if (IS_ERR_VALUE(uccf->ucc_fast_rx_virtual_fifo_base_offset)) {
> printk(KERN_ERR "%s: cannot allocate MURAM for RX FIFO\n",
> __func__);
> uccf->ucc_fast_rx_virtual_fifo_base_offset = 0;
> ucc_fast_free(uccf);
> return -ENOMEM;
> }
> 
> qe_muram_alloc (a.k.a. cpm_muram_alloc) returns unsigned long.
> Return value store in a u32 (ucc_fast_tx_virtual_fifo_base_offset
> and ucc_fast_rx_virtual_fifo_base_offset).If qe_muram_alloc will
> return any error, Then IS_ERR_VALUE will always return 0. it'll not
> call ucc_fast_free for any failure. Inside 'if code' will be a dead
> code on 64bit.
> This patch is to avoid this problem on 64bit machine.

That is really far too wordy for a commit message.

My suspicion is that qe_muram_alloc() always returns a value that is much
less than 2^32 - even though the return type is 'long'.

Looking further all this code is a bag of worms.

The 'fail' return value from qe_muram_alloc() (aka cpm_muram_alloc()) is
never returned to an outer level.
It might be better to return a constant CPM_MURAL_ALLOC_FAIL (say 0x7fff)
and have the callers check that (via a #define).

That is only the start of the problems...

It looks very likely that cpm_muram_free() will be called in tidy up paths
when cpm_muram_alloc() either failed, or hasn't been called.
Since 0 is a valid return value, and there is no check for -ENOMEM it is
all an 'accident waiting to happen'.

From my quick scan (grep -B2 -A6) I'm not at all sure most of the error paths
at best leak memory.

David
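
The truncation problem described in the quoted commit message can be
reproduced in user space. This is a sketch under stated assumptions: the
helper names are hypothetical, SKETCH_MAX_ERRNO mirrors the 4095 in the
quoted definition, and uint64_t stands in for a 64-bit 'unsigned long' so
the result does not depend on the host.

```c
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Models the quoted IS_ERR_VALUE(x): (x) >= (unsigned long)-4095 */
static int is_err_value64(uint64_t x)
{
	return x >= (uint64_t)-SKETCH_MAX_ERRNO;
}

/* Error code kept at full width: the check works. */
static int check_via_u64(int64_t ret)
{
	return is_err_value64((uint64_t)ret);
}

/*
 * Error code stored in a 32-bit field first, as with the u32 offset
 * members, then zero-extended for the comparison: the upper 32 bits
 * are lost, so the value can never reach the error range.
 */
static int check_via_u32(int64_t ret)
{
	uint32_t stored = (uint32_t)ret;

	return is_err_value64((uint64_t)stored);
}
```

For an error such as -12 (ENOMEM on Linux), check_via_u64() reports an
error while check_via_u32() does not, which is why the 'if' body is dead
code on 64-bit builds.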



Re: [PATCH 10/11] soc: ti: knav_qmss_queue: use of_property_read_bool

2016-08-05 Thread Julia Lawall


On Fri, 5 Aug 2016, Robin Murphy wrote:

> Hi Julia,
>
> On 05/08/16 09:56, Julia Lawall wrote:
> > Use of_property_read_bool to check for the existence of a property.
>
> This caught my eye since Rob told me off for doing the same recently[1].
>
> > The semantic patch that makes this change is as follows:
> > (http://coccinelle.lip6.fr/)
> >
> > // 
> > @@
> > expression e1,e2;
> > statement S2,S1;
> > @@
> > -   if (of_get_property(e1,e2,NULL))
> > +   if (of_property_read_bool(e1,e2))
> > S1 else S2
> > // 
> >
> > Signed-off-by: Julia Lawall 
> >
> > ---
> >  drivers/soc/ti/knav_qmss_queue.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/soc/ti/knav_qmss_queue.c 
> > b/drivers/soc/ti/knav_qmss_queue.c
> > index b73e353..56b5d7c 100644
> > --- a/drivers/soc/ti/knav_qmss_queue.c
> > +++ b/drivers/soc/ti/knav_qmss_queue.c
> > @@ -1240,7 +1240,7 @@ static int knav_setup_queue_range(struct knav_device 
> > *kdev,
> > if (of_get_property(node, "qalloc-by-id", NULL))
>
> According to the binding, "qalloc-by-id" _is_ a boolean property, so
> this one really does deserve to be of_property_read_bool()...
>
> > range->flags |= RANGE_RESERVED;
> >
> > -   if (of_get_property(node, "accumulator", NULL)) {
> > +   if (of_property_read_bool(node, "accumulator")) {
>
> ...whereas "accumulator" must have a value, so this isn't technically
> appropriate. In general, most of these "if the property exists, read the
> property and do stuff" checks are probably a sign of code that could be
> simplified by refactoring the "do stuff" step to just specifically
> handle the "read the property" step returning -EINVAL when it's not present.

Thanks for the very helpful feedback.  I will rethink the patch set in
light of this information.

julia

> Robin.
>
> [1]:https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg13375.html
>
> > ret = knav_init_acc_range(kdev, node, range);
> > if (ret < 0) {
> > devm_kfree(dev, range);
> >


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Fri, 05 Aug 2016 12:17:27 +0200
Arnd Bergmann  wrote:

> On Friday, August 5, 2016 6:41:08 PM CEST Nicholas Piggin wrote:
> > On Thu, 4 Aug 2016 12:06:41 -0500
> > Segher Boessenkool  wrote:
> >   
> > > On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:  
> > > > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > > > 
> > > > > + __used  \
> > > > > + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), 
> > > > > used)) \
> > > > 
> > > > 
> > > > I've just started testing this, but the first problem I ran into
> > > > is that @ and # are special characters that have an architecture
> > > > specific meaning to the assembler. On ARM, you need "%note @" instead
> > > > of "@note #".
> > > 
> > > That comment trick (I still feel guilty about it) causes more problems
> > > than it solves.  Please don't try to use it :-)  
> > 
> > Yeah that's a funny hack. I don't think it's required though, but I'm just
> > running through some more tests.
> > 
> > I think I found an improvement with the thin archives as well -- we were
> > still building symbol table after removing the s option (that only avoids
> > index). "S" is required to not build symbol table.
> > 
> > I'll send out an RFC on a slightly more polished patch series shortly.  
> 
> 
> I could not find Nico's patches, but based on the information in his
> presentation at
> 
> https://www.linuxplumbersconf.org/2015/ocw//system/presentations/3369/original/slides.html#(1)
> 
> I created a patch for ARM that mirrors what you have for powerpc, see
> below.

Great, thanks for jumping in. I posted another set, which is much improved,
that you should pick up.


> I have successfully built normal-sized kernels with this (not tried
> running them). Unfortunately, the build time for "allyesconfig"
> kernel explodes, the final link time is now in the hours instead of
> minutes (no exact numbers unfortunately, it takes too long to
> reproduce),

That's because we need to coalesce the new sections properly into the
output file. binutils does not cope with a vast number of sections in the
final linked file; it spends all its time in hash lookups and then usually
explodes.


> and I also get link errors for the .text.fixup section
> for any users of __put_user() in really large kernels:
> net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: 
> R_ARM_JUMP24 against `.text.batadv_log_read'

This may be fixed by fixing the linker script to bring in the new
sections properly (see new patchset).

If not, then if you can combine the sections rather than have them
consecutive in the output, e.g.,:

*(.text .text.fixup)

Rather than

*(.text)
*(.text.fixup)

Then the linker has more freedom to rearrange them. I realize it's
not that simple with ARM's .text.fixup, but maybe that helps you
get it to work.

Thanks,
Nick


[PATCH 5/5] powerpc/64: use linker dce

2016-08-05 Thread Nicholas Piggin
---
 arch/powerpc/kernel/Makefile   | 3 +++
 arch/powerpc/kernel/vmlinux.lds.S  | 2 +-
 arch/powerpc/platforms/Kconfig.cputype | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2da380f..b356e59 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -4,7 +4,10 @@
 
 CFLAGS_ptrace.o+= -DUTS_MACHINE='"$(UTS_MACHINE)"'
 
+ccflags-y  += -fno-function-sections -fno-data-sections
+
 subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
+subdir-ccflags-y   += -fno-function-sections -fno-data-sections
 
 ifeq ($(CONFIG_PPC64),y)
 CFLAGS_prom_init.o += $(NO_MINIMAL_TOC)
diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
b/arch/powerpc/kernel/vmlinux.lds.S
index 2dd91f7..c157b8d 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -50,7 +50,7 @@ SECTIONS
HEAD_TEXT
_text = .;
/* careful! __ftr_alt_* sections need to be close to .text */
-   *(.text .fixup __ftr_alt_* .ref.text)
+   *(.text .text.* .fixup __ftr_alt_* .ref.text)
SCHED_TEXT
LOCK_TEXT
KPROBES_TEXT
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 3c77091..6afeb9d 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -3,6 +3,7 @@ config PPC64
default n
select HAVE_VIRT_CPU_ACCOUNTING
select THIN_ARCHIVES
+   select LINKER_DCE
select ZLIB_DEFLATE
help
  This option selects whether a 32-bit or a 64-bit kernel
-- 
2.8.1



[PATCH 4/5] powerpc: switch to using thin archives

2016-08-05 Thread Nicholas Piggin
From: Stephen Rothwell 

Some change to the way we invoke ar is required so it can be used
by scripts/link-vmlinux.sh

Signed-off-by: Stephen Rothwell 
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Makefile  | 6 --
 arch/powerpc/platforms/Kconfig.cputype | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 709a22a..160837c 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -23,7 +23,8 @@ CROSS32AR := $(CROSS32_COMPILE)ar
 ifeq ($(HAS_BIARCH),y)
 ifeq ($(CROSS32_COMPILE),)
 CROSS32CC  := $(CC) -m32
-CROSS32AR  := GNUTARGET=elf32-powerpc $(AR)
+CROSS32AR  := $(AR)
+KBUILD_ARFLAGS += --target elf32-powerpc
 endif
 endif
 
@@ -93,7 +94,8 @@ ifeq ($(HAS_BIARCH),y)
 override AS+= -a$(CONFIG_WORD_SIZE)
 override LD+= -m elf$(CONFIG_WORD_SIZE)$(LDEMULATION)
 override CC+= -m$(CONFIG_WORD_SIZE)
-override AR:= GNUTARGET=elf$(CONFIG_WORD_SIZE)-$(GNUTARGET) $(AR)
+override AR:= $(AR)
+KBUILD_ARFLAGS += --target elf$(CONFIG_WORD_SIZE)-$(GNUTARGET)
 endif
 
 LDFLAGS_vmlinux-y := -Bstatic
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 77e9b8d..3c77091 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -2,6 +2,7 @@ config PPC64
bool "64-bit kernel"
default n
select HAVE_VIRT_CPU_ACCOUNTING
+   select THIN_ARCHIVES
select ZLIB_DEFLATE
help
  This option selects whether a 32-bit or a 64-bit kernel
-- 
2.8.1



[PATCH 3/5] kbuild: add arch specific post-module-link pass

2016-08-05 Thread Nicholas Piggin
Add an option for architectures to pass over modules after they are
linked. powerpc will use this to fix up alternate instruction patch
relocations.

Signed-off-by: Nicholas Piggin 
---
 Documentation/kbuild/makefiles.txt | 6 ++
 Makefile   | 1 +
 scripts/Makefile.modpost   | 8 
 3 files changed, 15 insertions(+)

diff --git a/Documentation/kbuild/makefiles.txt 
b/Documentation/kbuild/makefiles.txt
index 13f888a..f6c065b 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -952,6 +952,12 @@ When kbuild executes, the following steps are followed 
(roughly):
$(KBUILD_ARFLAGS) set by the top level Makefile to "D" (deterministic
mode) if this option is supported by $(AR).
 
+KBUILD_MODPOST_TOOL   Arch-specific command to run after module link
+
+$(KBUILD_MODPOST_TOOL) is used to add an arch-specific pass over
+modules after their final link. E.g., powerpc uses this to adjust
+relative branches of "alternate code patching" sections.
+
 ARCH_CPPFLAGS, ARCH_AFLAGS, ARCH_CFLAGS   Overrides the kbuild defaults
 
These variables are appended to the KBUILD_CPPFLAGS,
diff --git a/Makefile b/Makefile
index d5ef31a..99ab8eb 100644
--- a/Makefile
+++ b/Makefile
@@ -421,6 +421,7 @@ export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
 export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
 export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
 export KBUILD_ARFLAGS
+export KBUILD_MODPOST_TOOL
 
 # When compiling out-of-tree modules, put MODVERDIR in the module
 # tree rather than in the kernel tree. The kernel tree might
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 1366a94..19f8481 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -121,8 +121,16 @@ quiet_cmd_ld_ko_o = LD [M]  $@
  $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \
  -o $@ $(filter-out FORCE,$^)
 
+ifdef KBUILD_MODPOST_TOOL
+quiet_cmd_arch_modpost = ARCH$@
+  cmd_arch_modpost = $(KBUILD_MODPOST_TOOL) $@
+endif
+
 $(modules): %.ko :%.o %.mod.o FORCE
$(call if_changed,ld_ko_o)
+ifdef KBUILD_MODPOST_TOOL
+   $(call if_changed,arch_modpost)
+endif
 
 targets += $(modules)
 
-- 
2.8.1



[PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination

2016-08-05 Thread Nicholas Piggin
Introduce LINKER_DCE option for architectures to select if they want
to build with -ffunction-sections, -fdata-sections, and link with
--gc-sections. It requires some work (documented) to ensure all
unreferenced entrypoints are live, and requires toolchain and
build verification, so it is made a per-arch option for now.

On a random powerpc64le build, this yields a significant size saving.
It boots and runs fine, but there is a lot I haven't tested yet,
so these savings may be reduced if there are bugs in the link.

    text      data      bss       dec   filename
11169741   1180744  1923176  14273661   vmlinux
10445269   1004127  1919707  13369103   vmlinux.dce

~700K text, ~170K data, 6% removed from kernel image size.
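
The summary line can be checked against the size(1) figures. The sketch
below only redoes that arithmetic; the struct and function names are
hypothetical, and the numbers are copied from the table in this message.

```c
/* One row of size(1) output, in bytes. */
struct size_row {
	unsigned long text, data, bss;
};

/* Figures copied from the table in this message. */
static const struct size_row before = { 11169741UL, 1180744UL, 1923176UL };
static const struct size_row after  = { 10445269UL, 1004127UL, 1919707UL };

/* The "dec" column: text + data + bss. */
static unsigned long total(const struct size_row *r)
{
	return r->text + r->data + r->bss;
}

/* Percentage of the original image that was removed. */
static double saved_percent(const struct size_row *b,
			    const struct size_row *a)
{
	return 100.0 * (double)(total(b) - total(a)) / (double)total(b);
}
```

The text delta comes to 724472 bytes and the data delta to 176617 bytes,
which matches the rounded "~700K" and "~170K" above.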

Signed-off-by: Nicholas Piggin 
---
 Makefile  | 10 
 arch/Kconfig  | 13 ++
 include/asm-generic/vmlinux.lds.h | 52 ++-
 include/linux/compiler.h  | 18 ++
 include/linux/export.h| 30 +++---
 include/linux/init.h  | 38 ++--
 init/Makefile |  2 ++
 7 files changed, 100 insertions(+), 63 deletions(-)

diff --git a/Makefile b/Makefile
index b409076..d5ef31a 100644
--- a/Makefile
+++ b/Makefile
@@ -618,6 +618,11 @@ include arch/$(SRCARCH)/Makefile
 
 KBUILD_CFLAGS  += $(call cc-option,-fno-delete-null-pointer-checks,)
 
+ifdef CONFIG_LINKER_DCE
+KBUILD_CFLAGS  += $(call cc-option,-ffunction-sections,)
+KBUILD_CFLAGS  += $(call cc-option,-fdata-sections,)
+endif
+
 ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
 KBUILD_CFLAGS  += -Os $(call cc-disable-warning,maybe-uninitialized,)
 else
@@ -819,6 +824,11 @@ LDFLAGS_BUILD_ID = $(patsubst -Wl$(comma)%,%,\
 KBUILD_LDFLAGS_MODULE += $(LDFLAGS_BUILD_ID)
 LDFLAGS_vmlinux += $(LDFLAGS_BUILD_ID)
 
+ifdef CONFIG_LINKER_DCE
+# LDFLAGS_MODULE   += $(call ld-option, --gc-sections,)
+LDFLAGS_vmlinux+= $(call ld-option, --gc-sections,)
+endif
+
 ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
 LDFLAGS_vmlinux+= $(call ld-option, -X,)
 endif
diff --git a/arch/Kconfig b/arch/Kconfig
index 1330bf4..a49092b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -430,6 +430,19 @@ config THIN_ARCHIVES
  Select this if the architecture wants to use thin archives
  instead of ld -r to create the built-in.o files.
 
+config LINKER_DCE
+   bool
+   help
+ Select this if the architecture wants to do dead code and
+ data elimination with the linker by compiling with
+ -ffunction-sections -fdata-sections and linking with
+ --gc-sections.
+
+ This requires that the arch annotates or otherwise protects
+ its external entry points from being discarded. Linker scripts
+ must also merge .text.*, .data.*, and .bss.* correctly into
+ output sections.
+
 config HAVE_CONTEXT_TRACKING
bool
help
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 6a67ab9..a66ffe9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -196,9 +196,14 @@
*(.dtb.init.rodata) \
VMLINUX_SYMBOL(__dtb_end) = .;
 
-/* .data section */
+/*
+ * .data section
+ * -fdata-sections generates .data.identifier which needs to be pulled in
+ * with .data, but don't want to pull in .data..stuff which has its own
+ * requirements. Same for bss.
+ */
 #define DATA_DATA  \
-   *(.data)\
+   *(.data .data.[0-9a-zA-Z_]*)\
*(.ref.data)\
*(.data..shared_aligned) /* percpu related */   \
MEM_KEEP(init.data) \
@@ -312,76 +317,76 @@
/* Kernel symbol table: Normal symbols */   \
__ksymtab : AT(ADDR(__ksymtab) - LOAD_OFFSET) { \
VMLINUX_SYMBOL(__start___ksymtab) = .;  \
-   *(SORT(___ksymtab+*))   \
+   KEEP(*(SORT(___ksymtab+*))) \
VMLINUX_SYMBOL(__stop___ksymtab) = .;   \
}   \
\
/* Kernel symbol table: GPL-only symbols */ \
__ksymtab_gpl : AT(ADDR(__ksymtab_gpl) - LOAD_OFFSET) { \
VMLINUX_SYMBOL(__start___ksymtab_gpl) = .;  \
-   *(SORT(___ksymtab_gpl+*))   \
+   KEEP(*(SORT(___ksymtab_gpl+*))) \
VMLINUX_SYMBOL(__stop___ksym

[PATCH 1/5] kbuild: allow architectures to use thin archives instead of ld -r

2016-08-05 Thread Nicholas Piggin
From: Stephen Rothwell 

ld -r is an incremental link used to create built-in.o files in build
subdirectories. It produces relocatable object files containing all
its input files, and these are then pulled together and relocated
in the final link. Aside from the bloat, this constrains the final
link relocations, which has bitten large powerpc builds with
unresolvable relocations in the final link.

Alan Modra has recommended the kernel use thin archives for linking.
This is an alternative and means that the linker has more information
available to it when it links the kernel.

This patch enables a config option architectures can select, which
causes all built-in.o files to be built as thin archives. built-in.o
files in subdirectories do not get symbol table or index attached,
which improves speed and size. The final link pass creates a
built-in.o archive in the root output directory which includes the
symbol table and index. The linker then uses this file for the final link.

The --whole-archive linker option is required, because the linker now
has visibility to every individual object file, and it will otherwise
just completely avoid including those without external references
(consider a file with EXPORT_SYMBOL or initcall or hardware exceptions
as its only entry points). The traditional build works "by luck" as
built-in.o files are large enough that they're going to get external
references. However this optimisation is unpredictable for the kernel
(due to the above external references), ineffective at culling unused
code, and costly because the .o files have to be searched for references.
Superior alternatives for link-time culling should be used instead.

Build characteristics for inclink vs thinarc, on a small powerpc64le
pseries VM with a modest .config:

                                  inclink       thinarc
sizes
vmlinux                        15 618 680    15 625 028
sum of all built-in.o          56 091 808     1 054 334
sum excluding root built-in.o                   151 430

find -name built-in.o | xargs rm ; time make vmlinux
real                              22.772s       21.143s
user                              13.280s       13.430s
sys                                4.310s        2.750s

- Final kernel pulled in only about 6K more, which shows how
  ineffective the object file culling is.
- Build performance looks improved due to less pagecache activity.
  On IO constrained systems it could be a bigger win.
- Build size saving is significant.

Side note: the toolchain understands archives, so there are some tricks:
$ ar t built-in.o  # list all files you linked with
$ size built-in.o  # and their sizes
$ objdump -d built-in.o# disassembly (unrelocated) with filenames

Implementation by sfr, minor tweaks by npiggin.

Cc: linux-kbu...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Arnd Bergmann 
Cc: Segher Boessenkool 
Cc: Alan Modra 
Signed-off-by: Stephen Rothwell 
Signed-off-by: Nicholas Piggin 
---
 arch/Kconfig|  6 +
 scripts/Makefile.build  | 23 +---
 scripts/link-vmlinux.sh | 71 +
 3 files changed, 85 insertions(+), 15 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d794384..1330bf4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -424,6 +424,12 @@ config CC_STACKPROTECTOR_STRONG
 
 endchoice
 
+config THIN_ARCHIVES
+   bool
+   help
+ Select this if the architecture wants to use thin archives
+ instead of ld -r to create the built-in.o files.
+
 config HAVE_CONTEXT_TRACKING
bool
help
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 0d1ca5b..7fab825 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -358,12 +358,22 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 # Rule to compile a set of .o files into one .o file
 #
 ifdef builtin-target
-quiet_cmd_link_o_target = LD  $@
+
+ifdef CONFIG_THIN_ARCHIVES
+  cmd_make_builtin = rm -f $@; $(AR) rcST$(KBUILD_ARFLAGS)
+  cmd_make_empty_builtin = rm -f $@; $(AR) rcST$(KBUILD_ARFLAGS)
+  quiet_cmd_link_o_target = AR  $@
+else
+  cmd_make_builtin = $(LD) $(ld_flags) -r -o
+  cmd_make_empty_builtin = rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS)
+  quiet_cmd_link_o_target = LD  $@
+endif
+
 # If the list of objects to link is empty, just create an empty built-in.o
 cmd_link_o_target = $(if $(strip $(obj-y)),\
- $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
+ $(cmd_make_builtin) $@ $(filter $(obj-y), $^) \
  $(cmd_secanalysis),\
- rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
+ $(cmd_make_empty_builtin) $@)
 
 $(builtin-target): $(obj-y) FORCE
$(call if_changed,link_o_target)
@@ -389,7 +399,12 @@ $(modorder-target): $(subdir-ym) FORCE
 #
 ifdef lib-target
 quiet_cmd_link_l_target = AR  $@
-cmd_link_l_target = rm -f $@; $(AR) rcs$(KB

[RFC][PATCH 0/5] kbuild changes, thin archives, --gc-sections

2016-08-05 Thread Nicholas Piggin
Hello,

I have 3 different things in this patchset. All arch specific, but all
involve kbuild changes, so I'd like to discuss them with kbuild
maintainers. The goal has been to improve long standing linking
difficulties with the powerpc kernel.

* First, building kernel using thin archives rather than incremental
  linking. This seems quite clean and is per-arch, so I hope it should
  not be too controversial.

* Second, building kernel using -ffunction-sections -fdata-sections,
  --gc-sections. Yes, I'm spinning the wheel again. It was motivated
  by tiny codesize regression in the first patch, but the results seem
  too good to ignore.

* Third, allowing an architecture to run a tool over a module after it
  has been linked. Powerpc wants to use it in order to relocate "alternate
  code" instructions that don't get linked at their runtime
  address. No idea if this is the right approach wrt kbuild, but it
  seems to work.

I have included the powerpc code for the first two as a reference. The
third is much bigger and mostly uninteresting for this cc list, but it
can be found here:

 https://patchwork.ozlabs.org/patch/651006/

Comments appreciated.

Thanks,



[PATCH v3 2/2] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.

2016-08-05 Thread Mahesh J Salgaonkar
From: Mahesh Salgaonkar 

The current implementation of MCE early handling modifies CR0/1 registers
without saving their old values. Fix this by moving the early check for
powersaving mode to machine_check_handle_early().

The power architecture 2.06 or later allows the possibility of getting
machine check while in nap/sleep/winkle. The last bit of HSPRG0 is set
to 1, if thread is woken up from winkle. Hence, clear the last bit of
HSPRG0 (r13) before MCE handler starts using it as paca pointer.

Also, the current code always puts the thread into nap state irrespective
of the idle state it woke up from. Fix that by looking at
paca->thread_idle_state and putting the thread back into the same state
it came from.
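
The low-bit convention described above can be sketched in C. This is only
an illustration under assumptions: the names are hypothetical, a 64-bit
integer stands in for the HSPRG0/r13 value, and the comments map each
helper to the instruction the patches use for the same step.

```c
#include <stdint.h>

/* Hypothetical name for the "woken from winkle" marker in bit 0. */
#define WINKLE_FLAG UINT64_C(1)

/* Extract the marker bit (the role of: clrldi r5,r13,63). */
static unsigned int get_flag(uint64_t hsprg0)
{
	return (unsigned int)(hsprg0 & WINKLE_FLAG);
}

/* Recover a usable paca pointer value (the role of: clrrdi r13,r13,1). */
static uint64_t clear_flag(uint64_t hsprg0)
{
	return hsprg0 & ~WINKLE_FLAG;
}

/* Re-arm the marker before going back to winkle (ori r13,r13,1). */
static uint64_t set_flag(uint64_t paca)
{
	return paca | WINKLE_FLAG;
}
```

The scheme relies on the paca address being at least 2-byte aligned, so
bit 0 is free to carry the flag; clearing it unconditionally is harmless
when the flag was never set.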

Reported-by: Paul Mackerras 
Signed-off-by: Mahesh Salgaonkar 
Reviewed-by: Shreyas B. Prabhu 
---
Change in v3:
- Rebase to Linus' master.

Change in v2:
- Call IDLE_STATE_ENTER_SEQ(PPC_NAP) instead of power7_enter_nap_mode()
  to be consistent with other part of code.
---
 arch/powerpc/kernel/exceptions-64s.S |   69 --
 1 file changed, 40 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 694def6..a59c9cc 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -144,29 +144,14 @@ machine_check_pSeries_1:
 * vector
 */
SET_SCRATCH0(r13)   /* save r13 */
-#ifdef CONFIG_PPC_P7_NAP
-BEGIN_FTR_SECTION
-   /* Running native on arch 2.06 or later, check if we are
-* waking up from nap. We only handle no state loss and
-* supervisor state loss. We do -not- handle hypervisor
-* state loss at this time.
+   /*
+* Running native on arch 2.06 or later, we may wakeup from winkle
+* inside machine check. If yes, then last bit of HSPRG0 would be set
+* to 1. Hence clear it unconditionally.
 */
-   mfspr   r13,SPRN_SRR1
-   rlwinm. r13,r13,47-31,30,31
-   OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
-   beq 9f
-
-   mfspr   r13,SPRN_SRR1
-   rlwinm. r13,r13,47-31,30,31
-   /* waking up from powersave (nap) state */
-   cmpwi   cr1,r13,2
-   /* Total loss of HV state is fatal. let's just stay stuck here */
-   OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
-   bgt cr1,.
-9:
-   OPT_SET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR)
-END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
-#endif /* CONFIG_PPC_P7_NAP */
+   GET_PACA(r13)
+   clrrdi  r13,r13,1
+   SET_PACA(r13)
EXCEPTION_PROLOG_0(PACA_EXMC)
 BEGIN_FTR_SECTION
b   machine_check_powernv_early
@@ -1273,25 +1258,51 @@ machine_check_handle_early:
 * Check if thread was in power saving mode. We come here when any
 * of the following is true:
 * a. thread wasn't in power saving mode
-* b. thread was in power saving mode with no state loss or
-*supervisor state loss
+* b. thread was in power saving mode with no state loss,
+*supervisor state loss or hypervisor state loss.
 *
-* Go back to nap again if (b) is true.
+* Go back to nap/sleep/winkle mode again if (b) is true.
 */
rlwinm. r11,r12,47-31,30,31 /* Was it in power saving mode? */
beq 4f  /* No, it wasn;t */
/* Thread was in power saving mode. Go back to nap again. */
cmpwi   r11,2
-   bne 3f
-   /* Supervisor state loss */
+   blt 3f
+   /* Supervisor/Hypervisor state loss */
li  r0,1
stb r0,PACA_NAPSTATELOST(r13)
 3: bl  machine_check_queue_event
MACHINE_CHECK_HANDLER_WINDUP
GET_PACA(r13)
ld  r1,PACAR1(r13)
-   li  r3,PNV_THREAD_NAP
-   b   pnv_enter_arch207_idle_mode
+   /*
+* Check what idle state this CPU was in and go back to same mode
+* again.
+*/
+   lbz r3,PACA_THREAD_IDLE_STATE(r13)
+   cmpwi   r3,PNV_THREAD_NAP
+   bgt 10f
+   IDLE_STATE_ENTER_SEQ(PPC_NAP)
+   /* No return */
+10:
+   cmpwi   r3,PNV_THREAD_SLEEP
+   bgt 2f
+   IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+   /* No return */
+
+2:
+   /*
+* Go back to winkle. Please note that this thread was woken up in
+* machine check from winkle and have not restored the per-subcore
+* state. Hence before going back to winkle, set last bit of HSPGR0
+* to 1. This will make sure that if this thread gets woken up
+* again at reset vector 0x100 then it will get chance to restore
+* the subcore state.
+*/
+   ori r13,r13,1
+   SET_PACA(r13)
+   IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
+   /* No return */
 4:
 #endif
/*



[PATCH v3 1/2] powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h

2016-08-05 Thread Mahesh J Salgaonkar
From: Mahesh Salgaonkar 

Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h so that MCE handler changes
in subsequent patch can use it.

No functionality change.

Signed-off-by: Mahesh Salgaonkar 
---
Change in v3:
- Rebase to Linus' master.
---
 arch/powerpc/include/asm/cpuidle.h |   13 +
 arch/powerpc/kernel/idle_book3s.S  |   12 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 3d7fc06..01b8a13 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -19,4 +19,17 @@ extern u64 pnv_first_deep_stop_state;
 
 #endif
 
+/* Idle state entry routines */
+#ifdef CONFIG_PPC_P7_NAP
+#define IDLE_STATE_ENTER_SEQ(IDLE_INST) \
+   /* Magic NAP/SLEEP/WINKLE mode enter sequence */\
+   std r0,0(r1);   \
+   ptesync;\
+   ld  r0,0(r1);   \
+1: cmp cr0,r0,r0;  \
+   bne 1b; \
+   IDLE_INST;  \
+   b   .
+#endif /* CONFIG_PPC_P7_NAP */
+
 #endif
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 8a56a51..7a41f13 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -44,18 +44,6 @@
PSSCR_PSLL_MASK | PSSCR_TR_MASK | \
PSSCR_MTL_MASK
 
-/* Idle state entry routines */
-
-#define IDLE_STATE_ENTER_SEQ(IDLE_INST) \
-   /* Magic NAP/SLEEP/WINKLE mode enter sequence */\
-   std r0,0(r1);   \
-   ptesync;\
-   ld  r0,0(r1);   \
-1: cmp cr0,r0,r0;  \
-   bne 1b; \
-   IDLE_INST;  \
-   b   .
-
.text
 
 /*



[PATCH v2] cxl: Use fixed width predefined types in data structure.

2016-08-05 Thread Philippe Bergheaud
This patch fixes a regression introduced by commit b810253.

It replaces the type u8 with __u8 in the uapi header cxl.h,
because u8 is not always defined in userland build
environments, in particular when cross-compiling libcxl on
x86_64 linux machines (RHEL6.7 and Ubuntu 16.04).

This patch also changes the type of the field data_size from size_t
to __u32, which makes its size constant, to support 32-bit userland
applications running on big-endian ppc64 kernels transparently.
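
To illustrate why a fixed-width field matters here, a small userland
sketch follows. The struct names are hypothetical stand-ins, not the
actual uapi definitions, and uint32_t/uint8_t play the role of
__u32/__u8 so the sketch builds outside the kernel tree:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the new layout. */
struct event_fixed {
	uint32_t data_size;	/* 4 bytes for 32-bit and 64-bit callers alike */
	uint8_t data[];		/* payload starts right after data_size */
};

/* Hypothetical stand-in for the old layout: size_t is typically 4 bytes
 * in a 32-bit build and 8 bytes in a 64-bit build. */
struct event_sizet {
	size_t data_size;
	uint8_t data[];
};

/* Offset of the payload in each layout. */
size_t fixed_payload_offset(void)
{
	return offsetof(struct event_fixed, data);
}

size_t sizet_payload_offset(void)
{
	return offsetof(struct event_sizet, data);
}
```

With the fixed-width field the payload offset is 4 in every build,
whereas with size_t it follows the word size of whoever compiled the
header, so a 32-bit application and a 64-bit kernel would disagree.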

This breaks the (young) API that has been merged in v4.8.

Signed-off-by: Philippe Bergheaud 
---
Changes since v1:
  Added an explanation for the proposed API change in the log.

Note:
As far as I know, cxlflash is the only user of the API.

 include/uapi/misc/cxl.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
index cbae529..180d526 100644
--- a/include/uapi/misc/cxl.h
+++ b/include/uapi/misc/cxl.h
@@ -136,8 +136,8 @@ struct cxl_event_afu_driver_reserved {
 *
 * Of course the contents will be ABI, but that's up the AFU driver.
 */
-   size_t data_size;
-   u8 data[];
+   __u32 data_size;
+   __u8 data[];
 };
 
 struct cxl_event {
-- 
2.8.0



Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-05 Thread Anton Blanchard
Hi Nick,

> Hmm. If we execute this loop once, we'll only fetch additional nops.
> Twice, and we make up for them by not fetching unused instructions.
> More than twice and we may start winning.
> 
> For large sizes it probably helps, but I'd like to see what sizes
> memset sees.

I found this in a trace of nginx web serving. Looking back at it,
get_empty_filp() zeros a struct file, and we go through the loop 4
times. We might want to look more generally at what lengths memset() is
called with though.

Anton


[v5.2] ucc_slow: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.

2016-08-05 Thread Arvind Yadav
IS_ERR_VALUE() assumes that its parameter is an unsigned long.
It cannot be used to check an 'unsigned int' value, because the
result then no longer reflects an error.
On 64-bit architectures sizeof(int) == 4 && sizeof(long) == 8.
IS_ERR_VALUE(x) is ((x) >= (unsigned long)-4095).
IS_ERR_VALUE() of an 'unsigned int' is always false because the 32-bit
value is zero-extended to 64 bits.
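
The effect can be reproduced in userland with a small model of the
check. This is only a sketch: uint64_t stands in for a 64-bit kernel's
unsigned long, and the helper names are made up for illustration:

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Model of IS_ERR_VALUE() on a 64-bit kernel; uint64_t plays the role
 * of unsigned long. */
int is_err_value64(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO;
}

/* The error code is first stored in a 32-bit field, then checked. */
int check_via_u32(long err)
{
	uint32_t stored = (uint32_t)err;	/* truncated to 32 bits */

	return is_err_value64(stored);		/* zero-extended: never "error" */
}

/* The same check with the value kept at full width. */
int check_via_ulong(long err)
{
	return is_err_value64((uint64_t)err);
}
```

For an error such as -12, check_via_u32() sees 0x00000000fffffff4, which
is far below the error range, while check_via_ulong() sees the value
sign-extended and reports the error.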

The problem in the UCC slow driver, drivers/soc/fsl/qe/ucc_slow.c:

/* Get PRAM base */
uccs->us_pram_offset =
   qe_muram_alloc(UCC_SLOW_PRAM_SIZE, ALIGNMENT_OF_UCC_SLOW_PRAM);
if (IS_ERR_VALUE(uccs->us_pram_offset)) {
   printk(KERN_ERR "%s: cannot allocate MURAM for PRAM", __func__);
   ucc_slow_free(uccs);
   return -ENOMEM;
}
id = ucc_slow_get_qe_cr_subblock(us_info->ucc_num);
qe_issue_cmd(QE_ASSIGN_PAGE_TO_DEVICE, id, us_info->protocol,
 uccs->us_pram_offset);

uccs->us_pram = qe_muram_addr(uccs->us_pram_offset);

/* Allocate BDs. */
uccs->rx_base_offset =
qe_muram_alloc(us_info->rx_bd_ring_len * sizeof(struct qe_bd),
QE_ALIGNMENT_OF_BD);
if (IS_ERR_VALUE(uccs->rx_base_offset)) {
printk(KERN_ERR "%s: cannot allocate %u RX BDs\n", __func__,
us_info->rx_bd_ring_len);
uccs->rx_base_offset = 0;
ucc_slow_free(uccs);
return -ENOMEM;
}

uccs->tx_base_offset =
 qe_muram_alloc(us_info->tx_bd_ring_len * sizeof(struct qe_bd),
QE_ALIGNMENT_OF_BD);
if (IS_ERR_VALUE(uccs->tx_base_offset)) {
 printk(KERN_ERR "%s: cannot allocate TX BDs", __func__);
 uccs->tx_base_offset = 0;
 ucc_slow_free(uccs);
 return -ENOMEM;
}

qe_muram_alloc() (a.k.a. cpm_muram_alloc()) returns an unsigned long.
The return value is stored in a u32 (us_pram_offset, rx_base_offset
and tx_base_offset). If qe_muram_alloc() returns an error,
IS_ERR_VALUE() will always evaluate to 0, so ucc_slow_free() is never
called on failure. The code inside the 'if' is dead code on 64-bit.
qe_muram_addr() will then also return a wrong virtual address, which
can cause an error.
This patch avoids this problem on 64-bit machines.

Signed-off-by: Arvind Yadav 
---
 include/soc/fsl/qe/ucc_slow.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/soc/fsl/qe/ucc_slow.h b/include/soc/fsl/qe/ucc_slow.h
index 6c0573a..fca30a1 100644
--- a/include/soc/fsl/qe/ucc_slow.h
+++ b/include/soc/fsl/qe/ucc_slow.h
@@ -189,7 +189,7 @@ struct ucc_slow_private {
struct ucc_slow_info *us_info;
struct ucc_slow __iomem *us_regs; /* Ptr to memory map of UCC regs */
struct ucc_slow_pram *us_pram;  /* a pointer to the parameter RAM */
-   u32 us_pram_offset;
+   unsigned long us_pram_offset;
int enabled_tx; /* Whether channel is enabled for Tx (ENT) */
int enabled_rx; /* Whether channel is enabled for Rx (ENR) */
int stopped_tx; /* Whether channel has been stopped for Tx
@@ -198,8 +198,12 @@ struct ucc_slow_private {
struct list_head confQ; /* frames passed to chip waiting for tx */
u32 first_tx_bd_mask;   /* mask is used in Tx routine to save status
   and length for first BD in a frame */
-   u32 tx_base_offset; /* first BD in Tx BD table offset (In MURAM) */
-   u32 rx_base_offset; /* first BD in Rx BD table offset (In MURAM) */
+   unsigned long tx_base_offset;   /* first BD in Tx BD table offset
+* (In MURAM)
+*/
+   unsigned long rx_base_offset;   /* first BD in Rx BD table offset
+* (In MURAM)
+*/
struct qe_bd *confBd;   /* next BD for confirm after Tx */
struct qe_bd *tx_bd;/* next BD for new Tx request */
struct qe_bd *rx_bd;/* next BD to collect after Rx */
-- 
1.9.1



[pasemi] Internal CompactFlash (CF) card device not recognised after the powerpc-4.8-1 merge

2016-08-05 Thread Christian Zigotzky

Hi All,

The internal PASEMI CompactFlash (CF) card device doesn't work anymore 
after the powerpc-4.8-1 merge. That means the code for the internal CF 
card device in the Nemo patch doesn't work after the first PowerPC 
merge. The CompactFlash (CF) card slot is wired to the CPU local bus. It 
is typically used to hold the Linux kernel. I know it isn't ideal to carry
our own patch for that, but I think it is a good time to integrate the
PASEMI internal CompactFlash (CF) card device into the official kernel.
What do you think? I am not a programmer so I can't integrate the source 
code for the internal CF card device. But maybe you can take the patch 
and integrate it.


We use the following patch for the kernel 4.7:

diff -rupN a/drivers/ata/pata_of_platform.c b/drivers/ata/pata_of_platform.c
--- a/drivers/ata/pata_of_platform.c   2016-08-05 09:58:41.410569036 +0200
+++ b/drivers/ata/pata_of_platform.c   2016-08-05 09:59:54.41424 +0200

@@ -41,14 +41,36 @@ static int pata_of_platform_probe(struct
   return -EINVAL;
}

-   ret = of_address_to_resource(dn, 1, &ctl_res);
-   if (ret) {
-  dev_err(&ofdev->dev, "can't get CTL address from "
- "device tree\n");
-  return -EINVAL;
+   if (of_device_is_compatible(dn, "electra-ide")) {
+  /* Altstatus is really at offset 0x3f6 from the primary window
+   * on electra-ide. Adjust ctl_res and io_res accordingly.
+   */
+  ctl_res = io_res;
+  ctl_res.start = ctl_res.start+0x3f6;
+  io_res.end = ctl_res.start-1;
+
+#ifdef CONFIG_PPC_PASEMI_SB600
+   } else if (of_device_is_compatible(dn, "electra-cf")) {
+   /* Task regs are at 0x800, with alt status @ 0x80e in the primary window
+* on electra-cf. Adjust ctl_res and io_res accordingly.
+*/
+   ctl_res = io_res;
+   io_res.start += 0x800;
+   ctl_res.start = ctl_res.start + 0x80e;
+   io_res.end = ctl_res.start-1;
+#endif
+   } else {
+  ret = of_address_to_resource(dn, 1, &ctl_res);
+  if (ret) {
+ dev_err(&ofdev->dev, "can't get CTL address from "
+"device tree\n");
+ return -EINVAL;
+  }
}

irq_res = platform_get_resource(ofdev, IORESOURCE_IRQ, 0);
+   if (irq_res)
+  irq_res->flags = 0;

prop = of_get_property(dn, "reg-shift", NULL);
if (prop)
@@ -65,6 +87,11 @@ static int pata_of_platform_probe(struct
   dev_info(&ofdev->dev, "pio-mode unspecified, assuming PIO0\n");
}

+#ifdef CONFIG_PPC_PASEMI_SB600
+   irq_res = 0;// force irq off (doesn't seem to work)
+#endif
+
+
pio_mask = 1 << pio_mode;
pio_mask |= (1 << pio_mode) - 1;

@@ -74,7 +101,11 @@ static int pata_of_platform_probe(struct

 static struct of_device_id pata_of_platform_match[] = {
{ .compatible = "ata-generic", },
-   { },
+   { .compatible = "electra-ide", },
+#ifdef CONFIG_PPC_PASEMI_SB600
+   { .compatible = "electra-cf",},
+#endif
+   {},
 };
 MODULE_DEVICE_TABLE(of, pata_of_platform_match);

dmesg with the kernel 4.7:

zcat /var/log/dmesg.1.gz | grep -i ata7

[2.939788] ata7: PATA max PIO0 no IRQ, using PIO polling mmio cmd 0xf800 ctl 0xf80e
[3.099186] ata7.00: CFA: SanDisk SDCFB-256, HDX 2.33, max PIO4
[3.099191] ata7.00: 501760 sectors, multi 0: LBA
[3.099199] ata7.00: configured for PIO

The dmesg of the latest Git kernel doesn't have any output of our 
internal CF card device.


Could you please integrate our PASEMI CF card device again?

Thanks,

Christian



Re: Problems with Kernels 3.17-rc1 and onwards on Acube Sam460 AMCC 460ex board

2016-08-05 Thread Julian Margetson

On 2/18/2015 10:56 PM, Michael Ellerman wrote:

On Wed, 2015-02-18 at 21:36 -0400, Julian Margetson wrote:

On 2/18/2015 8:13 PM, Michael Ellerman wrote:


On Wed, 2015-02-18 at 15:45 -0400, Julian Margetson wrote:

On 2/15/2015 8:18 PM, Michael Ellerman wrote:


On Sun, 2015-02-15 at 08:16 -0400, Julian Margetson wrote:

Hi

I am unable to get any kernel beyond the 3.16 branch working on an
Acube Sam460ex AMCC 460ex based motherboard. Kernels up to 3.16.7-ckt6 work.

Does reverting b0345bbc6d09 change anything?


[6.364350] snd_hda_intel 0001:81:00.1: enabling device ( -> 0002)
[6.453794] snd_hda_intel 0001:81:00.1: ppc4xx_setup_msi_irqs: fail mapping irq
[6.487530] Unable to handle kernel paging request for data at address 0x0fa06c7c
[6.495055] Faulting instruction address: 0xc032202c
[6.500033] Vector: 300 (Data Access) at [efa31cf0]
[6.504922] pc: c032202c: __reg_op+0xe8/0x100
[6.509697] lr: c0014f88: msi_bitmap_free_hwirqs+0x50/0x94
[6.515600] sp: efa31da0
[6.518491]msr: 21000
[6.521112]dar: fa06c7c
[6.523915]  dsisr: 0
[6.526190]   current = 0xef8bab00
[6.529603] pid   = 115, comm = kworker/0:1
[6.534163] enter ? for help
[6.537054] [link register   ] c0014f88 msi_bitmap_free_hwirqs+0x50/0x94
[6.543811] [efa31da0] c0014f78 msi_bitmap_free_hwirqs+0x40/0x94 (unreliable)
[6.551001] [efa31dc0] c001aee8 ppc4xx_setup_msi_irqs+0xac/0xf4
[6.556973] [efa31e00] c03503a4 pci_enable_msi_range+0x1e0/0x280
[6.563032] [efa31e40] f92c2f74 azx_probe_work+0xe0/0x57c [snd_hda_intel]
[6.569906] [efa31e80] c0036344 process_one_work+0x1e8/0x2f0
[6.575627] [efa31eb0] c003677c worker_thread+0x2f4/0x438
[6.581079] [efa31ef0] c003a3e4 kthread+0xc8/0xcc
[6.585844] [efa31f40] c000aec4 ret_from_kernel_thread+0x5c/0x64
[6.591910] mon>  

Managed to do a third git bisect with the following results.

Great work.


git bisect bad
9279d3286e10736766edcaf815ae10e00856e448 is the first bad commit
commit 9279d3286e10736766edcaf815ae10e00856e448
Author: Rasmus Villemoes 
Date:   Wed Aug 6 16:10:16 2014 -0700

 lib: bitmap: change parameter of bitmap_*_region to unsigned

So the bug is in the 4xx MSI code, and has always been there, in fact I don't
see how that code has *ever* worked. The commit you bisected to just caused the
existing bug to cause an oops.
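
A reduced C model of the loop's control flow, before and after the fix
proposed in this mail, may make the bug easier to see. Everything in it
(alloc_hwirq(), struct setup_result, the counters) is hypothetical and
only mimics the shape of ppc4xx_setup_msi_irqs(); the later virq check
and the real mapping step are not modelled:

```c
#include <errno.h>

/* Hypothetical stand-in for msi_bitmap_alloc_hwirqs(): hands out hardware
 * IRQ numbers until 'slots_left' reaches zero, then returns -ENOMEM. */
static int slots_left;
static int next_hwirq;

int alloc_hwirq(void)
{
	if (slots_left <= 0)
		return -ENOMEM;
	slots_left--;
	return next_hwirq++;
}

struct setup_result {
	int ret;		/* value returned to the caller */
	int mapped_ok;		/* valid hwirqs passed on to the mapping step */
	int mapped_bad;		/* negative numbers passed on to the mapping step */
};

/* Before the fix: success leaves the loop, failure falls through. */
struct setup_result setup_before(int nvec, int slots)
{
	struct setup_result r = { 0, 0, 0 };

	slots_left = slots;
	next_hwirq = 0;
	for (int i = 0; i < nvec; i++) {
		int int_no = alloc_hwirq();

		if (int_no >= 0)
			break;		/* a good hwirq is never mapped */
		r.mapped_bad++;		/* the error value reaches the mapping step */
	}
	return r;
}

/* After the fix: failure returns, success goes on to be mapped. */
struct setup_result setup_after(int nvec, int slots)
{
	struct setup_result r = { 0, 0, 0 };

	slots_left = slots;
	next_hwirq = 0;
	for (int i = 0; i < nvec; i++) {
		int int_no = alloc_hwirq();

		if (int_no < 0) {
			r.ret = -ENOSPC;
			return r;
		}
		r.mapped_ok++;
	}
	return r;
}
```

In the "before" model a successful allocation never gets mapped and a
failed one is carried forward as a negative number, which is consistent
with the oops in msi_bitmap_free_hwirqs() shown earlier in the thread.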

Can you try this?

diff --git a/arch/powerpc/sysdev/ppc4xx_msi.c b/arch/powerpc/sysdev/ppc4xx_msi.c
index 6e2e6aa378bb..effb5b878a78 100644
--- a/arch/powerpc/sysdev/ppc4xx_msi.c
+++ b/arch/powerpc/sysdev/ppc4xx_msi.c
@@ -95,11 +95,9 @@ static int ppc4xx_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
  
  	list_for_each_entry(entry, &dev->msi_list, list) {

int_no = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
-   if (int_no >= 0)
-   break;
if (int_no < 0) {
-   pr_debug("%s: fail allocating msi interrupt\n",
-   __func__);
+   pr_warn("%s: fail allocating msi interrupt\n", __func__);
+   return -ENOSPC;
}
virq = irq_of_parse_and_map(msi_data->msi_dev, int_no);
if (virq == NO_IRQ) {


Thanks.
This works with 3.17-rc1. Will try with the 3.18 branch.

OK great.


Any ideas why drm is not working? (It never worked.)

No sorry. You might have more luck if you post a new thread to the dri list.


[5.809802] Linux agpgart interface v0.103
[6.137893] [drm] Initialized drm 1.1.0 20060810
[6.439872] snd_hda_intel 0001:81:00.1: enabling device ( -> 0002)
[6.508544] ppc4xx_setup_msi_irqs: fail allocating msi interrupt

I'm curious why it's failing to allocate MSIs. Possibly it's just run out.

Can you post the output of 'cat /proc/interrupts'?

cheers





Hi Michael

Any chance of this fix being added to mainline?


Regards

Julian




[PATCH] powerpc/40x: Clear MSR_DR in one insn instead of two

2016-08-05 Thread Christophe Leroy
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/misc_32.S | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index d9c912b..e025230 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -243,8 +243,7 @@ _GLOBAL(_nmask_and_or_msr)
  */
 _GLOBAL(real_readb)
mfmsr   r7
-   ori r0,r7,MSR_DR
-   xori    r0,r0,MSR_DR
+   rlwinm  r0,r7,0,~MSR_DR
sync
mtmsr   r0
sync
@@ -261,8 +260,7 @@ _GLOBAL(real_readb)
  */
 _GLOBAL(real_writeb)
mfmsr   r7
-   ori r0,r7,MSR_DR
-   xori    r0,r0,MSR_DR
+   rlwinm  r0,r7,0,~MSR_DR
sync
mtmsr   r0
sync
-- 
2.1.0
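
For readers who want to check the equivalence behind this change, here
is a small C sketch. The mask value is an assumption chosen for
illustration, and the function names are made up; the identity holds for
any mask:

```c
#include <stdint.h>

/* Assumed mask value, for illustration only. */
#define MSR_DR_SKETCH 0x10u

/* Old sequence: ori sets the bit, xori then flips it back to zero. */
uint32_t clear_two_insns(uint32_t msr)
{
	return (msr | MSR_DR_SKETCH) ^ MSR_DR_SKETCH;
}

/* New sequence: rlwinm with a zero rotate is a plain AND with ~MSR_DR. */
uint32_t clear_one_insn(uint32_t msr)
{
	return msr & ~MSR_DR_SKETCH;
}
```

Both functions leave every other bit untouched and force the masked bit
to zero, whatever its previous value was.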



[PATCH] powerpc/32: Remove CLR_TOP32

2016-08-05 Thread Christophe Leroy
CLR_TOP32() is defined as empty. The last useful instance of CLR_TOP32()
was removed by commit 40ef8cbc6d360 ("powerpc: Get 64-bit configs to
compile with ARCH=powerpc").

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/ppc_asm.h | 1 -
 arch/powerpc/kernel/entry_32.S | 1 -
 arch/powerpc/kernel/head_32.S  | 3 ---
 arch/powerpc/kernel/head_8xx.S | 1 -
 4 files changed, 6 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index d5d5b5e..bcd891f 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -527,7 +527,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_601)
 #endif
 #define MTMSRD(r)  mtmsr   r
 #define MTMSR_EERI(reg)    mtmsr   reg
-#define CLR_TOP32(r)
 #endif
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 9899032..83428a2 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -654,7 +654,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_SPE)
 #endif /* CONFIG_SMP */
 
tophys(r0,r4)
-   CLR_TOP32(r0)
mtspr   SPRN_SPRG_THREAD,r0 /* Update current THREAD phys addr */
lwz r1,KSP(r4)  /* Load new stack pointer */
 
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index dc0488b..a3f821e 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -266,7 +266,6 @@ __secondary_hold_acknowledge:
 
 
 #define EXCEPTION_PROLOG_2 \
-   CLR_TOP32(r11); \
stw r10,_CCR(r11);  /* save registers */ \
stw r12,GPR12(r11); \
stw r9,GPR9(r11);   \
@@ -862,7 +861,6 @@ __secondary_start:
/* ptr to phys current thread */
tophys(r4,r2)
addi    r4,r4,THREAD    /* phys address of our thread_struct */
-   CLR_TOP32(r4)
mtspr   SPRN_SPRG_THREAD,r4
li  r3,0
mtspr   SPRN_SPRG_RTAS,r3   /* 0 => not in RTAS */
@@ -949,7 +947,6 @@ start_here:
/* ptr to phys current thread */
tophys(r4,r2)
addi    r4,r4,THREAD    /* init task's THREAD */
-   CLR_TOP32(r4)
mtspr   SPRN_SPRG_THREAD,r4
li  r3,0
mtspr   SPRN_SPRG_RTAS,r3   /* 0 => not in RTAS */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 43ddaae..3a185c5 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -151,7 +151,6 @@ turn_on_mmu:
 
 
 #define EXCEPTION_PROLOG_2 \
-   CLR_TOP32(r11); \
stw r10,_CCR(r11);  /* save registers */ \
stw r12,GPR12(r11); \
stw r9,GPR9(r11);   \
-- 
2.1.0



[PATCH] powerpc/32: Remove one insn in __bswapdi2

2016-08-05 Thread Christophe Leroy
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/misc_32.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index e025230..e18055c 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -578,9 +578,8 @@ _GLOBAL(__bswapdi2)
rlwimi  r9,r4,24,0,7
rlwimi  r10,r3,24,0,7
rlwimi  r9,r4,24,16,23
-   rlwimi  r10,r3,24,16,23
+   rlwimi  r4,r3,24,16,23
mr  r3,r9
-   mr  r4,r10
blr
 
 #ifdef CONFIG_SMP
-- 
2.1.0
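
As a reference for what the routine has to compute, here is a C sketch
of a 64-bit byte swap built from the same rotate-and-insert steps. The
helper names are made up, and the sketch says nothing about register
allocation; it is only meant as something to check a reshuffled
instruction sequence against:

```c
#include <stdint.h>

/* Rotate left; n must be in 1..31. */
uint32_t rotl32(uint32_t x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}

/* Model of an rlwimi-style insert: replace the masked bits of dst by src. */
uint32_t insert_under_mask(uint32_t dst, uint32_t src, uint32_t mask)
{
	return (dst & ~mask) | (src & mask);
}

/* One rotate plus two inserts byte-swap a 32-bit word. */
uint32_t bswap32_sketch(uint32_t x)
{
	uint32_t r = rotl32(x, 8);

	r = insert_under_mask(r, rotl32(x, 24), 0xFF000000u);
	r = insert_under_mask(r, rotl32(x, 24), 0x0000FF00u);
	return r;
}

/* Swap each half and exchange the halves. */
uint64_t bswapdi2_sketch(uint64_t v)
{
	uint32_t hi = (uint32_t)(v >> 32);
	uint32_t lo = (uint32_t)v;

	return ((uint64_t)bswap32_sketch(lo) << 32) | bswap32_sketch(hi);
}
```

Note that each output word depends on the other input word, so both
inputs have to stay available until both results have been formed.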



Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-05 Thread Nicholas Piggin
On Thu,  4 Aug 2016 16:53:22 +1000
Anton Blanchard  wrote:

> From: Anton Blanchard 
> 
> Align the hot loops in our assembly implementation of memset()
> and backwards_memcpy().
> 
> backwards_memcpy() is called from tcp_v4_rcv(), so we might
> want to optimise this a little more.
> 
> Signed-off-by: Anton Blanchard 
> ---
>  arch/powerpc/lib/mem_64.S | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> index 43435c6..eda7a96 100644
> --- a/arch/powerpc/lib/mem_64.S
> +++ b/arch/powerpc/lib/mem_64.S
> @@ -37,6 +37,7 @@ _GLOBAL(memset)
>   clrldi  r5,r5,58
>   mtctr   r0
>   beq 5f
> + .balign 16
>  4:   std r4,0(r6)
>   std r4,8(r6)
>   std r4,16(r6)

Hmm. If we execute this loop once, we'll only fetch additional nops. Twice, and
we make up for them by not fetching unused instructions. More than twice and we
may start winning.

For large sizes it probably helps, but I'd like to see what sizes memset sees.



Re: [pasemi] Radeon HD graphics card not recognised after the powerpc-4.8-1 commit

2016-08-05 Thread Michael Ellerman
Christian Zigotzky  writes:

> Hi Michael,
>
> Thanks a million for your patch! :-)

No worries :)

> @All
> Keep your fingers crossed!
>
> 1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a

Normally that would be:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux

And this:

> 2. patch -p0 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch

Would be:

$ cd linux
$ patch -p1 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch

> 3. patch -p0 < nemo_4.8-3.patch
>
> 4. yes "" | make oldconfig

And that can be done with 'make olddefconfig'.

cheers


[GIT PULL] Please pull powerpc/linux.git powerpc-4.8-2 tag

2016-08-05 Thread Michael Ellerman
Hi Linus,

Please pull some more powerpc updates for 4.8.

These were delayed for various reasons, so I let them sit in next a bit
longer, rather than including them in my first pull request.

There's one conflict in kernel/jump_label.c; the resolution is simply to
take both sides' changes.

The following changes since commit bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9:

  Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (2016-07-30 21:01:36 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-4.8-2

for you to fetch changes up to eea8148c69f3aecbf297b12943a591467a1fb432:

  powerpc/mm: Move register_process_table() out of ppc_md (2016-08-04 20:22:34 +1000)


powerpc updates for 4.8 #2

Fixes:
 - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
 - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
 - Move register_process_table() out of ppc_md from Michael Ellerman

Use jump_label for [cpu|mmu]_has_feature() from Aneesh Kumar K.V, Kevin Hao and Michael Ellerman:
 - Add mmu_early_init_devtree() from Michael Ellerman
 - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
 - Do hash device tree scanning earlier from Michael Ellerman
 - Do radix device tree scanning earlier from Michael Ellerman
 - Do feature patching before MMU init from Michael Ellerman
 - Check features don't change after patching from Michael Ellerman
 - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
 - Convert mmu_has_feature() to returning bool from Michael Ellerman
 - Convert cpu_has_feature() to returning bool from Michael Ellerman
 - Define radix_enabled() in one place & use static inline from Michael Ellerman
 - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
 - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
 - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
 - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
 - Remove mfvtb() from Kevin Hao
 - Move cpu_has_feature() to a separate file from Kevin Hao
 - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
 - Add option to use jump label for cpu_has_feature() from Kevin Hao
 - Add option to use jump label for mmu_has_feature() from Kevin Hao
 - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
 - Annotate jump label assembly from Michael Ellerman

TLB flush enhancements from Aneesh Kumar K.V:
 - radix: Implement tlb mmu gather flush efficiently
 - Add helper for finding SLBE LLP encoding
 - Use hugetlb flush functions
 - Drop multiple definition of mm_is_core_local
 - radix: Add tlb flush of THP ptes
 - radix: Rename function and drop unused arg
 - radix/hugetlb: Add helper for finding page size
 - hugetlb: Add flush_hugetlb_tlb_range
 - remove flush_tlb_page_nohash

Add new ptrace regsets from Anshuman Khandual and Simon Guo:
 - elf: Add powerpc specific core note sections
 - Add the function flush_tmregs_to_thread
 - Enable in transaction NT_PRFPREG ptrace requests
 - Enable in transaction NT_PPC_VMX ptrace requests
 - Enable in transaction NT_PPC_VSX ptrace requests
 - Adapt gpr32_get, gpr32_set functions for transaction
 - Enable support for NT_PPC_CGPR
 - Enable support for NT_PPC_CFPR
 - Enable support for NT_PPC_CVMX
 - Enable support for NT_PPC_CVSX
 - Enable support for TM SPR state
 - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
 - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
 - Enable support for EBB registers
 - Enable support for Performance Monitor registers


Aneesh Kumar K.V (13):
  powerpc/mm: Make MMU_FTR_RADIX a MMU family feature
  powerpc/mm: Convert early cpu/mmu feature check to use the new helpers
  powerpc: Call jump_label_init() in apply_feature_fixups()
  powerpc/mm: Catch usage of cpu/mmu_has_feature() before jump label init
  powerpc/mm/radix: Implement tlb mmu gather flush efficiently
  powerpc/mm/hash: Add helper for finding SLBE LLP encoding
  powerpc/mm: Use hugetlb flush functions
  powerpc/mm: Drop multiple definition of mm_is_core_local
  powerpc/mm/radix: Add tlb flush of THP ptes
  powerpc/mm/radix: Rename function and drop unused arg
  powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
  powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
  powerpc/mm: remove flush_tlb_page_nohash

Anshuman Khandual (15):
  elf: Add powerpc specific core note sections
  powerpc/process: Add the function flush_tmregs_to_thread
  powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
  powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
  po

[pasemi] Radeon HD graphics card not recognised after the powerpc-4.8-1 commit

2016-08-05 Thread Christian Zigotzky

Hi Michael,

Xorg works!!! :-)

Next step: make modules :-)

Cheers,

Christian

On 05 August 2016 at 11:13 AM, Christian Zigotzky wrote:

Hi Michael,

Thanks a million for your patch! :-)

@All
Keep your fingers crossed!

1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a


2. patch -p0 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch

3. patch -p0 < nemo_4.8-3.patch

4. yes "" | make oldconfig

5. I activated the following new BTRFS options for Srtest:

CONFIG_BTRFS_FS_POSIX_ACL=y
CONFIG_BTRFS_FS_CHECK_INTEGRITY=y
CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
# CONFIG_BTRFS_DEBUG is not set
CONFIG_BTRFS_ASSERT=y

6. make vmlinux

...

Cheers,

Christian

On 05 August 2016 at 08:46 AM, Michael Ellerman wrote:

Christian Zigotzky  writes:


Hi All,

I figured out that the Git kernel (4.8) successfully detected my Radeon
HD6870 but Xorg can't access it.

The reason is that the BusID has changed between kernels 4.7 and 4.8.

This should fix it?

   https://patchwork.ozlabs.org/patch/656042/

mpe breaks pasemi again :{

cheers








Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Friday, August 5, 2016 6:41:08 PM CEST Nicholas Piggin wrote:
> On Thu, 4 Aug 2016 12:06:41 -0500
> Segher Boessenkool  wrote:
> 
> > On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> > > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > >   
> > > > +   __used  \
> > > > +   __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \
> > > 
> > > 
> > > I've just started testing this, but the first problem I ran into
> > > is that @ and # are special characters that have an architecture
> > > specific meaning to the assembler. On ARM, you need "%note @" instead
> > > of "@note #".  
> > 
> > That comment trick (I still feel guilty about it) causes more problems
> > than it solves.  Please don't try to use it :-)
> 
> Yeah that's a funny hack. I don't think it's required though, but I'm just
> running through some more tests.
> 
> I think I found an improvement with the thin archives as well -- we were
> still building the symbol table after removing the s option (that only avoids
> the index). "S" is required to not build the symbol table.
> 
> I'll send out an RFC on a slightly more polished patch series shortly.


I could not find Nico's patches, but based on the information in his
presentation at

https://www.linuxplumbersconf.org/2015/ocw//system/presentations/3369/original/slides.html#(1)

I created a patch for ARM that mirrors what you have for powerpc, see
below.

I have successfully built normal-sized kernels with this (not tried
running them). Unfortunately, the build time for an "allyesconfig"
kernel explodes: the final link time is now in the hours instead of
minutes (no exact numbers unfortunately, it takes too long to
reproduce), and I also get link errors for the .text.fixup section
for any users of __put_user() in really large kernels:

net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: R_ARM_JUMP24 against `.text.batadv_log_read'
...
drivers/scsi/sg.o:(.text.fixup+0x4): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.text.fixup+0xc): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.text.fixup+0x14): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
...

This originates from

#define __put_user_asm(x, __pu_addr, err, instr)\
__asm__ __volatile__(   \
"1: " TUSER(instr) " %1, [%2], #0\n"\
"2:\n"  \
"   .pushsection .text.fixup,\"ax\"\n"  \
"   .align  2\n"\
"3: mov %0, %3\n"   \
"   b   2b\n"   \
"   .popsection\n"  \
"   .pushsection __ex_table,\"a\"\n"\
"   .align  3\n"\
"   .long   1b, 3b\n"   \
"   .popsection"\
: "+r" (err)\
: "r" (x), "r" (__pu_addr), "i" (-EFAULT)   \
: "cc")
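
The "b 2b" in the fixup has to reach back from .text.fixup into the
per-function section, and with -ffunction-sections those two can end up
far apart in a very large image. A small C sketch of the range check
behind the R_ARM_JUMP24 errors follows; it assumes the usual reach of an
ARM-mode branch (a signed 24-bit word offset), ignores the pipeline
offset, and does not model the Thumb-2 variant, whose range differs:

```c
#include <stdint.h>

/* ARM-mode B/BL: signed 24-bit word offset, i.e. a byte offset in the
 * range -2^25 .. 2^25 - 4, in multiples of 4. */
int fits_arm_jump24(int64_t byte_offset)
{
	return (byte_offset % 4 == 0) &&
	       byte_offset >= -(INT64_C(1) << 25) &&
	       byte_offset <= (INT64_C(1) << 25) - 4;
}
```

Once the distance between the fixup and its function exceeds roughly
32 MiB, the offset no longer fits and the linker reports the truncation.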

Arnd

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 842f46af5b9d..b4fc91603429 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -362,6 +362,8 @@ archclean:
 # My testing targets (bypasses dependencies)
 bp:;   $(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $(boot)/bootpImage
 
+KBUILD_CFLAGS  += -ffunction-sections -fdata-sections
+LDFLAGS_vmlinux    += --gc-sections
 
 define archhelp
   echo  '* zImage- Compressed kernel image (arch/$(ARCH)/boot/zImage)'
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ad325a8c7e1e..f0eca9a96005 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -11,6 +11,9 @@ CFLAGS_REMOVE_insn.o = -pg
 CFLAGS_REMOVE_patch.o = -pg
 endif
 
+ccflags-y  += -fno-function-sections -fno-data-sections
+subdir-ccflags-y   += -fno-function-sections -fno-data-sections
+
 CFLAGS_REMOVE_return_address.o = -pg
 
 # Object file lists.
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 56c8bdf776bd..ef7d8d7a997b 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -12,17 +12,17 @@
 #define PROC_INFO  \
. = ALIGN(4);   \
VMLINUX_SYMBOL(__proc_info_begin) = .;  \
-   *(.proc.info.init)  \
+   KEEP(*(.proc.info.init))\
VMLINUX_SYMBOL(__proc_info_end) = .;
 
 #define IDMAP_TEXT   

Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-05 Thread Mel Gorman
On Fri, Aug 05, 2016 at 07:25:03PM +1000, Michael Ellerman wrote:
> > One way to do that would be to walk through the different memory
> > reserved blocks and calculate the size. But Mel feels that's an
> > overhead (from his reply to the other thread), especially for just one use
> > case.
> 
> OK. I think you're referring to this:
> 
>   If fadump is reserving memory and alloc_large_system_hash(HASH_EARLY)
>   does not know about it, then would an arch-specific callback for
>   arch_reserved_kernel_pages() be more appropriate?
>   ...
>   
>   That approach would limit the impact to ppc64 and would be less costly than
>   doing a memblock walk instead of using nr_kernel_pages for everyone else.
> 
> That sounds more robust to me than this solution.
> 

It would be the fastest with the least impact but not necessarily the
best. Ultimately that dma_reserve/memory_reserve is used for the sizing
calculation of the large system hashes, but only the e820 map and fadump
are taken into account. That's a bit filthy even if it happens to work out ok.

Conceptually it would be cleaner, if expensive, to calculate the real
memblock reserves if HASH_EARLY and ditch the dma_reserve, memory_reserve
and nr_kernel_pages entirely. Unfortunately, aside from the calculation,
there is a potential cost due to a smaller hash table that affects everyone,
not just ppc64. However, if the hash table is meant to be sized on the
number of available pages then it really should be based on that and not
just a made-up number.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH v2] powerpc/32: fix csum_partial_copy_generic()

2016-08-05 Thread Michael Ellerman
Christophe Leroy  writes:

> Le 05/08/2016 à 08:57, Michael Ellerman a écrit :
>> Alessio Igor Bogani  writes:
>>> On 4 August 2016 at 05:53, Scott Wood  wrote:
>>>> On Tue, 2016-08-02 at 10:07 +0200, Christophe Leroy wrote:
>>>>> commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
>>>>> based on copy_tofrom_user()") introduced a bug when destination
>>>>> address is odd and initial csum is not null
>
>
> The purpose of this patch was not to address Alessio's issue, but to fix 
> a huge issue on checksum calculation which induces breakdown of TCP 
> connections.
>
> I think it is worth committing it upstream and to the impacted stable
> releases, although we have not yet identified the issue Alessio has.

OK. I'll put it back into fixes on Monday if I haven't heard anything
further.

cheers


Re: [PATCH 1/7] ima: on soft reboot, restore the measurement list

2016-08-05 Thread Petko Manolov
On 16-08-04 08:24:29, Mimi Zohar wrote:
> The TPM PCRs are only reset on a hard reboot.  In order to validate a
> TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list
> of the running kernel must be saved and restored on boot.  This patch
> restores the measurement list.
> 
> Changelog:
> - call ima_load_kexec_buffer() (Thiago)
> 
> Signed-off-by: Mimi Zohar 
> ---
>  security/integrity/ima/Makefile   |   1 +
>  security/integrity/ima/ima.h  |  10 ++
>  security/integrity/ima/ima_init.c |   2 +
>  security/integrity/ima/ima_kexec.c|  55 +++
>  security/integrity/ima/ima_queue.c|  10 ++
>  security/integrity/ima/ima_template.c | 171 
> ++
>  6 files changed, 249 insertions(+)
>  create mode 100644 security/integrity/ima/ima_kexec.c
> 
> diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
> index c34599f..c0ce7b1 100644
> --- a/security/integrity/ima/Makefile
> +++ b/security/integrity/ima/Makefile
> @@ -8,4 +8,5 @@ obj-$(CONFIG_IMA) += ima.o
>  ima-y := ima_fs.o ima_queue.o ima_init.o ima_main.o ima_crypto.o ima_api.o \
>ima_policy.o ima_template.o ima_template_lib.o ima_buffer.o
>  ima-$(CONFIG_IMA_APPRAISE) += ima_appraise.o
> +ima-$(CONFIG_KEXEC_FILE) += ima_kexec.o
>  obj-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index b5728da..84e8d36 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -102,6 +102,13 @@ struct ima_queue_entry {
>  };
>  extern struct list_head ima_measurements;/* list of all measurements */
>  
> +/* Some details preceding the binary serialized measurement list */
> +struct ima_kexec_hdr {
> + unsigned short version;
> + unsigned long buffer_size;
> + unsigned long count;
> +} __packed;

Unless there is a real need for this structure to be packed, I suggest dropping
the attribute.  When a packed structure is referenced through a pointer, 32-bit
ARM and MIPS (and likely all other 32-bit RISC CPUs) use rather inefficient
byte loads and stores.

Worse, if, for example, ->count is going to be read/written concurrently from
multiple threads, we get torn loads/stores and thus lose atomicity of the access.


Petko


Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-05 Thread Michael Ellerman
Srikar Dronamraju  writes:

> * Michael Ellerman  [2016-08-05 17:07:01]:
>
>> Srikar Dronamraju  writes:
>> 
>> > Fadump kernel reserves large chunks of memory even before the pages are
>> > initialized. This could mean memory that corresponds to several nodes might
>> > fall in memblock reserved regions.
>> >
>> ...
>> > Register the memory reserved by fadump, so that the cache sizes are
>> > calculated based on the free memory (i.e Total memory - reserved
>> > memory).
>> 
>> The memory is reserved, with memblock_reserve(). Why is that not sufficient?
>
> Because at page initialization time, the kernel doesn't know how many
> pages are reserved.

The kernel does know, the fadump code that does the memblock reserve
runs before start_kernel(). AFAIK all calls to alloc_large_system_hash()
are after that.

So the problem seems to be just that alloc_large_system_hash() doesn't
know about reserved memory.

> One way to do that would be to walk through the different memory
> reserved blocks and calculate the size. But Mel feels that's an
> overhead (from his reply to the other thread), especially for just one use
> case.

OK. I think you're referring to this:

  If fadump is reserving memory and alloc_large_system_hash(HASH_EARLY)
  does not know about it, then would an arch-specific callback for
  arch_reserved_kernel_pages() be more appropriate?
  ...
  
  That approach would limit the impact to ppc64 and would be less costly than
  doing a memblock walk instead of using nr_kernel_pages for everyone else.

That sounds more robust to me than this solution.

cheers


[pasemi] Radeon HD graphics card not recognised after the powerpc-4.8-1 commit

2016-08-05 Thread Christian Zigotzky

Hi Michael,

Thanks a million for your patch! :-)

@All
Keep your fingers crossed!

1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a


2. patch -p0 < powerpc-pci-Only-do-fixed-PHB-numbering-on-powernv.patch

3. patch -p0 < nemo_4.8-3.patch

4. yes "" | make oldconfig

5. I activated the following new BTRFS options for Srtest:

CONFIG_BTRFS_FS_POSIX_ACL=y
CONFIG_BTRFS_FS_CHECK_INTEGRITY=y
CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
# CONFIG_BTRFS_DEBUG is not set
CONFIG_BTRFS_ASSERT=y

6. make vmlinux

...

Cheers,

Christian

On 05 August 2016 at 08:46 AM, Michael Ellerman wrote:

Christian Zigotzky  writes:


Hi All,

I figured out that the Git kernel (4.8) successfully detected my Radeon
HD6870 but Xorg can't access it.

The reason is, that the BusID has changed between the kernel 4.7 and 4.8.

This should fix it?

   https://patchwork.ozlabs.org/patch/656042/

mpe breaks pasemi again :{

cheers





[PATCH 09/11] powerpc/mpic: use of_property_read_bool

2016-08-05 Thread Julia Lawall
Use of_property_read_bool to check for the existence of a property.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2;
statement S2,S1;
@@
-   if (of_get_property(e1,e2,NULL))
+   if (of_property_read_bool(e1,e2))
    S1 else S2
// </smpl>

Signed-off-by: Julia Lawall 

---
 arch/powerpc/sysdev/mpic.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 7de45b2..26d9c3f 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1249,7 +1249,7 @@ struct mpic * __init mpic_alloc(struct device_node *node,
/* Pick the physical address from the device tree if unspecified */
if (!phys_addr) {
/* Check if it is DCR-based */
-   if (of_get_property(node, "dcr-reg", NULL)) {
+   if (of_property_read_bool(node, "dcr-reg")) {
flags |= MPIC_USES_DCR;
} else {
struct resource r;



[PATCH 01/11] fsl/qe: use of_property_read_bool

2016-08-05 Thread Julia Lawall
Use of_property_read_bool to check for the existence of a property.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2;
statement S2,S1;
@@
-   if (of_get_property(e1,e2,NULL))
+   if (of_property_read_bool(e1,e2))
    S1 else S2
// </smpl>

Signed-off-by: Julia Lawall 

---
 drivers/soc/fsl/qe/qe_tdm.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qe/qe_tdm.c b/drivers/soc/fsl/qe/qe_tdm.c
index 5e48b14..0ac98e6 100644
--- a/drivers/soc/fsl/qe/qe_tdm.c
+++ b/drivers/soc/fsl/qe/qe_tdm.c
@@ -99,7 +99,7 @@ int ucc_of_parse_tdm(struct device_node *np, struct ucc_tdm 
*utdm,
utdm->tdm_port = val;
ut_info->uf_info.tdm_num = utdm->tdm_port;
 
-   if (of_get_property(np, "fsl,tdm-internal-loopback", NULL))
+   if (of_property_read_bool(np, "fsl,tdm-internal-loopback"))
utdm->tdm_mode = TDM_INTERNAL_LOOPBACK;
else
utdm->tdm_mode = TDM_NORMAL;



[PATCH 00/11] use of_property_read_bool

2016-08-05 Thread Julia Lawall
Use of_property_read_bool to check for the existence of a property.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2;
statement S2,S1;
@@
-   if (of_get_property(e1,e2,NULL))
+   if (of_property_read_bool(e1,e2))
    S1 else S2
// </smpl>

---

 arch/powerpc/sysdev/mpic.c  |2 +-
 drivers/i2c/busses/i2c-mpc.c|2 +-
 drivers/mmc/host/sdhci-of-esdhc.c   |2 +-
 drivers/net/ethernet/freescale/xgmac_mdio.c |3 +--
 drivers/phy/phy-qcom-ufs.c  |2 +-
 drivers/pinctrl/nomadik/pinctrl-nomadik.c   |2 +-
 drivers/soc/fsl/qe/qe_tdm.c |2 +-
 drivers/soc/ti/knav_qmss_queue.c|2 +-
 drivers/tty/serial/atmel_serial.c   |8 
 drivers/usb/host/fsl-mph-dr-of.c|6 +++---
 sound/soc/codecs/ab8500-codec.c |   10 +-
 sound/soc/sh/rcar/ssi.c |2 +-
 sound/soc/soc-core.c|2 +-
 13 files changed, 22 insertions(+), 23 deletions(-)


Re: [PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve

2016-08-05 Thread Vlastimil Babka

On 08/05/2016 09:24 AM, Srikar Dronamraju wrote:

* Vlastimil Babka  [2016-08-05 08:45:03]:


@@ -5493,10 +5493,10 @@ static void __paginginit free_area_init_core(struct 
pglist_data *pgdat)
}

/* Account for reserved pages */
-   if (j == 0 && freesize > dma_reserve) {
-   freesize -= dma_reserve;
+   if (j == 0 && freesize > nr_memory_reserve) {


Will this really work (together with patch 2) as intended?
This j == 0 means that we are doing this only for the first zone, which is
ZONE_DMA (or ZONE_DMA32) on node 0 on many systems. I.e. I don't think it's
really true that "dma_reserve has nothing to do with DMA or ZONE_DMA".

This zone will have a limited amount of memory, so the "freesize >
nr_memory_reserve" will easily be false once you set this to many gigabytes,
so in fact nothing will get subtracted.

On the other hand if the kernel has both CONFIG_ZONE_DMA and
CONFIG_ZONE_DMA32 disabled, then j == 0 will be true for ZONE_NORMAL. This
zone might be present on multiple nodes (unless they are configured as
movable) and then the value intended to be global will be subtracted from
several nodes.

I don't know what the exact ppc64 situation is here, perhaps there are indeed
no DMA/DMA32 zones, and the fadump kernel only uses one node, so it works in
the end, but it doesn't seem very robust to me?



At the page initialization time, powerpc seems to have just one zone
spread across the 16 nodes.

From the dmesg.

[0.00] Memory hole size: 0MB
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x-0x1f5c8fff]
[0.00]   DMA32empty
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x01fb4fff]
[0.00]   node   1: [mem 0x01fb5000-0x03fa8fff]
[0.00]   node   2: [mem 0x03fa9000-0x05f9cfff]
[0.00]   node   3: [mem 0x05f9d000-0x07f8efff]
[0.00]   node   4: [mem 0x07f8f000-0x09f81fff]
[0.00]   node   5: [mem 0x09f82000-0x0bf77fff]
[0.00]   node   6: [mem 0x0bf78000-0x0df6dfff]
[0.00]   node   7: [mem 0x0df6e000-0x0ff63fff]
[0.00]   node   8: [mem 0x0ff64000-0x11f58fff]
[0.00]   node   9: [mem 0x11f59000-0x13644fff]
[0.00]   node  10: [mem 0x13645000-0x1563afff]
[0.00]   node  11: [mem 0x1563b000-0x17630fff]
[0.00]   node  12: [mem 0x17631000-0x19625fff]
[0.00]   node  13: [mem 0x19626000-0x1b5dcfff]
[0.00]   node  14: [mem 0x1b5dd000-0x1d5d2fff]
[0.00]   node  15: [mem 0x1d5d3000-0x1f5c8fff]


Hmm so it will work for ppc64 and its fadump, but I'm not happy that we 
made the function name sound like it's generic (unlike when the name 
contained "dma"), while it only works as intended in specific corner 
cases. The next user might be surprised...




Re: [v5.1] ucc_fast: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.

2016-08-05 Thread arvind Yadav



On Friday 05 August 2016 02:01 AM, Arnd Bergmann wrote:

On Thursday, August 4, 2016 10:22:43 PM CEST Arvind Yadav wrote:

index df8ea79..ada9070 100644
--- a/include/soc/fsl/qe/ucc_fast.h
+++ b/include/soc/fsl/qe/ucc_fast.h
@@ -165,10 +165,12 @@ struct ucc_fast_private {
 int stopped_tx; /* Whether channel has been stopped for Tx
(STOP_TX, etc.) */
 int stopped_rx; /* Whether channel has been stopped for Rx */
-   u32 ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base of Tx
-   virtual fifo */
-   u32 ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base of Rx
-   virtual fifo */
+   unsigned long ucc_fast_tx_virtual_fifo_base_offset;/* pointer to base of
+   * Tx virtual fifo
+   */
+   unsigned long ucc_fast_rx_virtual_fifo_base_offset;/* pointer to base of
+   * Rx virtual fifo
+   */
  #ifdef STATISTICS
 u32 tx_frames;  /* Transmitted frames counter. */
 u32 rx_frames;  /* Received frames counter (only frames


This change seems ok, but what about the other u32 variables in ucc_geth.c
that get checked for IS_ERR_VALUE?

Arnd
I have sent a separate patch for ucc_geth and ucc_slow.
-Arvind




Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Thu, 4 Aug 2016 12:06:41 -0500
Segher Boessenkool  wrote:

> On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> >   
> > > + __used  \
> > > + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \ 
> > >  
> > 
> > 
> > I've just started testing this, but the first problem I ran into
> > is that @ and # are special characters that have an architecture
> > specific meaning to the assembler. On ARM, you need "%note @" instead
> > of "@note #".  
> 
> That comment trick (I still feel guilty about it) causes more problems
> than it solves.  Please don't try to use it :-)

Yeah that's a funny hack. I don't think it's required though, but I'm just
running through some more tests.

I think I found an improvement with the thin archives as well -- we were
still building the symbol table after removing the s option (that only avoids
the index). "S" is required to not build the symbol table.

I'll send out an RFC on a slightly more polished patch series shortly.

Thanks,
Nick


Re: Suspected regression?

2016-08-05 Thread Christophe Leroy



Le 19/07/2016 à 23:52, Scott Wood a écrit :

On Tue, 2016-07-19 at 12:00 +0200, Alessio Igor Bogani wrote:

Hi all,

I have got two boards MVME5100 (MPC7410 cpu) and MVME7100 (MPC8641D
cpu) for which I use the same cross-compiler (ppc7400).

I tested these against kernel HEAD and found that they no longer boot
(PID 1 crash).

Bisecting results in first offending commit:
7aef4136566b0539a1a98391181e188905e33401

Removing it from HEAD makes the boards boot properly again.

A third system based on P2010 isn't affected at all.

Is it a regression, or have I done something wrong?


I booted both my next branch, and Linus's master on MPC8641HPCN and didn't see
this -- though possibly your RFS is doing something different.  Maybe that's
the difference with P2010 as well.

Is there any way you can debug the cause of the crash?  Or send me a minimal
RFS that demonstrates the problem (ideally with debug symbols on the userspace
binaries)?



I got from Alessio the below information:

systemd[1]: Caught <BUS>, core dump failed (child 137, code=killed,
status=7/BUS).
systemd[1]: Freezing execution.


What can generate SIGBUS ?
And shouldn't we also get some KERN_ERR trace, something like "unhandled 
signal 7 at ." ?


Christophe


Re: [PATCH] cpufreq: powernv: Fix crash in gpstate_timer_handler

2016-08-05 Thread Andrew Donnellan

On 05/08/16 01:29, Akshay Adiga wrote:

'commit 09ca4c9b5958 ("cpufreq: powernv: Replacing pstate_id with
frequency table index")' changes calc_global_pstate() to use
cpufreq_table index instead of pstate_id.

But in gpstate_timer_handler() pstate_id was being passed instead
of cpufreq_table index, which caused the index_to_pstate() to access
out of bound indices, leading to this crash.

Adding sanity check for index and pstate, to ensure only valid pstate
and index values are returned.

Call Trace:
[c0078d66b130] [c011d224] __free_irq+0x234/0x360
(unreliable)
[c0078d66b1c0] [c011d44c] free_irq+0x6c/0xa0
[c0078d66b1f0] [c006c4f8] opal_event_shutdown+0x88/0xd0
[c0078d66b230] [c0067a4c] opal_shutdown+0x1c/0x90
[c0078d66b260] [c0063a00] pnv_shutdown+0x20/0x40
[c0078d66b280] [c0021538] machine_restart+0x38/0x90
[c78d66b310] [c0965ea0] panic+0x284/0x300
[c0078d66b3a0] [c001f508] die+0x388/0x450
[c0078d66b430] [c0045a50] bad_page_fault+0xd0/0x140
[c0078d66b4a0] [c0008964] handle_page_fault+0x2c/0x30
   interrupt: 300 at gpstate_timer_handler+0x150/0x260
LR = gpstate_timer_handler+0x130/0x260
[c0078d66b7f0] [c0132b58] call_timer_fn+0x58/0x1c0
[c0078d66b880] [c0132e20] expire_timers+0x130/0x1d0
[c0078d66b8f0] [c0133068] run_timer_softirq+0x1a8/0x230
[c0078d66b980] [c00b535c] __do_softirq+0x18c/0x400
[c0078d66ba70] [c00b5828] irq_exit+0xc8/0x100
[c0078d66ba90] [c001e214] timer_interrupt+0xa4/0xe0
[c0078d66bac0] [c00027d0] decrementer_common+0x150/0x180
   interrupt: 901 at arch_local_irq_restore+0x74/0x90
  0] [c0106b34] call_cpuidle+0x44/0x90
[c0078d66be50] [c010708c] cpu_startup_entry+0x38c/0x460
[c0078d66bf20] [c003d930] start_secondary+0x330/0x380
[c0078d66bf90] [c0008e6c] start_secondary_prolog+0x10/0x14

Fixes: 08d27eb ("cpufreq: powernv: Replacing pstate_id with
frequency table index")
Reported-by: Madhavan Srinivasan 
Signed-off-by: Akshay Adiga 


Tested-by: Andrew Donnellan 

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



Re: [PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve

2016-08-05 Thread Srikar Dronamraju
* Mel Gorman  [2016-08-05 07:47:47]:

> On Thu, Aug 04, 2016 at 10:42:08PM +0530, Srikar Dronamraju wrote:
> > Expand the scope of the existing dma_reserve to accommodate other memory
> > reserves too. Accordingly rename variable dma_reserve to
> > nr_memory_reserve.
> > 
> > set_memory_reserve also takes a new parameter that helps to identify if
> > the current value needs to be incremented.
> > 
> 
> I think the parameter is ugly and it should have been just
> inc_memory_reserve but at least it works.
> 

Yes, while the parameter is definitely ugly, the only other use
case, in arch/x86/kernel/e820.c, seems to be written with the intention of
setting an absolute value.

It was "set_dma_reserve(nr_pages - nr_free_pages)". Both nr_pages and
nr_free_pages are calculated after walking through the mem
blocks. I didn't want to risk some other code path also starting to
set the reserve value and the code in e820.c then just incrementing
it.

However, if you still feel strongly about using inc_memory_reserve rather than
set_memory_reserve, I will respin.


-- 
Thanks and Regards
Srikar Dronamraju



Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-05 Thread Srikar Dronamraju
* Michael Ellerman  [2016-08-05 17:07:01]:

> Srikar Dronamraju  writes:
> 
> > Fadump kernel reserves large chunks of memory even before the pages are
> > initialized. This could mean memory that corresponds to several nodes might
> > fall in memblock reserved regions.
> >
> ...
> > Register the memory reserved by fadump, so that the cache sizes are
> > calculated based on the free memory (i.e Total memory - reserved
> > memory).
> 
> The memory is reserved, with memblock_reserve(). Why is that not sufficient?
> 
> cheers
> 

Because at page initialization time, the kernel doesn't know how many
pages are reserved. One way to do that would be to walk through the
different memory reserved blocks and calculate the size. But Mel feels
that's an overhead (from his reply to the other thread), especially for
just one use case.

-- 
Thanks and Regards
Srikar Dronamraju



Re: [PATCH v2] powerpc/32: fix csum_partial_copy_generic()

2016-08-05 Thread Christophe Leroy



Le 05/08/2016 à 08:57, Michael Ellerman a écrit :

Alessio Igor Bogani  writes:

On 4 August 2016 at 05:53, Scott Wood  wrote:

On Tue, 2016-08-02 at 10:07 +0200, Christophe Leroy wrote:

commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
based on copy_tofrom_user()") introduced a bug when destination
address is odd and initial csum is not null

In that (rare) case the initial csum value has to be rotated one byte,
just as the resulting value is

This patch also fixes related comments

Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
based on copy_tofrom_user()")
Cc: sta...@vger.kernel.org

Signed-off-by: Christophe Leroy 
---
  v2: updated comments as suggested by Segher

  arch/powerpc/lib/checksum_32.S | 7 ++++---
  1 file changed, 4 insertions(+), 3 deletions(-)

Alessio, can you confirm whether this fixes the problem you reported?

No unfortunately.

Thanks for testing.

I've dropped the patch for now, send me a new one that works.




The purpose of this patch was not to address Alessio's issue, but to fix 
a huge issue on checksum calculation which induces breakdown of TCP 
connections.


I think it is worth committing it upstream and to the impacted stable
releases, although we have not yet identified the issue Alessio has.


Christophe


Re: [PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve

2016-08-05 Thread Srikar Dronamraju
* Vlastimil Babka  [2016-08-05 08:45:03]:

> >@@ -5493,10 +5493,10 @@ static void __paginginit free_area_init_core(struct 
> >pglist_data *pgdat)
> > }
> >
> > /* Account for reserved pages */
> >-if (j == 0 && freesize > dma_reserve) {
> >-freesize -= dma_reserve;
> >+if (j == 0 && freesize > nr_memory_reserve) {
> 
> Will this really work (together with patch 2) as intended?
> This j == 0 means that we are doing this only for the first zone, which is
> ZONE_DMA (or ZONE_DMA32) on node 0 on many systems. I.e. I don't think it's
> really true that "dma_reserve has nothing to do with DMA or ZONE_DMA".
> 
> This zone will have a limited amount of memory, so the "freesize >
> nr_memory_reserve" will easily be false once you set this to many gigabytes,
> so in fact nothing will get subtracted.
> 
> On the other hand if the kernel has both CONFIG_ZONE_DMA and
> CONFIG_ZONE_DMA32 disabled, then j == 0 will be true for ZONE_NORMAL. This
> zone might be present on multiple nodes (unless they are configured as
> movable) and then the value intended to be global will be subtracted from
> several nodes.
> 
> I don't know what the exact ppc64 situation is here, perhaps there are indeed
> no DMA/DMA32 zones, and the fadump kernel only uses one node, so it works in
> the end, but it doesn't seem very robust to me?
> 

At the page initialization time, powerpc seems to have just one zone
spread across the 16 nodes.

From the dmesg.

[0.00] Memory hole size: 0MB
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x-0x1f5c8fff]
[0.00]   DMA32empty
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x01fb4fff]
[0.00]   node   1: [mem 0x01fb5000-0x03fa8fff]
[0.00]   node   2: [mem 0x03fa9000-0x05f9cfff]
[0.00]   node   3: [mem 0x05f9d000-0x07f8efff]
[0.00]   node   4: [mem 0x07f8f000-0x09f81fff]
[0.00]   node   5: [mem 0x09f82000-0x0bf77fff]
[0.00]   node   6: [mem 0x0bf78000-0x0df6dfff]
[0.00]   node   7: [mem 0x0df6e000-0x0ff63fff]
[0.00]   node   8: [mem 0x0ff64000-0x11f58fff]
[0.00]   node   9: [mem 0x11f59000-0x13644fff]
[0.00]   node  10: [mem 0x13645000-0x1563afff]
[0.00]   node  11: [mem 0x1563b000-0x17630fff]
[0.00]   node  12: [mem 0x17631000-0x19625fff]
[0.00]   node  13: [mem 0x19626000-0x1b5dcfff]
[0.00]   node  14: [mem 0x1b5dd000-0x1d5d2fff]
[0.00]   node  15: [mem 0x1d5d3000-0x1f5c8fff]


The config has the below.

CONFIG_ZONE_DMA32=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_ZONE_DMA=y

I tried forcing CONFIG_ZONE_DMA to be not set, but make always picks it.
The source, arch/powerpc/Kconfig, marks CONFIG_ZONE_DMA as "default y".

-- 
Thanks and Regards
Srikar Dronamraju



Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread Johannes Thumshirn
On Fri, Aug 05, 2016 at 09:01:26AM +0200, Borislav Petkov wrote:
> On Fri, Aug 05, 2016 at 04:26:26AM +, york sun wrote:
> > I don't have deep knowledge of this driver. What I am trying to do is
> > separate the common DDR part and share it with ARM platforms. Along the
> > way, I found the compile error when building it as a module. If exposing
> > these functions becomes a concern, I can live without it.
> 
> Perhaps you or Johannes could fix this properly to use pci_get_device()
> as the rest of the EDAC drivers do, instead of exporting core PCI
> functions...

I can give it a shot, but I don't have too much spare time atm and no hardware
to test, so it'll have a strong RFC smell attached to it.

Byte,
Johannes

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


RE: [PATCH] cxl: Use fixed width predefined types in data structure.

2016-08-05 Thread Michael Ellerman
David Laight  writes:

> From: Philippe Bergheaud
>> Sent: 04 August 2016 14:56
>> This patch fixes a regression introduced by commit b810253.
>> It substitutes the type __u8 for u8 in the uapi header cxl.h,
>> because the latter is not always defined in userland build
>> environments, in particular when cross-compiling libcxl on
>> x86_64 linux machines (RHEL6.7 and Ubuntu 16.04).
>> 
>> It also makes the definition of cxl_event_afu_driver_reserved
>> more consistent with the other definitions in the header file.
> ...
>> diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
>> index cbae529..180d526 100644
>> --- a/include/uapi/misc/cxl.h
>> +++ b/include/uapi/misc/cxl.h
>> @@ -136,8 +136,8 @@ struct cxl_event_afu_driver_reserved {
>>   *
>>   * Of course the contents will be ABI, but that's up the AFU driver.
>>   */
>> -size_t data_size;
>> -u8 data[];
>> +__u32 data_size;
>> +__u8 data[];
>>  };
>
> You've just changed 'data_size' from 64bit to 32bit (on 64bit systems).
> This isn't mentioned in the commit message and changes the API.

Yeah that's pretty fishy.

In practice I suspect there aren't thousands of users of that API, the
commit is only a month old, so we can probably still change it. But
please call it out in the change log.

cheers


Re: [PATCH] fadump: Register the memory reserved by fadump

2016-08-05 Thread Michael Ellerman
Srikar Dronamraju  writes:

> Fadump kernel reserves large chunks of memory even before the pages are
> initialized. This could mean memory that corresponds to several nodes might
> fall in memblock reserved regions.
>
...
> Register the memory reserved by fadump, so that the cache sizes are
> calculated based on the free memory (i.e Total memory - reserved
> memory).

The memory is reserved, with memblock_reserve(). Why is that not sufficient?

cheers


Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread Borislav Petkov
On Fri, Aug 05, 2016 at 04:26:26AM +, york sun wrote:
> I don't have deep knowledge of this driver. What I am trying to do is
> separate the common DDR part and share it with ARM platforms. Along the
> way, I found the compile error when building it as a module. If exposing
> these functions becomes a concern, I can live without it.

Perhaps you or Johannes could fix this properly to use pci_get_device()
as the rest of the EDAC drivers do, instead of exporting core PCI
functions...

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--


Re: [PATCH kernel 05/15] powerpc/iommu: Stop using @current in mm_iommu_xxx

2016-08-05 Thread Michael Ellerman
Alexey Kardashevskiy  writes:

> In some situations the userspace memory context may live longer than
> the userspace process itself so if we need to do proper memory context
> cleanup, we better cache @mm and use it later when the process is gone
> (@current or @current->mm are NULL).
>
> This changes mm_iommu_xxx API to receive mm_struct instead of using one
> from @current.
>
> This is needed by the following patch to do proper cleanup in time.
> This depends on "powerpc/powernv/ioda: Fix endianness when reading TCEs"
> to do proper cleanup via tce_iommu_clear() patch.
>
> To keep API consistent, this replaces mm_context_t with mm_struct;
> we stick to mm_struct as mm_iommu_adjust_locked_vm() helper needs
> access to &mm->mmap_sem.
>
> This should cause no behavioral change.

Is this a theoretical bug, or do we hit it in practice?

In other words, should I merge this as a fix for 4.8, or can it wait for
4.9 with the rest of the series?

> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/include/asm/mmu_context.h | 20 +++--
>  arch/powerpc/kernel/setup-common.c |  2 +-
>  arch/powerpc/mm/mmu_context_book3s64.c |  4 +--
>  arch/powerpc/mm/mmu_context_iommu.c| 54 
> ++

>  drivers/vfio/vfio_iommu_spapr_tce.c| 41 --

I'd need an ACK from Alex for that part.

cheers


Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac

2016-08-05 Thread Borislav Petkov
On Thu, Aug 04, 2016 at 11:39:14PM +, york sun wrote:
> I will rename it if I respin this patch for any reason. Otherwise, I 
> will send out another patch to rename it after merging.

Feel free to send an updated one as a reply to this thread.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--