Re: [PATCH] powerpc/vdso: Fix incorrect CFI in gettimeofday.S

2022-05-18 Thread Alan Modra
On Tue, May 17, 2022 at 10:32:09PM +1000, Michael Ellerman wrote:
> "Naveen N. Rao"  writes:
> > Michael Ellerman wrote:
> >>
> >> diff --git a/arch/powerpc/kernel/vdso/gettimeofday.S b/arch/powerpc/kernel/vdso/gettimeofday.S
> >> index eb9c81e1c218..0aee255e9cbb 100644
> >> --- a/arch/powerpc/kernel/vdso/gettimeofday.S
> >> +++ b/arch/powerpc/kernel/vdso/gettimeofday.S
> >> @@ -22,12 +22,15 @@
> >>  .macro cvdso_call funct call_time=0
> >>    .cfi_startproc
> >>    PPC_STLU r1, -PPC_MIN_STKFRM(r1)
> >> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
> >>    mflr r0
> >> -  .cfi_register lr, r0
> >>    PPC_STLU r1, -PPC_MIN_STKFRM(r1)
> >> +  .cfi_adjust_cfa_offset PPC_MIN_STKFRM
> >>PPC_STL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
> >
> > 
> >
> >> @@ -46,6 +50,7 @@
> >>    mtlr r0
> >>    .cfi_restore lr
> >>    addi r1, r1, 2 * PPC_MIN_STKFRM
> >> +  .cfi_def_cfa_offset 0
> >
> > Should this be .cfi_adjust_cfa_offset, given that we used that at the
> > start of the function?
>  
> AIUI "adjust x" is offset += x, whereas "def x" is offset = x.

Yes.

> So we could use adjust here, but we'd need to adjust by -(2 * PPC_MIN_STKFRM).

Yes.

> It seemed clearer to just set the offset back to 0, which is what it is
> at the start of the function.

Yes.  In detail, both .cfi_def_cfa_offset and .cfi_adjust_cfa_offset
are interpreted by the assembler into DW_CFA_def_cfa_offset byte
codes, so you should get the same .eh_frame contents if using Naveen's
suggestion.  It boils down to style really, and the most common style
is to use ".cfi_def_cfa_offset 0" here.
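
As a rough illustration of why the two directives are interchangeable
here, the C sketch below models the running CFA offset.  The struct and
function names are invented for the example and the frame size is an
arbitrary stand-in for PPC_MIN_STKFRM; this is a toy model, not how gas
is implemented.

```c
/* Toy model of the current CFA rule: CFA = r1 + offset. */
struct cfa_state {
	long offset;
};

/* .cfi_def_cfa_offset x: offset = x */
static void cfi_def_cfa_offset(struct cfa_state *s, long x)
{
	s->offset = x;
}

/* .cfi_adjust_cfa_offset x: offset += x */
static void cfi_adjust_cfa_offset(struct cfa_state *s, long x)
{
	s->offset += x;
}

/*
 * CFA offset after the two-frame prologue and the epilogue, using either
 * style of directive at the final addi (hypothetical helper).
 */
static long epilogue_offset(long frame, int use_adjust)
{
	struct cfa_state s = { 0 };

	cfi_adjust_cfa_offset(&s, frame);	/* first stack frame */
	cfi_adjust_cfa_offset(&s, frame);	/* second stack frame */
	if (use_adjust)
		cfi_adjust_cfa_offset(&s, -(2 * frame));
	else
		cfi_def_cfa_offset(&s, 0);
	return s.offset;
}
```

Either way the offset ends up back at 0, which is all the unwinder sees.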

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc/vdso: Fix incorrect CFI in gettimeofday.S

2022-05-02 Thread Alan Modra
On Mon, May 02, 2022 at 09:27:05AM -0500, Segher Boessenkool wrote:
> >   2) If a function changes LR or any non-volatile register, the save
> >  location for those regs must be given. The cfi can be at any
> >  instruction after the saves up to the point that the reg is
> >  changed. (Exception: LR save should be described before a bl.)
> 
> That isn't an exception?  bl changes the current LR after all :-)

The point is that in other cases the cfi can be as late as the
instruction that changes the reg.  For calls it must be at least one
instruction before the call.

Also, I'll note for the wider audience that delaying cfi is slightly
better than playing it safe as Michael has done in his patch in
describing the saves right at the save instruction.  Register save cfi
can usually be grouped together, resulting in fewer CFI_advance codes
in .eh_frame.

> Alan proposed a larger patch that changed to a single stack frame, but it
> needs changes to take into account the red zone.

Yes, now that you mention it, I see the obvious error in the patch I
wrote.  I did say it was untested!


-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 future proof kernel toc, revised for lld

2021-03-10 Thread Alan Modra
On Wed, Mar 10, 2021 at 01:44:57PM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/03/2021 à 13:25, Alan Modra a écrit :
> > On Wed, Mar 10, 2021 at 08:33:37PM +1100, Alexey Kardashevskiy wrote:
> > > One more question - the older version had a construct "DEFINED (.TOC.) ?
> > > > .TOC. : ..." in case .TOC. is not defined (too old ld? too old gcc?) but the
> > > > newer patch seems to assume it is always defined; when was it added? I have
> > > the same check in SLOF, for example, do I still need it?
> > 
> > .TOC. symbol support was first added 2012-11-06, so you need
> > binutils-2.24 or later to use .TOC. as a symbol.
> > 
> 
> As of today, minimum requirement to build kernel is binutils 2.23, see 
> https://www.kernel.org/doc/html/latest/process/changes.html#current-minimal-requirements

Yes, and arch/powerpc/Makefile complains about 2.24.  So for powerpc
that means you need to go to at least 2.25.  Oh the horror of needing
such new tools!

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 future proof kernel toc, revised for lld

2021-03-10 Thread Alan Modra
On Wed, Mar 10, 2021 at 08:33:37PM +1100, Alexey Kardashevskiy wrote:
> One more question - the older version had a construct "DEFINED (.TOC.) ?
> .TOC. : ..." in case .TOC. is not defined (too old ld? too old gcc?) but the
> newer patch seems to assume it is always defined; when was it added? I have
> the same check in SLOF, for example, do I still need it?

.TOC. symbol support was first added 2012-11-06, so you need
binutils-2.24 or later to use .TOC. as a symbol.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 future proof kernel toc, revised for lld

2021-03-09 Thread Alan Modra
On Wed, Mar 10, 2021 at 03:44:44PM +1100, Alexey Kardashevskiy wrote:
> For my own education, is .got for prom_init.o still generated by ld or gcc?

.got is generated by ld.

> In other words, should "objdump -D -s -j .got" ever dump .got for any .o
> file, like below?

No.  "objdump -r prom_init.o | grep GOT" will tell you whether
prom_init.o *may* cause ld to generate .got entries.  (Linker
optimisations or --gc-sections might remove the need for those .got
entries.)

> objdump: section '.got' mentioned in a -j option, but not found in any input
> file

Right, expected.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC64 future proof kernel toc, revised for lld

2021-03-09 Thread Alan Modra
This patch future-proofs the kernel against linker changes that might
put the toc pointer at some location other than .got+0x8000, by
replacing __toc_start+0x8000 with .TOC. throughout.  If the kernel's
idea of the toc pointer doesn't agree with the linker, bad things
happen.
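
For readers unfamiliar with the 0x8000 bias, here is a small C sketch
of the arithmetic the kernel has been assuming.  The function names
and the TOC_BIAS macro are made up for illustration; the only facts
relied on are that the toc pointer has traditionally been .got+0x8000
and that a single load or store off r2 takes a signed 16-bit
displacement.

```c
#include <stdint.h>

#define TOC_BIAS 0x8000UL	/* traditional placement: .TOC. = .got + 0x8000 */

/* Old kernel assumption: toc pointer is the start of .got plus the bias. */
static uint64_t toc_pointer(uint64_t got_start)
{
	return got_start + TOC_BIAS;
}

/* Address reached from r2 with a signed 16-bit displacement. */
static uint64_t toc_entry_addr(uint64_t r2, int16_t disp)
{
	return r2 + (int64_t)disp;
}
```

With the bias, displacements from -32768 to 32767 cover the first 64kB
of .got.  Using .TOC. instead of hard-coding the bias keeps this
working if a linker ever chooses a different value.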

prom_init.c code relocating its toc is also changed so that a symbolic
__prom_init_toc_start toc-pointer relative address is calculated
rather than assuming that it is always at toc-pointer - 0x8000.  The
length calculations loading values from the toc are also avoided.
It's a little incestuous to do that with unreloc_toc picking up
adjusted values (which is fine in practice, they both adjust by the
same amount if all goes well).
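
The range-based relocation can be sketched in plain C as below.  This
is only a model: reloc_range is a hypothetical name, and the real code
obtains the start and end pointers from toc-relative asm rather than
taking them as arguments.

```c
/* Add offset to every entry in [toc_start, toc_end). */
static void reloc_range(unsigned long *toc_start, unsigned long *toc_end,
			unsigned long offset)
{
	unsigned long *toc_entry;

	for (toc_entry = toc_start; toc_entry != toc_end; toc_entry++)
		*toc_entry += offset;
}
```

Calling it once with offset and once with -offset restores the
original values, which is what reloc_toc and unreloc_toc depend on.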

I've also changed the way .got is aligned in vmlinux.lds and
zImage.lds, mostly so that dumping out section info by objdump or
readelf plainly shows the alignment is 256.  This linker script
feature was added 2005-09-27, available in FSF binutils releases from
2.17 onwards.  Should be safe to use in the kernel, I think.

Finally, put *(.got) before the prom_init.o entry which only needs
*(.toc), so that the GOT header goes in the correct place.  I don't
believe this makes any difference for the kernel as it would for
dynamic objects being loaded by ld.so.  That change is just to stop
lusers who blindly copy kernel scripts being led astray.  Of course,
this change needs the prom_init.c changes.

Some notes on .toc and .got.

.toc is a compiler generated section of addresses.  .got is a linker
generated section of addresses, generally built when the linker sees
R_*_*GOT* relocations.  In the case of powerpc64 ld.bfd, there are
multiple generated .got sections, one per input object file.  So you
can somewhat reasonably write in a linker script an input section
statement like *prom_init.o(.got .toc) to mean "the .got and .toc
section for files matching *prom_init.o".  On other architectures that
doesn't make sense, because the linker generally has just one .got
section.  Even on powerpc64, note well that the GOT entries for
prom_init.o may be merged with GOT entries from other objects.  That
means that if prom_init.o references, say, _end via some GOT
relocation, and some other object also references _end via a GOT
relocation, the GOT entry for _end may be in the range
__prom_init_toc_start to __prom_init_toc_end and if the kernel does
something special to GOT/TOC entries in that range then the value of
_end as seen by objects other than prom_init.o will be affected.  On
the other hand the GOT entry for _end may not be in the range
__prom_init_toc_start to __prom_init_toc_end.  Which way it turns out
is deterministic but a detail of linker operation that should not be
relied on.

A feature of ld.bfd is that input .toc (and .got) sections matching
one linker input section statement may be sorted, to put entries used
by small-model code first, near the toc base.  This is why scripts for
powerpc64 normally use *(.got .toc) rather than *(.got) *(.toc), since
the first form allows more freedom to sort.

Another feature of ld.bfd is that indirect addressing sequences using
the GOT/TOC may be edited by the linker to relative addressing.  In
many cases relative addressing would be emitted by gcc for
-mcmodel=medium if you appropriately decorate variable declarations
with non-default visibility.

Signed-off-by: Alan Modra 

diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S
index 1d83966f5ef6..e45907fe468f 100644
--- a/arch/powerpc/boot/crt0.S
+++ b/arch/powerpc/boot/crt0.S
@@ -28,7 +28,7 @@ p_etext:  .8byte  _etext
 p_bss_start:   .8byte  __bss_start
 p_end: .8byte  _end
 
-p_toc: .8byte  __toc_start + 0x8000 - p_base
+p_toc: .8byte  .TOC. - p_base
 p_dyn: .8byte  __dynamic_start - p_base
 p_rela: .8byte  __rela_dyn_start - p_base
 p_prom: .8byte  0
diff --git a/arch/powerpc/boot/zImage.lds.S b/arch/powerpc/boot/zImage.lds.S
index d6f072865627..d65cd55a6f38 100644
--- a/arch/powerpc/boot/zImage.lds.S
+++ b/arch/powerpc/boot/zImage.lds.S
@@ -36,12 +36,9 @@ SECTIONS
   }
 
 #ifdef CONFIG_PPC64_BOOT_WRAPPER
-  . = ALIGN(256);
-  .got :
+  .got : ALIGN(256)
   {
-__toc_start = .;
-*(.got)
-*(.toc)
+*(.got .toc)
   }
 #endif
 
diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 324d7b298ec3..e5a1eae11ed5 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -48,14 +48,18 @@ static inline int in_kernel_text(unsigned long addr)
 
 static inline unsigned long kernel_toc_addr(void)
 {
-   /* Defined by the linker, see vmlinux.lds.S */
-   extern unsigned long __toc_start;
-
-   /*
-* The TOC register (r2) points 32kB into the TOC, so that 64kB of
-* the TOC can be addressed using a single machine instruction.
-*/
-   return (unsigned long)(&__toc_start) + 0x8000UL;
+#if 0
+   /* This version 

PowerPC64 future proof kernel toc, revised

2021-03-08 Thread Alan Modra
 * The TOC register (r2) points 32kB into the TOC, so that 64kB of
 * the TOC can be addressed using a single machine instruction.
 */
-   return (unsigned long)(&__toc_start) + 0x8000UL;
+   return (unsigned long)&__toc_ptr;
 }
 
 static inline int overlaps_interrupt_vector_text(unsigned long start,
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index ece7f97bafff..1cae5b0943be 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -899,7 +899,7 @@ _GLOBAL(relative_toc)
        blr
 
 .balign 8
-p_toc: .8byte  __toc_start + 0x8000 - 0b
+p_toc: .8byte  __toc_ptr - 0b
 
 /*
  * This is where the main kernel code starts.
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index ccf77b985c8f..d309a7787652 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -3220,27 +3220,26 @@ static void unreloc_toc(void)
 {
 }
 #else
-static void __reloc_toc(unsigned long offset, unsigned long nr_entries)
+static void __reloc_toc(unsigned long offset)
 {
-       unsigned long i;
        unsigned long *toc_entry;
+       unsigned long *toc_start, *toc_end;
 
-       /* Get the start of the TOC by using r2 directly. */
-       asm volatile("addi %0,2,-0x8000" : "=b" (toc_entry));
+       asm("addis %0,2,__prom_init_toc_start@toc@ha\n\t"
+           "addi %0,%0,__prom_init_toc_start@toc@l" : "=b" (toc_start));
+       asm("addis %0,2,__prom_init_toc_end@toc@ha\n\t"
+           "addi %0,%0,__prom_init_toc_end@toc@l" : "=b" (toc_end));
 
-       for (i = 0; i < nr_entries; i++) {
-               *toc_entry = *toc_entry + offset;
-               toc_entry++;
+       for (toc_entry = toc_start; toc_entry != toc_end; toc_entry++) {
+               *toc_entry += offset;
        }
 }
 
 static void reloc_toc(void)
 {
        unsigned long offset = reloc_offset();
-       unsigned long nr_entries =
-               (__prom_init_toc_end - __prom_init_toc_start) / sizeof(long);
 
-       __reloc_toc(offset, nr_entries);
+       __reloc_toc(offset);
 
        mb();
 }
@@ -3248,12 +3247,10 @@ static void reloc_toc(void)
 static void unreloc_toc(void)
 {
        unsigned long offset = reloc_offset();
-       unsigned long nr_entries =
-               (__prom_init_toc_end - __prom_init_toc_start) / sizeof(long);
 
        mb();
 
-       __reloc_toc(-offset, nr_entries);
+       __reloc_toc(-offset);
 }
 #endif
 #endif
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 72fa3c00229a..343579080dd5 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -326,17 +326,16 @@ SECTIONS
                __end_opd = .;
        }
 
-       . = ALIGN(256);
-       .got : AT(ADDR(.got) - LOAD_OFFSET) {
-               __toc_start = .;
+       .got : AT(ADDR(.got) - LOAD_OFFSET) ALIGN(256) {
+               *(.got)
 #ifndef CONFIG_RELOCATABLE
                __prom_init_toc_start = .;
-               arch/powerpc/kernel/prom_init.o*(.toc .got)
+               arch/powerpc/kernel/prom_init.o*(.toc)
                __prom_init_toc_end = .;
 #endif
-               *(.got)
                *(.toc)
        }
+       __toc_ptr = DEFINED (.TOC.) ? .TOC. : ADDR (.got) + 0x8000;
 #endif
 
        /* The initial task and kernel stack */

-- 
Alan Modra
Australia Development Lab, IBM


Re: Error: invalid switch -me200

2020-11-16 Thread Alan Modra
On Fri, Nov 13, 2020 at 06:50:15PM -0600, Segher Boessenkool wrote:
> On Fri, Nov 13, 2020 at 04:37:38PM -0800, Fāng-ruì Sòng wrote:
> > On Fri, Nov 13, 2020 at 4:23 PM Segher Boessenkool
> >  wrote:
> > > On Fri, Nov 13, 2020 at 12:14:18PM -0800, Nick Desaulniers wrote:
> > > > > > > Error: invalid switch -me200
> > > > > > > Error: unrecognized option -me200
> > > > > >
> > > > > > 251 cpu-as-$(CONFIG_E200)   += -Wa,-me200
> > > > > >
> > > > > > Are those all broken configs, or is Kconfig messed up such that
> > > > > > randconfig can select these when it should not?
> > > > >
> > > > > > Hmmm, looks like this flag does not exist in mainline binutils? There is
> > > > > a thread in 2010 about this that Segher commented on:
> > > > >
> > > > > https://lore.kernel.org/linuxppc-dev/9859e645-954d-4d07-8003-ffcd2391a...@kernel.crashing.org/
> > > > >
> > > > > Guess this config should be eliminated?
> > >
> > > The help text for this config option says that e200 is used in 55xx,
> > > and there *is* an -me5500 GAS flag (which probably does this same
> > > thing, too).  But is any of this tested, or useful, or wanted?
> > >
> > > Maybe Christophe knows, cc:ed.
> > 
> > CC Alan Modra, a binutils global maintainer.
> > 
> > Alan, can the few -Wa,-m* options be deleted from arch/powerpc/Makefile?
> 
> All the others work fine (and are needed afaics), it is only -me200 that
> doesn't exist (in mainline binutils).

Right, and a quick check says it never existed.  There is e200z4,
added to binutils with dfdaec14b0d, 2016-08-01, but the kernel -me200
was added in 2005.  I suspect the toolchain support only existed
inside Freescale and pushing it upstream was too difficult.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RFC PATCH] powerpc/64/signal: balance return predictor stack in signal trampoline

2020-04-17 Thread Alan Modra
On Fri, Apr 17, 2020 at 07:17:47PM +1000, Nicholas Piggin wrote:
> I don't know much about dwarf, gdb still seems to recognize the signal
> frame and unwind properly if I break inside a signal handler.

Yes, the dwarf unwind info still looks good.  The commented out dwarf
near the end of sigtramp.S should probably go.  At least if you really
can't take an async signal in the trampoline (a kernel question, not
anything to do with gcc support of async signals as the comment
wrongly says).  If you *can* take an async signal at some point past
the trampoline addi, then delete the comment and uncomment the code.
Note that the advance_loc there bitrotted ever since the nop was added
before the trampoline, so you'd need to change that to an advance_loc
that moves from .Lsigrt_start to immediately after the addi, ie. 0x42.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc/boot: Delete unneeded .globl _zimage_start

2020-03-25 Thread Alan Modra
On Wed, Mar 25, 2020 at 05:22:31AM +, Joel Stanley wrote:
> On Wed, 25 Mar 2020 at 05:19, Fangrui Song  wrote:
> >
> > .globl sets the symbol binding to STB_GLOBAL while .weak sets the
> > binding to STB_WEAK. They should not be used together. It is accidental
> > rather than intentional that GNU as lets .weak override .globl while
> > clang integrated assembler lets the last win.

No, it isn't accidental.  gas deliberately lets .weak override .globl.
Since 1996-07-26, git commit 5ca547dc239

I'm fine with the patch so far as it is true that there is no need for
both .globl and .weak (and it looks silly to have both), but the
explanation isn't true.  The patch is needed because the clang
assembler is incompatible with gas in this detail.

> > Fixes: cd197ffcf10b "[POWERPC] zImage: Cleanup and improve zImage entry point"
> > Fixes: ee9d21b3b358 "powerpc/boot: Ensure _zimage_start is a weak symbol"
> > Link: https://github.com/ClangBuiltLinux/linux/issues/937
> > Signed-off-by: Fangrui Song 
> > Cc: Joel Stanley 
> > Cc: Michael Ellerman 
> > Cc: Nick Desaulniers 
> > Cc: clang-built-li...@googlegroups.com
> > ---
> >  arch/powerpc/boot/crt0.S | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S
> > index 92608f34d312..1d83966f5ef6 100644
> > --- a/arch/powerpc/boot/crt0.S
> > +++ b/arch/powerpc/boot/crt0.S
> > @@ -44,9 +44,6 @@ p_end:.long   _end
> >  p_pstack:  .long   _platform_stack_top
> >  #endif
> >
> > -   .globl  _zimage_start
> > -   /* Clang appears to require the .weak directive to be after the symbol
> > -    * is defined. See https://bugs.llvm.org/show_bug.cgi?id=38921  */
> > .weak   _zimage_start
> >  _zimage_start:
> 
> Your explanation makes sense to me. I've added Alan to cc for his review.
> 
> Reviewed-by: Joel Stanley 
> 
> Thanks for the patch.
> 
> Cheers,
> 
> Joel
> 
> > .globl  _zimage_start_lib
> > --
> > 2.25.1.696.g5e7596f4ac-goog

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/2] powerpc/vmlinux.lds: Discard .interp section

2020-02-26 Thread Alan Modra
On Thu, Feb 27, 2020 at 03:59:33PM +1100, Michael Ellerman wrote:
> The .interp section specifies which "interpreter", ie. dynamic loader,
> the kernel requests. But that doesn't make any sense, the kernel is
> not a regular binary that is run with an interpreter.
> 
> The content seems to be some default value, this file doesn't even
> exist on my system:
>     2f 75 73 72 2f 6c 69 62  2f 6c 64 2e 73 6f 2e 31  |/usr/lib/ld.so.1|
> 
> So the section serves no useful purpose and consumes a small amount of
> space.
> 
> Also Alan Modra says we "likely could discard" it, so do so.

Yes, but you ought to check with the minimum required binutils.  It is
quite possible that an older linker will blow up.

If the minimum required binutils is at least binutils-2.26 then
passing --no-dynamic-linker to ld is a more elegant solution.

> 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/kernel/vmlinux.lds.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
> index 31a0f201fb6f..619ffbaf72ad 100644
> --- a/arch/powerpc/kernel/vmlinux.lds.S
> +++ b/arch/powerpc/kernel/vmlinux.lds.S
> @@ -257,7 +257,6 @@ SECTIONS
>   }
>   .hash : AT(ADDR(.hash) - LOAD_OFFSET) { *(.hash) }
>   .gnu.hash : AT(ADDR(.gnu.hash) - LOAD_OFFSET) { *(.gnu.hash) }
> - .interp : AT(ADDR(.interp) - LOAD_OFFSET) { *(.interp) }
>   .rela.dyn : AT(ADDR(.rela.dyn) - LOAD_OFFSET)
>   {
>   __rela_dyn_start = .;
> @@ -370,5 +369,6 @@ SECTIONS
>   *(.gnu.version*)
>   *(.gnu.attributes)
>   *(.eh_frame)
> + *(.interp)
>   }
>  }
> -- 
> 2.21.1

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/2] powerpc/vmlinux.lds: Explicitly retain .gnu.hash

2020-02-26 Thread Alan Modra
On Thu, Feb 27, 2020 at 03:59:32PM +1100, Michael Ellerman wrote:
> Relocatable kernel builds produce a warning about .gnu.hash being an
> orphan section:
> 
>   ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed in section `.gnu.hash'
> 
> If we try to discard it the build fails:
> 
>   ld -EL -m elf64lppc -pie --orphan-handling=warn --build-id -o
> .tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive
> arch/powerpc/kernel/head_64.o arch/powerpc/kernel/entry_64.o
> ...
> sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive
> --start-group lib/lib.a --end-group
>   ld: could not find section .gnu.hash
> 
> So add an entry to explicitly retain it, as we do for .hash.

Looks fine to me.  You can also pass --hash-style=sysv to ld (since
binutils-2.18) to disable generation of .gnu.hash.

> 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/kernel/vmlinux.lds.S | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
> index b4c89a1acebb..31a0f201fb6f 100644
> --- a/arch/powerpc/kernel/vmlinux.lds.S
> +++ b/arch/powerpc/kernel/vmlinux.lds.S
> @@ -256,6 +256,7 @@ SECTIONS
>   *(.dynamic)
>   }
>   .hash : AT(ADDR(.hash) - LOAD_OFFSET) { *(.hash) }
> + .gnu.hash : AT(ADDR(.gnu.hash) - LOAD_OFFSET) { *(.gnu.hash) }
>   .interp : AT(ADDR(.interp) - LOAD_OFFSET) { *(.interp) }
>   .rela.dyn : AT(ADDR(.rela.dyn) - LOAD_OFFSET)
>   {
> -- 
> 2.21.1

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build warnings from Linus' tree

2018-11-18 Thread Alan Modra
On Wed, Nov 14, 2018 at 09:20:23PM +1100, Michael Ellerman wrote:
> Joel Stanley  writes:
> > Hello Alan,
> >
> > On Tue, 12 Jun 2018 at 07:44, Stephen Rothwell wrote:
> >
> >> Building Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
> >> produced these warnings:
> >>
> >> ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed in section `.gnu.hash'.
> >> ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed in section `.gnu.hash'.
> >> ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed in section `.gnu.hash'.
> >>
> >> This may just be because I have started building using the native Debian
> >> gcc for the powerpc builds ...
> >
> > Do you know why we started creating these?
> 
> It's controlled by the ld option --hash-style, which AFAICS still
> defaults to sysv (generating .hash).
> 
> But it seems gcc can be configured to have a different default, and at
> least my native ppc64le toolchains are passing gnu, eg:
> 
>  /usr/lib/gcc/powerpc64le-linux-gnu/6/collect2 -plugin
>  /usr/lib/gcc/powerpc64le-linux-gnu/6/liblto_plugin.so
>  -plugin-opt=/usr/lib/gcc/powerpc64le-linux-gnu/6/lto-wrapper
>  -plugin-opt=-fresolution=/tmp/ccw1U2fF.res
>  -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s
>  -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc
>  -plugin-opt=-pass-through=-lgcc_s --sysroot=/ --build-id --eh-frame-hdr
>  -V -shared -m elf64lppc
>  --hash-style=gnu
>  
> 
> So that's presumably why we're seeing it, some GCCs are configured to
> use it.
> 
> > If it's intentional, should we be including them in the same
> > way as .hash sections?
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/vmlinux.lds.S#n282
> >
> >   .hash : AT(ADDR(.hash) - LOAD_OFFSET) { *(.hash) }
> 
> That would presumably work.
> 
> My question though is do we even need it?
> 
> From what I can see for it to be useful you need the section as well as
> an entry in the dynamic section pointing at it, and we don't have a
> dynamic section at all:
> 
>   $ readelf -S vmlinux | grep gnu.hash
> [ 4] .gnu.hash GNU_HASH c0dbbdb0  00dcbdb0
>   $ readelf -d vmlinux
>   
>   There is no dynamic section in this file.
> 
> Compare to the vdso:
> 
> $ readelf -d arch/powerpc/kernel/vdso64/vdso64.so
> 
> Dynamic section at offset 0x868 contains 12 entries:
>   Tag                Type         Name/Value
>  0x000000000000000e (SONAME)     Library soname: [linux-vdso64.so.1]
>  0x0000000000000004 (HASH)       0x120
>  0x000000006ffffef5 (GNU_HASH)   0x170
>  0x0000000000000005 (STRTAB)     0x320
>  0x0000000000000006 (SYMTAB)     0x1d0
>  0x000000000000000a (STRSZ)      269 (bytes)
>  0x000000000000000b (SYMENT)     24 (bytes)
>  0x0000000070000003 (PPC64_OPT)  0x0
>  0x000000006ffffffc (VERDEF)     0x450
>  0x000000006ffffffd (VERDEFNUM)  2
>  0x000000006ffffff0 (VERSYM)     0x42e
>  0x0000000000000000 (NULL)       0x0
> 
> 
> So can't we just discard .gnu.hash? And in fact do we need .hash either?
> 
> Actually arm64 discards the latter, and parisc discards both.
> 
> Would still be good to hear from Alan or someone else who knows anything
> about toolchain stuff, ie. not me :)

.gnu.hash, like .hash, is used by glibc ld.so for dynamic symbol
lookup.  I imagine you don't need either section in a kernel, so
discarding both sounds reasonable.  Likely you could discard .interp
and .dynstr too, and .dynsym when !CONFIG_PPC32.
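
For anyone wondering what the two sections actually index on, these are
the two symbol-name hash functions as I understand them from the ELF
gABI (.hash) and the GNU hash section format (.gnu.hash).  The function
names are mine; this is a sketch for reference, not code lifted from
glibc.

```c
#include <stdint.h>

/* SysV ELF hash, used to index the .hash section. */
static uint32_t elf_sysv_hash(const char *name)
{
	uint32_t h = 0, g;

	while (*name) {
		h = (h << 4) + (unsigned char)*name++;
		g = h & 0xf0000000u;
		if (g)
			h ^= g >> 24;
		h &= ~g;	/* clear the top four bits */
	}
	return h;
}

/* GNU hash (h * 33 + c, seeded with 5381), used to index .gnu.hash. */
static uint32_t elf_gnu_hash(const char *name)
{
	uint32_t h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h;
}
```

Both only matter to a dynamic loader doing symbol lookup, which the
kernel image does not have.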

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc/32: Include .branch_lt in data section

2018-11-18 Thread Alan Modra
On Thu, Nov 15, 2018 at 11:47:52PM +1100, Michael Ellerman wrote:
> Alan Modra  writes:
> 
> > On Wed, Nov 14, 2018 at 01:32:18PM +1030, Joel Stanley wrote:
> >> I wasn't sure where this should go or if the ordering matters.
> >
> > The usual answer is: "Look at where the section goes in the standard
> > linker scripts."   But that doesn't apply here.  The section will be
> > empty for a kernel build so it doesn't matter where it goes.
> 
> If it's empty why don't we just discard it?

That can be a recipe for finding linker bugs.  Not that I'm against
you finding linker bugs.  ;-)

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc/32: Include .branch_lt in data section

2018-11-13 Thread Alan Modra
On Wed, Nov 14, 2018 at 01:32:18PM +1030, Joel Stanley wrote:
> When building a 32 bit powerpc kernel with Binutils 2.31.1 this warning
> is emitted:
> 
>  powerpc-linux-gnu-ld: warning: orphan section `.branch_lt' from
>  `arch/powerpc/kernel/head_44x.o' being placed in section `.branch_lt'
> 
> As of binutils commit 2d7ad24e8726 ("Support PLT16 relocs against local
> symbols")[1], 32 bit targets can produce .branch_lt sections in their
> output.
> 
> Include these symbols in the .data section as the ppc64 kernel does.
> 
> [1] 
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=2d7ad24e8726ba4c45c9e67be08223a146a837ce
> Signed-off-by: Joel Stanley 
Reviewed-by: Alan Modra 

Looks fine to me.

> ---
> I wasn't sure where this should go or if the ordering matters.

The usual answer is: "Look at where the section goes in the standard
linker scripts."   But that doesn't apply here.  The section will be
empty for a kernel build so it doesn't matter where it goes.

> ---
>  arch/powerpc/kernel/vmlinux.lds.S | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
> index 434581bcd5b4..6d5fd1b95311 100644
> --- a/arch/powerpc/kernel/vmlinux.lds.S
> +++ b/arch/powerpc/kernel/vmlinux.lds.S
> @@ -313,6 +313,7 @@ SECTIONS
>   *(.sdata2)
>   *(.got.plt) *(.got)
>   *(.plt)
> +     *(.branch_lt)
>   }
>  #else
>   .data : AT(ADDR(.data) - LOAD_OFFSET) {
> -- 
> 2.19.1

-- 
Alan Modra
Australia Development Lab, IBM


Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-11-01 Thread Alan Modra
On Thu, Nov 01, 2018 at 02:55:34PM +1100, Michael Ellerman wrote:
> Hi Florian,
> 
> Florian Weimer  writes:
> > We tried to use Go to build PIE binaries, and while the Go toolchain is
> > definitely not ready (it produces text relocations and problematic
> > relocations in general), it exposed what could be an accidental
> > userspace ABI change.
> >
> > With our 4.10-derived kernel, PIE binaries are mapped below 4 GiB, so
> > relocations like R_PPC64_ADDR16_HA work:
> >
> > 21f0-220d r-xp  fd:00 36593493   /root/extld
> > 220d-220e r--p 001c fd:00 36593493   /root/extld
> > 220e-2210 rw-p 001d fd:00 36593493   /root/extld
> ...
> >
> > With a 4.18-derived kernel (with the hashed mm), we get this instead:
> >
> > 120e6-12103 rw-p  fd:00 102447141    /root/extld
> > 12103-12106 rw-p 001c fd:00 102447141    /root/extld
> > 12106-12108 rw-p  00:00 0 
> 
> I assume that's caused by:
> 
>   47ebb09d5485 ("powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB")
> 
> Which did roughly:
> 
>   -#define ELF_ET_DYN_BASE    0x20000000
>   +#define ELF_ET_DYN_BASE    (is_32bit_task() ? 0x000400000UL : \
>   +                            0x100000000UL)
> 
> And went into 4.13.
> 
> > ...
> > I'm not entirely sure what to make of this, but I'm worried that this
> > could be a regression that matters to userspace.
> 
> It was a deliberate change, and it seemed to not break anything so we
> merged it. But obviously we didn't test widely enough.
> 
> So I guess it clearly can matter to userspace, and it used to work, so
> therefore it is a regression.
> 
> But at the same time we haven't had any other reports of breakage, so is
> this somehow specific to something Go is doing? Or did we just get lucky
> up until now? Or is no one actually testing on Power? ;)

Mapping PIEs above 4G should be fine.  It works for gcc C and C++
after all.  The problem is that ppc64le Go is generating code not
suitable for a PIE.  Dynamic text relocations are evidence of non-PIC
object files.

Quoting Lynn Boger :
"When building a pie binary with golang, they should be using
-buildmode=pie and not just pass -pie to the linker".
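
To make concrete why absolute relocations like R_PPC64_ADDR16_HA stop
working once the load address moves up, here is a hedged C sketch of
the arithmetic.  It assumes the usual #ha and #lo definitions from the
ELF ABI and a "lis; addi" style sequence; the function names and any
addresses you feed it are hypothetical.

```c
#include <stdint.h>

/* High-adjusted 16 bits, as computed for R_PPC64_ADDR16_HA. */
static uint16_t ppc_ha(uint64_t addr)
{
	return (uint16_t)((addr + 0x8000) >> 16);
}

/* Low 16 bits, as computed for R_PPC64_ADDR16_LO. */
static uint16_t ppc_lo(uint64_t addr)
{
	return (uint16_t)addr;
}

/* Value built by "lis rX,sym@ha; addi rX,rX,sym@l": both halves sign-extend. */
static uint64_t lis_addi(uint16_t ha, uint16_t lo)
{
	int64_t hi = (int64_t)(int16_t)ha * 65536;

	return (uint64_t)(hi + (int16_t)lo);
}
```

A low address round-trips through ppc_ha/ppc_lo/lis_addi unchanged,
while an address at or above 4G loses its upper bits.  Position
independent code avoids the problem by not using absolute sequences
in the first place.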

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] PowerPC/VDSO: Correct call frame information

2018-09-13 Thread Alan Modra
Call Frame Information is used by gdb for back-traces and inserting
breakpoints on function return for the "finish" command.  This failed
when inside __kernel_clock_gettime.  More concerning than difficulty
debugging is that CFI is also used by stack frame unwinding code to
implement exceptions.  If you have an app that needs to handle
asynchronous exceptions for some reason, and you are unlucky enough to
get one inside the VDSO time functions, your app will crash.

What's wrong:  There is control flow in __kernel_clock_gettime that
reaches label 99 without saving lr in r12.  CFI info however is
interpreted by the unwinder without reference to control flow: It's a
simple matter of "Execute all the CFI opcodes up to the current
address".  That means the unwinder thinks r12 contains the return
address at label 99.  Disabuse it of that notion by resetting CFI for
the return address at label 99.
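
A toy C model of that "execute everything up to the current address"
rule may help.  The types, names and code addresses below are invented
for illustration and only two CFI operations are modelled; real
unwinders interpret DWARF opcodes from .eh_frame.

```c
#include <stddef.h>

enum ra_loc { RA_IN_LR, RA_IN_R12 };
enum cfi_op { CFI_REGISTER_LR_R12, CFI_RESTORE_LR };

struct cfi_insn {
	unsigned long addr;	/* code address at which the opcode takes effect */
	enum cfi_op op;
};

/*
 * Apply every CFI opcode whose address is <= pc, in table order, with no
 * knowledge of the branches that actually led to pc.
 */
static enum ra_loc ra_location(const struct cfi_insn *tab, size_t n,
			       unsigned long pc)
{
	enum ra_loc loc = RA_IN_LR;	/* initial rule: return address in lr */
	size_t i;

	for (i = 0; i < n && tab[i].addr <= pc; i++)
		loc = tab[i].op == CFI_REGISTER_LR_R12 ? RA_IN_R12 : RA_IN_LR;
	return loc;
}
```

Without a CFI_RESTORE_LR entry at the address of label 99, a pc at or
past that label still reports RA_IN_R12, even on the path that never
copied lr to r12.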

Note that the ".cfi_restore lr" could have gone anywhere from the
"mtlr r12" a few instructions earlier to the instruction at label 99.
I put the CFI as late as possible, because in general that's best
practice (and if possible grouped with other CFI in order to reduce
the number of CFI opcodes executed when unwinding).  Using r12 as the
return address is perfectly fine after the "mtlr r12" since r12 on
that code path still contains the return address.

__get_datapage also has a CFI error.  That function temporarily saves
lr in r0, and reflects that fact with ".cfi_register lr,r0".  A later
use of r0 means the CFI at that point isn't correct, as r0 no longer
contains the return address.  Fix that too.

Signed-off-by: Alan Modra 
Tested-by: Reza Arbab 

diff --git a/arch/powerpc/kernel/vdso32/datapage.S b/arch/powerpc/kernel/vdso32/datapage.S
index 3745113fcc65..2a7eb5452aba 100644
--- a/arch/powerpc/kernel/vdso32/datapage.S
+++ b/arch/powerpc/kernel/vdso32/datapage.S
@@ -37,6 +37,7 @@ data_page_branch:
        mtlr    r0
        addi    r3, r3, __kernel_datapage_offset-data_page_branch
        lwz     r0,0(r3)
+  .cfi_restore lr
        add     r3,r0,r3
        blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 769c2624e0a6..1e0bc5955a40 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -139,6 +139,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 */
 99:
li  r0,__NR_clock_gettime
+  .cfi_restore lr
sc
blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso64/datapage.S 
b/arch/powerpc/kernel/vdso64/datapage.S
index abf17feffe40..bf9668691511 100644
--- a/arch/powerpc/kernel/vdso64/datapage.S
+++ b/arch/powerpc/kernel/vdso64/datapage.S
@@ -37,6 +37,7 @@ data_page_branch:
mtlrr0
addir3, r3, __kernel_datapage_offset-data_page_branch
lwz r0,0(r3)
+  .cfi_restore lr
add r3,r0,r3
blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S 
b/arch/powerpc/kernel/vdso64/gettimeofday.S
index c002adcc694c..a4ed9edfd5f0 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -169,6 +169,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 */
 99:
li  r0,__NR_clock_gettime
+  .cfi_restore lr
sc
blr
   .cfi_endproc

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] Correct PowerPC VDSO call frame info

2018-09-13 Thread Alan Modra
There is control flow in __kernel_clock_gettime that reaches label 99
without saving lr in r12.  CFI info however is interpreted by the
unwinder without reference to control flow: It's a simple matter of
"Execute all the CFI opcodes up to the current address".  That means
the unwinder thinks r12 contains the return address at label 99.
Disabuse it of that notion by resetting CFI for the return address at
label 99.

Note that the ".cfi_restore lr" could have gone anywhere from the
"mtlr r12" a few instructions earlier to the instruction at label 99.
I put the CFI as late as possible, because in general that's best
practice (and if possible grouped with other CFI in order to reduce
the number of CFI opcodes executed when unwinding).  Using r12 as the
return address is perfectly fine after the "mtlr r12" since r12 on
that code path still contains the return address.

__get_datapage also has a CFI error.  That function temporarily saves
lr in r0, and reflects that fact with ".cfi_register lr,r0".  A later
use of r0 means the CFI at that point isn't correct, as r0 no longer
contains the return address.  Fix that too.

Signed-off-by: Alan Modra 

diff --git a/arch/powerpc/kernel/vdso32/datapage.S b/arch/powerpc/kernel/vdso32/datapage.S
index 3745113fcc65..2a7eb5452aba 100644
--- a/arch/powerpc/kernel/vdso32/datapage.S
+++ b/arch/powerpc/kernel/vdso32/datapage.S
@@ -37,6 +37,7 @@ data_page_branch:
 	mtlr	r0
 	addi	r3, r3, __kernel_datapage_offset-data_page_branch
 	lwz	r0,0(r3)
+  .cfi_restore lr
 	add	r3,r0,r3
 	blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 769c2624e0a6..1e0bc5955a40 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -139,6 +139,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
  */
 99:
 	li	r0,__NR_clock_gettime
+  .cfi_restore lr
 	sc
 	blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso64/datapage.S b/arch/powerpc/kernel/vdso64/datapage.S
index abf17feffe40..bf9668691511 100644
--- a/arch/powerpc/kernel/vdso64/datapage.S
+++ b/arch/powerpc/kernel/vdso64/datapage.S
@@ -37,6 +37,7 @@ data_page_branch:
 	mtlr	r0
 	addi	r3, r3, __kernel_datapage_offset-data_page_branch
 	lwz	r0,0(r3)
+  .cfi_restore lr
 	add	r3,r0,r3
 	blr
   .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index c002adcc694c..a4ed9edfd5f0 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -169,6 +169,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
  */
 99:
 	li	r0,__NR_clock_gettime
+  .cfi_restore lr
 	sc
 	blr
   .cfi_endproc

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RFC PATCH] powerpc: Fix dubious r0 usage

2017-02-17 Thread Alan Modra
On Fri, Feb 17, 2017 at 11:08:53PM +1100, Michael Ellerman wrote:
> Bleeding edge binutils no longer accepts r0 in places where the CPU
> interprets the value as a literal 0.

Wow!  That was quite some cleanup.  I think I'd better turn the error
into a warning..

> --- a/arch/powerpc/purgatory/trampoline.S
> +++ b/arch/powerpc/purgatory/trampoline.S
> @@ -67,7 +67,7 @@ master:
>   mr  %r16,%r3/* save dt address in reg16 */
>   li  %r4,20
>   LWZX_BE %r6,%r3,%r4 /* fetch __be32 version number at byte 20 */
> - cmpwi   %r0,%r6,2   /* v2 or later? */
> + cmpwi   0,%r6,2 /* v2 or later? */
>   blt 1f
>   li  %r4,28
>   STWX_BE %r17,%r3,%r4/* Store my cpu as __be32 at byte 28 */

With this one, it would probably be better to omit the zero (BF
field).

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc/boot: request no dynamic linker for boot wrapper

2016-11-27 Thread Alan Modra
On Mon, Nov 28, 2016 at 12:42:26PM +1100, Nicholas Piggin wrote:
> The boot wrapper performs its own relocations and does not require
> PT_INTERP segment.

OK, so the kernel change is quite reasonable in isolation, but see
below.

> Without this option, binutils 2.28 and newer tries to create a program
> header segment due to PT_INTERP, and the link fails because there is no
> space for it.
> 
>A recent binutils commit:
>
> https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1a9ccd70f9a75dc6b48d340059f28ef3550c107b
>has broken kernel builds:

So this change added space for another header, it seems.  I suspect
that was accidental, particularly since there was no mention of
get_program_header_size in the ChangeLog entry.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 0/3] minor build fixes

2016-11-23 Thread Alan Modra
On Thu, Nov 24, 2016 at 12:02:06AM +1100, Nicholas Piggin wrote:
> I was building BookE and big endian with a little endian cross
> compiler and it stopped working. My BookS BE tests must have been
> building using the ELFv2 ABI. After this, the build sometimes still
> strangely fails with dot symbols in syscall table unable to be found,
> but that's looking like it may be a linker bug (Alan is going to take
> a look).

Yes it is a bug.  In compatibility code that was supposed to handle
mixing old object files that use dot-symbols on function entry with
newer object files that don't.  Here, "old" means mid 2004 or
earlier.

As you can imagine, I'm not hugely concerned about the ld bug..

Since every binutils back to at least 2.17 has the bug, what changed
in the kernel to expose it?  Are you building without -mcall-aixdesc?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination

2016-08-08 Thread Alan Modra
On Mon, Aug 08, 2016 at 05:14:27PM +0200, Arnd Bergmann wrote:
> I have reverted that patch now, so ARM uses ".fixup" again like every
> other architecture does, and now "*(.fixup) *(.text .text.*)" works
> correctly, while "*(.fixup) *(.text .fixup .text.*)" also fails
> the same way that I saw before:

That is really odd.  The linker isn't supposed to treat those script
snippets differently.  First match for .fixup wins.

$ cat > fixup1.s <<\EOF
 .global _start
 .text
_start:
 .dc.a .L2
.L1:
 .section ".fixup","ax",%progbits
.L2:
 .dc.a .L1
EOF
$ cat > fixup2.s <<\EOF
 .section ".text.xyz","ax",%progbits
 .dc.a .L2
.L1:

 .section ".fixup","ax",%progbits
.L2:
 .dc.a .L1
EOF
$ cat > fixup.lnk <<\EOF
SECTIONS {
  .text : { *(.fixup) *(.text .fixup .text.*) }
}
EOF
$ as -o fixup1.o fixup1.s 
$ as -o fixup2.o fixup2.s 
$ ld -o fixup -T fixup.lnk -Map fixup.map fixup1.o fixup2.o
$ cat fixup.map

Memory Configuration

Name             Origin             Length             Attributes
*default*        0x00000000         0xffffffff

Linker script and memory map


.text           0x00000000       0x10
 *(.fixup)
 .fixup         0x00000000        0x4 fixup1.o
 .fixup         0x00000004        0x4 fixup2.o
 *(.text .fixup .text.*)
 .text          0x00000008        0x4 fixup1.o
                0x00000008                _start
 .text          0x0000000c        0x0 fixup2.o
 .text.xyz      0x0000000c        0x4 fixup2.o
[snip]
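
The "first match wins" rule seen in the map above can be sketched in
C roughly as follows.  This is only an illustration built on POSIX
fnmatch(); real ld also matches on file names and handles KEEP,
sorting and so on, and the helper name first_match is made up.

```c
#include <fnmatch.h>
#include <stddef.h>

/*
 * Return the index of the first input-section statement that claims
 * "section", or -1 if no statement does.  Each statement is a
 * NULL-terminated list of glob patterns, like the names inside one
 * *( ... ) in a linker script.
 */
static int first_match(const char *const *const *stmts, size_t nstmts,
		       const char *section)
{
	size_t i;
	const char *const *p;

	for (i = 0; i < nstmts; i++)
		for (p = stmts[i]; *p != NULL; p++)
			if (fnmatch(*p, section, 0) == 0)
				return (int)i;
	return -1;
}
```

Fed the two statements from fixup.lnk, { ".fixup" } and { ".text",
".fixup", ".text.*" }, this returns 0 for ".fixup" and 1 for both
".text" and ".text.xyz".  The second ".fixup" pattern never gets a
chance to match, which is why the two script snippets ought to behave
the same.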

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination

2016-08-07 Thread Alan Modra
On Sun, Aug 07, 2016 at 10:26:19PM +0200, Arnd Bergmann wrote:
> On Sunday, August 7, 2016 7:27:39 PM CEST Alan Modra wrote:
> > 
> > If it can, then Nicholas' patch should be:
> > 
> > *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text 
> > .text.*)
> > 
> > If you can't put .text.fixup too far away then you may as well just use
> > 
> > *(.text .text.*)
> 
> I tried this version:
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index b1f8828e9eac..fc210dacac9a 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -438,7 +438,9 @@
>   * during second ld run in second ld pass when generating System.map */
>  #define TEXT_TEXT\
>   ALIGN_FUNCTION();   \
> - *(.text.hot .text .text.fixup .text.unlikely .text.*)   \
> + *(.text.hot .text.hot.*)\
> + *(.text.unlikely .text.fixup .text.unlikely.*)  \
> + *(.text .text.*)\
>   *(.ref.text)\
>   MEM_KEEP(init.text) \
>   MEM_KEEP(exit.text) \
> 
> but that failed to link an allyesconfig kernel because of references
> from .fixup to .text.*. Trying your version now:

Well then, that proves you can't put .text.fixup too far away from
the associated input section.

> *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*)

Which means this is guaranteed to fail when you test it properly using
gcc's profiling options, in order to generate .text.hot* and/or
.text.unlikely* sections.

It seems to me the right thing to do would be to change kernel asm to
generate .text.foo.fixup for any .text.foo section.  A gas feature
available with binutils-2.26 enabled by --sectname-subst might help
with implementing that.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination

2016-08-07 Thread Alan Modra
On Fri, Aug 05, 2016 at 10:12:00PM +1000, Nicholas Piggin wrote:
>  #define TEXT_TEXT\
>   ALIGN_FUNCTION();   \
> - *(.text.hot .text .text.fixup .text.unlikely)   \
> + *(.text.hot .text .text.fixup .text.unlikely .text.*)   \
>   *(.ref.text)\
>   MEM_KEEP(init.text) \
>   MEM_KEEP(exit.text) \

At the risk of being told you (kernel people) have already considered
this, I thought I should mention that the above isn't ideal.  (Nor is
gcc's choice of .text.hot for hot sections, which clashes with
--function-sections for a function called "hot" but that's another
story.)

You'd really like all the hot sections and cold sections to be
together, for better cache locality.  So the line ought to have been
*(.text.hot) *(.text) *(.text.fixup) *(.text.unlikely)

That would put all .text.hot sections together.  Similarly for
.text.unlikely.  The trap of course is that this only works if
.text.fixup from one object file can be placed relatively far away
from .text in the same object file.

If it can, then Nicholas' patch should be:

*(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text 
.text.*)

If you can't put .text.fixup too far away then you may as well just use

    *(.text .text.*)

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/5] kbuild: allow architectures to use thin archives instead of ld -r

2016-08-06 Thread Alan Modra
On Sun, Aug 07, 2016 at 11:49:46AM +1000, Stephen Rothwell wrote:
> Hi Sam,
> 
> On Sat, 6 Aug 2016 22:10:45 +0200 Sam Ravnborg <s...@ravnborg.org> wrote:
> >
> > Did you by any chance evaluate the use of INPUT in linker files?
> > Stephen back then (again based on proposal from Alan Modra),
> > also made an implementation using INPUT.
> 
> The problem with that idea was that (at least for some versions of
> binutils in use at the time) we hit a static limit to the number of
> object files and ld just stopped at that point. :-(
> 
> > See below for an updated simple patch on top of mainline.
> 
> So, I guess it was fixed, but do we know in what version?

I think the patch was
https://sourceware.org/ml/binutils/2012-06/msg00201.html

So, you need binutils 2.23 or later.

-- 
Alan Modra
Australia Development Lab, IBM


Re: ppc64 sbrk returns executable heap in 32-bit emulation mode

2016-05-16 Thread Alan Modra
On Thu, May 12, 2016 at 03:41:09PM +0200, Florian Weimer wrote:
> We noticed that on ppc64, the sbrk system call in the 32-bit subsystem 
> returns executable memory.  I assume it is related to this, in 
> arch/powerpc/include/asm/page.h:
> 
> /*
>   * Unfortunately the PLT is in the BSS in the PPC32 ELF ABI,
>   * and needs to be executable.  This means the whole heap ends
>   * up being executable.
>   */
> #define VM_DATA_DEFAULT_FLAGS32 (VM_READ | VM_WRITE | VM_EXEC | \
>   VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
> 
> 
> What is the rationale for this?  This comment must be *really* old, 

I think the comment is just plain wrong.  ppc32 needs an executable
stack because it builds trampolines on the stack to support calling
nested functions.  I presume that's why the heap is executable.  (If
I'm wrong about heap+stack needing the same protection then I can't
think of any reason to require an executable heap.)

> because ld.so in glibc should make sure that the PLT is executable.  And 
> for current binaries, .bss is *not* executable, contrary to what the 
> comment suggests.
> 
> Is this comment about pre-ELF binaries?  If yes, would it possible to 
> change the default for ELF binaries?
> 
> Thanks,
> Florian

-- 
Alan Modra
Australia Development Lab, IBM
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v6 1/9] ppc64 (le): prepare for -mprofile-kernel

2016-01-27 Thread Alan Modra
On Wed, Jan 27, 2016 at 09:19:27PM +1100, Michael Ellerman wrote:
> Hi Torsten,
> 
> On Mon, 2016-01-25 at 16:26 +0100, Torsten Duwe wrote:
> > The gcc switch -mprofile-kernel, available for ppc64 on gcc > 4.8.5,
> > allows to call _mcount very early in the function, which low-level
> > ASM code and code patching functions need to consider.
> > Especially the link register and the parameter registers are still
> > alive and not yet saved into a new stack frame.
> > 
> > Signed-off-by: Torsten Duwe <d...@suse.de>
> > ---
> >  arch/powerpc/kernel/entry_64.S  | 45 
> > +++--
> >  arch/powerpc/kernel/ftrace.c| 12 +--
> >  arch/powerpc/kernel/module_64.c | 14 +
> >  3 files changed, 67 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > index a94f155..e7cd043 100644
> > --- a/arch/powerpc/kernel/entry_64.S
> > +++ b/arch/powerpc/kernel/entry_64.S
> > @@ -1206,7 +1206,12 @@ _GLOBAL(enter_prom)
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> >  _GLOBAL(mcount)
> >  _GLOBAL(_mcount)
> > -   blr
> > +   std r0,LRSAVE(r1) /* gcc6 does this _after_ this call _only_ */
> > +   mflr    r0
> > +   mtctr   r0
> > +   ld  r0,LRSAVE(r1)
> > +   mtlr    r0
> > +   bctr
> 
> Can we use r11 instead? eg:
> 
> _GLOBAL(_mcount)
>   mflr    r11
>   mtctr   r11
>   mtlr    r0
>   bctr

Depends on what you need to support.  As Torsten says, the code to
call _mcount when -mprofile-kernel is emitted before the prologue of a
function (similar to -m32), but after the ELFv2 global entry point
code.  If you trash r11 here you're killing the static chain pointer,
used by C nested functions or other languages that use a static chain,
eg. Pascal.  r11 has *not* been saved for ELFv2.

r12 might be a better choice for a temp reg.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info

2015-01-07 Thread Alan Modra
On Wed, Jan 07, 2015 at 04:12:47PM +1100, Anton Blanchard wrote:
> Hi Alan,
> 
> > Right.  This is really an rs6000 backend bug.  We describe one of the
> > indirect calls that go wrong here as
> > 
> > (call_insn 108 107 109 13 (parallel [
> > (set (reg:DI 3 3)
> > (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
> > (const_int 64 [0x40])))
> > (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
> > (const_int 8 [0x8])) [0  S8 A8]))
> > (set (reg:DI 2 2)
> > (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
> > (const_int 40 [0x28])) [0  S8 A8]))
> > (clobber (reg:DI 65 lr))
> > ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
> > notes and arg uses omitted for clarity
> > )
> > 
> > Notice that the RTL contains a parallel.  As you might guess, gcc
> > treats the vector of expressions inside the square brackets of the
> > parallel as happening in parallel.  Meaning that as far as gcc is
> > concerned the toc restore part (third element) happens at the same
> > time as the call (first element).  So if gcc replaces (reg:DI 1) in
> > the toc restore with some other register known to have the same value
> > *before* the call, gcc's RTL analysis will conclude that such a
> > replacement is valid.
> 
> Thanks for looking into this. Does that mean we were just getting lucky
> with the previous version:
> 
> static inline struct thread_info *current_thread_info(void)
> {
> 	register unsigned long sp asm("r1");
> 
> 	return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
> }

With both versions, the original rtl for current_thread_info consists
of two instructions, copy r1 to a pseudo reg, then the and.  With
the above version, fwprop1 manages to combine this to a single and
insn.  When using a global reg var, fwprop1 leaves the two
instructions, the copy causing the trouble in the following cprop1
pass.  So it's not that we are getting lucky in cprop1, but that
fwprop1 behaves differently with global vs. local registers.
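
For reference, the "and" in question is just rounding the stack
pointer down to a THREAD_SIZE boundary.  A stand-alone sketch, where
the THREAD_SIZE value and the name stack_base are assumptions made
for illustration rather than the kernel's definitions:

```c
#include <stdint.h>

#define THREAD_SIZE 16384UL	/* assumed value; must be a power of two */

/* Round a stack pointer down to the base of its THREAD_SIZE-aligned stack. */
static inline uintptr_t stack_base(uintptr_t sp)
{
	return sp & ~(THREAD_SIZE - 1);
}
```

The mask only works because THREAD_SIZE is a power of two and the
stack is aligned to its own size, so clearing the low bits of any
address within the stack gives the address where thread_info lives.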

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info

2014-12-31 Thread Alan Modra
Right.  This is really an rs6000 backend bug.  We describe one of the
indirect calls that go wrong here as

(call_insn 108 107 109 13 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
(const_int 64 [0x40])))
(use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
(const_int 8 [0x8])) [0  S8 A8]))
(set (reg:DI 2 2)
(mem/v/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 40 [0x28])) [0  S8 A8]))
(clobber (reg:DI 65 lr))
]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
notes and arg uses omitted for clarity
)

Notice that the RTL contains a parallel.  As you might guess, gcc
treats the vector of expressions inside the square brackets of the
parallel as happening in parallel.  Meaning that as far as gcc is
concerned the toc restore part (third element) happens at the same
time as the call (first element).  So if gcc replaces (reg:DI 1) in
the toc restore with some other register known to have the same value
*before* the call, gcc's RTL analysis will conclude that such a
replacement is valid.

That's what happens in the cprop1 pass.  The rtl dump shows
LOCAL COPY-PROP: Replacing reg 1 in insn 108 with reg 203
and then it's a matter of luck just what hard register is allocated to
pseudo-reg 203.

Of course, replacing r1 with some other register is a completely
useless thing to do, but trying to tell gcc that in our particular
case we want this generic optimisation disabled isn't so easy.  (Well,
it's dead easy if you want to hack cprop.c:do_local_cprop, just rip
out
  || (GET_CODE (PATTERN (insn)) != USE
      && asm_noperands (PATTERN (insn)) < 0)))
but maybe not so easy to get such patches committed..)  Instead, the
way I'd go about fixing this is removing the r1 reference in our toc
save/restore RTL, ie. don't use a mem, use an unspec.

-- 
Alan Modra
Australia Development Lab, IBM

Re: powerpc/ppc64: Allow allmodconfig to build (finally !)

2014-05-15 Thread Alan Modra
On Wed, May 14, 2014 at 08:34:30AM -0700, Guenter Roeck wrote:
> Bummer. Confirmed, if I replace @h with @high in just one place,
> the builds pass with binutils 2.24. Unfortunately the same builds then
> fails with binutils 2.23.
> 
> Any idea how to get it to compile with both old and new versions ?

The standard way with GNU software would be to write a configure test,
that checks for @high support in the assembler, and defines a macro
if the assembler passes the check.  I'm not that familiar with the
linux kernel these days, but a little grepping around says that
something like

ashigh := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
KBUILD_AFLAGS += $(ashigh)

might work.

> Is there some predefined constant which I could possibly use for
> something like
> 
> .if as_version_below_2.24_
>   oris	reg,reg,(expr)@h;
> .else
>   oris	reg,reg,(expr)@high;
> .endif
> 
> Thanks,
> Guenter

-- 
Alan Modra
Australia Development Lab, IBM

Re: powerpc/ppc64: Allow allmodconfig to build (finally !)

2014-05-13 Thread Alan Modra
On Wed, May 14, 2014 at 01:34:34PM +1000, Stephen Rothwell wrote:
> OK, this appears to be an assembler bug.

Agreed.  Upgrade binutils!

> $ cat test.s
> 	.text
> x:
> 	.pushsection b, "a"
> 	beq y
> 	.popsection
> 	.=0x8
> y:
> $ /opt/cross/gcc-4.6.3-nolibc/powerpc64-linux/bin/powerpc64-linux-as --version
> GNU assembler (GNU Binutils) 2.22
> This assembler was configured for a target of `powerpc64-linux'.
> $ /opt/cross/gcc-4.6.3-nolibc/powerpc64-linux/bin/powerpc64-linux-as -o test.o test.s
> test.s: Assembler messages:
> test.s:4: Error: operand out of range (0x0008 is not between 0x8000 and 0x7ffc)
> $ /opt/cross/gcc-4.8.1-nolibc/powerpc64-linux/bin/powerpc64-linux-as --version
> GNU assembler (GNU Binutils) 2.23.52.20130512
> This assembler was configured for a target of `powerpc64-linux'.
> $ /opt/cross/gcc-4.8.1-nolibc/powerpc64-linux/bin/powerpc64-linux-as -o test.o test.s
> (no error)
> 
> Alan, can you shed light on when it was fixed?

2012-11-05
https://sourceware.org/ml/binutils/2012-11/msg00043.html
git show 3b8b57a9495016b2b02fbc2612dd1607d4b6f9ba

The part that actually fixes this problem is "Leave insn field zero".

-- 
Alan Modra
Australia Development Lab, IBM

Re: powerpc/ppc64: Allow allmodconfig to build (finally !)

2014-05-13 Thread Alan Modra
On Tue, May 13, 2014 at 10:16:51PM -0700, Guenter Roeck wrote:
> any idea what might cause this one, by any chance ?
> 
> arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
> (.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI against 
> symbol `interrupt_base_book3e' defined in .text section in 
> arch/powerpc/kernel/built-in.o
> arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
> (.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI against 
> symbol `interrupt_end_book3e' defined in .text section in 
> arch/powerpc/kernel/built-in.o
> arch/powerpc/kernel/built-in.o: In function `exc_debug_debug_book3e':
> 
> I see this if I try to build powerpc:ppc64e_defconfig or 
> powerpc:chroma_defconfig
> with gcc 4.8.2 and binutils 2.24.

Blame me.  I changed the ABI, something that had to be done but
unfortunately happens to break the booke kernel code.  When building
up a 64-bit value with lis, ori, shl, oris, ori or similar sequences,
you now should use @high and @higha in place of @h and @ha.  @h and
@ha (and their associated relocs R_PPC64_ADDR16_HI and
R_PPC64_ADDR16_HA) now report overflow if the value is out of 32-bit
signed range.  ie. @h and @ha assume you're building a 32-bit value.
This is needed to report out-of-range -mcmodel=medium toc pointer
offsets in @toc@h and @toc@ha expressions, and for consistency I did
the same for all other @h and @ha relocs.
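
In C terms, the fields these operators select, and the new range
check, look something like the sketch below.  The function names are
made up for illustration, and the check is a simplification of what
the linker does (for @ha the adjustment by 0x8000 is applied before
the value is range checked).

```c
#include <stdbool.h>
#include <stdint.h>

/* 16-bit fields picked out of a 64-bit value. */
static uint16_t part_l(uint64_t v)     { return (uint16_t)v; }			/* @l */
static uint16_t part_high(uint64_t v)  { return (uint16_t)(v >> 16); }		/* @high */
static uint16_t part_higha(uint64_t v) { return (uint16_t)((v + 0x8000) >> 16); }	/* @higha */

/*
 * True if v, read as a signed 64-bit number, fits in 32 signed bits.
 * Outside this range @h and @ha now report overflow, while @high and
 * @higha silently select the same bits as before.
 */
static bool fits_signed32(uint64_t v)
{
	return v <= UINT64_C(0x7fffffff) || v >= UINT64_C(0xffffffff80000000);
}
```

A kernel-style address such as 0xc000000000012345 fails
fits_signed32(), so "lis r,sym@h" on it is now an error, whereas
part_high() still yields 0x0001 for use in a 64-bit building
sequence.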

-- 
Alan Modra
Australia Development Lab, IBM

Re: [git pull] Please pull abiv2 branch

2014-04-28 Thread Alan Modra
On Mon, Apr 28, 2014 at 04:39:00PM +0200, Philippe Bergheaud wrote:
> Kernel will not load modules because TOC. has no CRC.
> Is this expected ? Shouldn't TOC. have a CRC ?

TOC. is really .TOC.  (The kernel build process strips off a leading
dot from symbol names.)  .TOC. is a special symbol giving the toc
base, so no, it shouldn't have a CRC.

-- 
Alan Modra
Australia Development Lab, IBM

Re: Change MINSIGSTKSZ and SIGSTKSIZE

2014-04-14 Thread Alan Modra
On Sat, Apr 12, 2014 at 02:59:55AM -0700, Haren Myneni wrote:
> Alan,
>   LTP test (signalstack02) is failing.

Ignore the stupid test.  Do *NOT* change the kernel value to match
glibc.  The two sets of constants are independent.

The glibc constants are for user code, to allocate correctly sized
signal stacks for all known kernels.  These constants are therefore
the maximum values needed for known kernels.

The kernel constants are to check that user code is setting up a large
enough signal stack *for that specific kernel, and user binary*.  If
you make the kernel constants match current glibc, then old binaries
that don't use htm or vsx will fail when run on a newer kernel.

Ideally the kernel would detect whether a binary was going to use htm
or vsx and adjust the minimum sizes, but it's a wee bit difficult for
the kernel to know that ahead of time.
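
From the user side, the intended use of the libc constants is along
the lines of this sketch.  It assumes a POSIX system; the function
name install_sigstack is made up for illustration.

```c
#define _XOPEN_SOURCE 700	/* for sigaltstack() under strict -std= modes */
#include <signal.h>
#include <stdlib.h>

/*
 * Install an alternate signal stack of the size libc recommends.
 * Returns 0 on success, -1 on failure.
 */
static int install_sigstack(void)
{
	stack_t ss;

	ss.ss_size = SIGSTKSZ;		/* libc's value, not the kernel's */
	ss.ss_sp = malloc(ss.ss_size);
	ss.ss_flags = 0;
	if (ss.ss_sp == NULL)
		return -1;
	if (sigaltstack(&ss, NULL) != 0) {
		free(ss.ss_sp);
		return -1;
	}
	return 0;
}
```

Code written this way picks up a larger stack simply by being rebuilt
against a newer libc, while an old binary built with the smaller
constants keeps passing the kernel's minimum size check.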

> This test expects -ENOMEM from
> kernel when passing less than stack size (passing 4095). MINSIGSTKSZ in
> signal.h (glibc) is changed to 4096 to support VSX changes
> (https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=f7c399cff5bd04ee9dc117fb6b0f39597dc047c6)
>   We should also change these values in kernel to match with glibc. Any
> issues?
> 
> diff --git a/arch/powerpc/include/uapi/asm/signal.h
> b/arch/powerpc/include/uapi/
> index 6c69ee9..18f498e 100644
> --- a/arch/powerpc/include/uapi/asm/signal.h
> +++ b/arch/powerpc/include/uapi/asm/signal.h
> @@ -85,8 +85,8 @@ typedef struct {
>  
>  #define SA_RESTORER	0x04000000U
>  
> -#define MINSIGSTKSZ	2048
> -#define SIGSTKSZ	8192
> +#define MINSIGSTKSZ	4096
> +#define SIGSTKSZ	16384
> 
> Thanks
> Haren

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 10/33] powerpc: Ignore TOC relocations

2014-03-26 Thread Alan Modra
On Tue, Mar 25, 2014 at 10:44:16PM +1100, Anton Blanchard wrote:
> The linker fixes up TOC. relocations, so prom_init_check.sh should
> ignore them.

Err, .TOC. you mean.  Presumably something strips off the leading dot
somewhere?

> -btext_setup_display
> +btext_setup_display TOC.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 15/33] powerpc: Fix ABIv2 issues with stack offsets in assembly code

2014-03-26 Thread Alan Modra
On Tue, Mar 25, 2014 at 10:44:21PM +1100, Anton Blanchard wrote:
> Fix STK_PARAM and use it instead of hardcoding ABIv1 offsets.

>  _GLOBAL(memcpy)
>  BEGIN_FTR_SECTION
> -	std	r3,48(r1)	/* save destination pointer for return value */
> +	std	r3,STK_PARAM(R3)(r1)	/* save destination pointer for return value */

Here and elsewhere you're assuming you have a parameter save area.
That won't be true with ELFv2 for calls to functions like memcpy.

typedef __SIZE_TYPE__ size_t;
extern void *memcpy (void *dest, const void *src, size_t n);

void foo (void *dest, const void *src, size_t n)
{
  memcpy (dest, src, n);
}

foo:
0:  addis 2,12,.TOC.-0b@ha
addi 2,2,.TOC.-0b@l
.localentry foo,.-foo
mflr 0
std 0,16(1)
stdu 1,-32(1)   # 
bl memcpy
nop
addi 1,1,32
ld 0,16(1)
mtlr 0
blr

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 15/33] powerpc: Fix ABIv2 issues with stack offsets in assembly code

2014-03-26 Thread Alan Modra
On Wed, Mar 26, 2014 at 08:34:49PM +1030, Alan Modra wrote:
> On Tue, Mar 25, 2014 at 10:44:21PM +1100, Anton Blanchard wrote:
> > Fix STK_PARAM and use it instead of hardcoding ABIv1 offsets.
> 
> >  _GLOBAL(memcpy)
> >  BEGIN_FTR_SECTION
> > -	std	r3,48(r1)	/* save destination pointer for return value */
> > +	std	r3,STK_PARAM(R3)(r1)	/* save destination pointer for return value */
> 
> Here and elsewhere you're assuming you have a parameter save area.
> That won't be true with ELFv2 for calls to functions like memcpy.

Nevermind, I see you fixed that with the next patch..

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH 27/33] powerpc: Handle new ELFv2 module relocations

2014-03-26 Thread Alan Modra
On Tue, Mar 25, 2014 at 10:44:33PM +1100, Anton Blanchard wrote:
> From: Rusty Russell <ru...@rustcorp.com.au>
> +	case R_PPC64_REL16_HA:
> +		/* Subtract location pointer */
> +		value -= (unsigned long)location;
> +		value = ((value + 0x8000) >> 16);
> +		*((uint16_t *) location)
> +			= (*((uint16_t *) location) & ~0xffff)
> +			| (value & 0xffff);

There's not much point reading the uint16_t.

*(uint16_t *) location = value;

> +		break;
> +
> +	case R_PPC64_REL16_LO:
> +		/* Subtract location pointer */
> +		value -= (unsigned long)location;
> +		*((uint16_t *) location)
> +			= (*((uint16_t *) location) & ~0xffff)
> +			| (value & 0xffff);

and again.

> +		break;
> +
>  	default:
>  		printk("%s: Unknown ADD relocation: %lu\n",
>  		       me->name,
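
To spell out why the read is redundant: masking a 16-bit quantity
with ~0xffff always gives zero, so the read-modify-write form stores
exactly what a plain store does.  A small sketch, with function names
made up for illustration:

```c
#include <stdint.h>

/* Store in the style of the patch: read, mask, then or in the new bits. */
static void store_rmw(uint16_t *location, uint64_t value)
{
	*location = (uint16_t)((*location & ~0xffff) | (value & 0xffff));
}

/* Plain store: the conversion to uint16_t keeps only the low 16 bits. */
static void store_plain(uint16_t *location, uint64_t value)
{
	*location = (uint16_t)value;
}
```

Whatever *location held beforehand, both functions leave the low 16
bits of value in it.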

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH] powerpc: Work around gcc miscompilation of __pa() on 64-bit

2013-09-01 Thread Alan Modra
On Mon, Sep 02, 2013 at 09:59:12AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-08-27 at 16:42 +0930, Alan Modra wrote:
> > The proper fix is to define a whole slew of new relocations and reloc
> > specifiers, and modify everything to use them, but that seems like too
> > much bother.  I had ideas once upon a time to implement gas and ld
> > options that makes @ha and _HA report overflows, but haven't found one
> > of those round tuits.
> 
> No, if you don't have a reloc that can represent this, then the proper
> fix is to use the existing relocs to load the original symbol address
> into a register, then *generate* the appropriate 64-bit addition on top
> of it.

I already have a gcc fix to do exactly that.  My "proper fix" comment
was more to do with the general case.  For example, when linking a
huge object that overflows _HA relocs right now we silently generate
bad code.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc: Work around gcc miscompilation of __pa() on 64-bit

2013-08-27 Thread Alan Modra
On Tue, Aug 27, 2013 at 04:07:49PM +1000, Paul Mackerras wrote:
> On 64-bit, __pa(static_var) gets miscompiled by recent versions of
> gcc as something like:
> 
> 	addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
> 	addi 3,3,.LANCHOR1+4611686018427387904@toc@l

I might argue that this isn't a miscompilation, since -mcmodel=medium
assumes everything can be accessed within +/-2G of the toc pointer,
but it's definitely a problem since gas and/or ld don't give an
overflow error.  They would except for the fact that our ABI has a
hole in it.

We have relocs that error on 16-bit overflow, eg.
  addi 3,2,x@toc
will give an error if x is more than +/-32k from the toc pointer, but
@ha and _HA/_HI relocs don't error on 32-bit overflow.  (They can't,
because they were really designed to be used in HIGHESTA, HIGHERA, HA,
LO sequences to build up 64-bit values.)

The proper fix is to define a whole slew of new relocations and reloc
specifiers, and modify everything to use them, but that seems like too
much bother.  I had ideas once upon a time to implement gas and ld
options that makes @ha and _HA report overflows, but haven't found one
of those round tuits.

-- 
Alan Modra
Australia Development Lab, IBM


Re: BUG_ON and gcc don't mix

2013-08-19 Thread Alan Modra
On Tue, Aug 20, 2013 at 12:37:50PM +1000, Anton Blanchard wrote:
> address of the trap instruction for our bug exception table. Maybe
> we need a gcc builtin in which we can get a label on the trap
> instruction. Would that be possible?

Not your actual _EMIT_BUG_ENTRY, but something like this ought to work.
The only trick here is not putting anything after __builtin_trap()..

#define BUG_ON(x) do {                                          \
        if (x) {                                                \
                __asm__ __volatile__ ("\n1:"                    \
                        "\t.section __bug_table,\"a\""          \
                        "\n\t.long 1b"                          \
                        "\n\t.previous");                       \
                __builtin_trap();                               \
        }                                                       \
} while (0)

int foo(unsigned int *bar)
{
        unsigned int holder_cpu;

        holder_cpu = *bar & 0x;
        BUG_ON(holder_cpu >= 32);

        return 1;
}

-- 
Alan Modra
Australia Development Lab, IBM





Re: BUG_ON and gcc don't mix

2013-08-19 Thread Alan Modra
On Tue, Aug 20, 2013 at 02:02:11PM +0930, Alan Modra wrote:
> On Tue, Aug 20, 2013 at 12:37:50PM +1000, Anton Blanchard wrote:
> > address of the trap instruction for our bug exception table. Maybe
> > we need a gcc builtin in which we can get a label on the trap
> > instruction. Would that be possible?
>
> Not your actual _EMIT_BUG_ENTRY, but something like this ought to work.
> The only trick here is not putting anything after __builtin_trap()..

Doh!  I guess the whole point was to get the condition folded into the
trap, which is foiled by emitting an asm between the condition and
__builtin_trap().
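
For contrast, a sketch of the shape that leaves gcc free to fold the
test into a conditional trap: nothing at all sits between the condition
and __builtin_trap().  The macro and function names are made up for the
example, and it deliberately emits no bug table entry, which is the
part that is still unsolved.

```c
#include <assert.h>

/* Hypothetical macro: condition feeds __builtin_trap() directly, with
   no asm in between.  No __bug_table entry is recorded. */
#define BUG_ON_FOLDABLE(x) do { if (x) __builtin_trap(); } while (0)

unsigned int check_cpu(unsigned int holder_cpu)
{
    BUG_ON_FOLDABLE(holder_cpu >= 32);
    return holder_cpu;
}

void check_cpu_self_check(void)
{
    /* In-range values pass straight through without trapping. */
    assert(check_cpu(0) == 0);
    assert(check_cpu(31) == 31);
}
```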

-- 
Alan Modra
Australia Development Lab, IBM


Re: SIGSTKSZ/MINSIGSTKSZ too small on 64bit

2013-07-26 Thread Alan Modra
On Fri, Jul 26, 2013 at 04:31:34PM -0500, Ryan Arnold wrote:
> Adhemerval and I were just looking at the signal stack frames and I'd
> noticed the increase in size due to the addition of the HTM bits so this is
> great timing.
>
> I tried a sigstack.h patch that increased the values as you indicated and
> it cleaned up the failing tst-cancel21* testcases on POWER8.  I didn't try
> it on POWER7 yet.

I've tested on power7 using a copy of
sysdeps/unix/sysv/linux/sparc/bits/sigstack.h
as
sysdeps/unix/sysv/linux/powerpc/bits/sigstack.h

-- 
Alan Modra
Australia Development Lab, IBM


Re: SIGSTKSZ/MINSIGSTKSZ too small on 64bit

2013-07-25 Thread Alan Modra
On Fri, Jul 26, 2013 at 12:23:25PM +1000, Anton Blanchard wrote:
 
> Hi,
>
> Alan has been looking at a glibc test fail. His analysis shows SEGVs
> in signal handlers using sigaltstack, and that MINSIGSTKSZ and SIGSTKSZ
> are too small.
>
> We increased the size of rt_sigframe in commit 2b0a576d15e0
> (powerpc: Add new transactional memory state to the signal context) but
> didn't bump either SIGSTKSZ and MINSIGSTKSZ. We need to do that in both
> the kernel and glibc, but I'm a bit worried we could have broken
> existing applications that use sigaltstack.

Before VSX changes, struct rt_sigframe size was 1920 plus 128 for
__SIGNAL_FRAMESIZE giving ppc64 exactly the default MINSIGSTKSZ of
2048.

After VSX, ucontext increased by 256 bytes.  Oops, we're over
MINSIGSTKSZ.  Add another ucontext for TM and rt_sigframe is now at
3872, giving actual MINSIGSTKSZ of 4000.

The glibc testcase that I was looking at was tst-cancel21, which
allocates 2*SIGSTKSZ (not because the test is trying to be
conservative, but because the test actually has nested signal stack
frames).  We blew the allocation by 48 bytes when using current
mainline gcc to compile glibc (le ppc64).

The required stack depth in _dl_lookup_symbol_x from the top of the
next signal frame was 10944 bytes.  I guess you'd want to add 288 to
that, implying an actual SIGSTKSZ of 11232.

I think we want
#define MINSIGSTKSZ     4096
#define SIGSTKSZ        16384
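
The arithmetic above, written out as a small C sketch.  The numbers are
the ones quoted in this mail; the enum and function names are invented
here, and rounding up to a power of two is only my guess at a sensible
way to arrive at the proposed values.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SIGNAL_FRAMESIZE = 128,   /* __SIGNAL_FRAMESIZE, as quoted above */
    RT_SIGFRAME_OLD  = 1920,  /* struct rt_sigframe before VSX */
    RT_SIGFRAME_TM   = 3872,  /* struct rt_sigframe after VSX and TM */
    MEASURED_DEPTH   = 10944, /* depth seen in _dl_lookup_symbol_x */
    EXTRA_BYTES      = 288    /* the 288 added on top of that */
};

/* Minimum stack a signal delivery needs for a given rt_sigframe size. */
static size_t min_sigstack_needed(size_t rt_sigframe_size)
{
    return rt_sigframe_size + SIGNAL_FRAMESIZE;
}

/* Round n up to the next power of two (assumption: n > 0). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

void sigstack_self_check(void)
{
    assert(min_sigstack_needed(RT_SIGFRAME_OLD) == 2048);
    assert(min_sigstack_needed(RT_SIGFRAME_TM) == 4000);
    assert(round_up_pow2(MEASURED_DEPTH + EXTRA_BYTES) == 16384);
}
```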

                                               frame size  r1
#0  0x295cdaec in _dl_lookup_symbol_x(memset)         190
#1  0x295d3c4c in _dl_fixup()                          b0  10003310160
#2  0x295dc818 in _dl_runtime_resolve()                b0  10003310210
#3  0x1f59ea8c in uw_init_context_1()                 a30  100033102c0
#4  0x1f59f560 in libc:_Unwind_ForcedUnwind()         c90  10003310cf0
#5  0x1ffb9538 in pt:_Unwind_ForcedUnwind()            90  10003311980
#6  0x1ffb6418 in __pthread_unwind()                   70  10003311a10
#7  0x1ffaaeb0 in sigcancel_handler()                  70  10003311a80
#8  signal handler called 1ffe0448 tramp              fa0  10003311af0
                                                           10003311b70 rt_sigframe
                                                             10003311c58 sigcontext.gp_regs
                                                             10003311dd8 sigcontext.fp_regs
                                                             10003311ee0 sigcontext.v_regs
                                                             10003311ef0 sigcontext.vmx
                                                           100033128d8 rt_sigframe.pinfo  offset d68
                                                           10003312968 rt_sigframe.abigap
                                                           10003312a88 end + 8 alignment
#9  0x1ffb6f9c in                                      80  10003312a90
#10 0x1ffb6f84 in                                          10003312b10
#11 0x100020f4 in delete_temp_files()                  80  10003312dc0
#12 0x10002198 in                                          10003313070
#13 signal handler called
#14 0x1ffb6f9c in ?? ()
#15 0x1ffb6f84 in ?? ()
#16 0x10002274 in ?? ()
#17 0x10002430 in ?? ()
#18 0x10002644 in ?? ()
#19 0x10001a1c in ?? ()
#20 0x1fe17f0c in ?? ()
#21 0x1fe18134 in ?? ()
#22 0x in ?? ()


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc: provide __bswapdi2

2013-05-13 Thread Alan Modra
On Mon, May 13, 2013 at 04:48:19PM +1000, Anton Blanchard wrote:
> On Fri, 10 May 2013 22:18:27 +0100
> David Woodhouse <dw...@infradead.org> wrote:
>
> > From: David Woodhouse <david.woodho...@intel.com>
> >
> > Some versions of GCC apparently expect this to be provided by libgcc.
>
> Thanks Dave. We were discussing this with Alan Modra and he doesn't
> think the 64bit target should ever emit a call to __bswapdi2. Did you
> only see it on 32bit, or 64bit as well?
>
> Alan: I notice Dave is adding calls to __builtin_bswap, perhaps some
> versions of the 64bit compiler did emit __bswapdi2 calls for that.

I did a little digging, and it looks like gcc-4.4 will emit __bswapdi2
calls.  Support in rs6000.md appeared 2009-06-25.
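
For reference, a helper of this kind only has to reverse the bytes of a
64-bit value.  The following is a generic sketch with a made-up name,
not the implementation from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __bswapdi2: reverse the eight bytes of x
   by swapping 32-bit halves, then 16-bit pairs, then single bytes. */
uint64_t bswapdi2_sketch(uint64_t x)
{
    x = ((x & 0x00000000ffffffffULL) << 32) | (x >> 32);
    x = ((x & 0x0000ffff0000ffffULL) << 16) |
        ((x >> 16) & 0x0000ffff0000ffffULL);
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
        ((x >> 8) & 0x00ff00ff00ff00ffULL);
    return x;
}

void bswapdi2_self_check(void)
{
    assert(bswapdi2_sketch(0x0102030405060708ULL) == 0x0807060504030201ULL);
}
```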

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH net-next] af_unix: fix a fatal race with bit fields

2013-05-02 Thread Alan Modra
On Tue, Apr 30, 2013 at 10:04:32PM -0700, Eric Dumazet wrote:
> These kind of errors are pretty hard to find, its a pity to spend time
> on them.

Well, yes.  From the first comment in gcc PR52080.  "For the following
testcase we generate a 8 byte RMW cycle on IA64 which causes locking
problems in the linux kernel btrfs filesystem."

Did someone fix btrfs, but not check other kernel locks?  Having now
hit the same problem again, have you checked that other kernel locks
don't have adjacent bit fields in the same 64-bit word?  And comment
the struct to ensure someone doesn't optimize those unsigned chars
back to bit fields.
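
A sketch of the two layouts in question.  The struct and member names
are invented for illustration and are not the af_unix ones; the first
struct is the risky shape, the second is the byte-sized replacement
with the kind of comment I mean.  It assumes a 4-byte unsigned int.

```c
#include <assert.h>
#include <stddef.h>

/* Risky with an affected gcc: a store to 'a' or 'b' may be done as a
   64-bit read-modify-write that also rewrites 'lock'. */
struct with_bitfields {
    unsigned int lock;       /* stands in for a 32-bit lock word */
    unsigned int a : 1;
    unsigned int b : 1;
};

struct with_bytes {
    unsigned int lock;
    /* Do NOT turn these back into bit fields: a bit field store can
       clobber the adjacent lock.  See gcc PR52080. */
    unsigned char a;
    unsigned char b;
};

/* True if two byte offsets fall in the same aligned 64-bit word. */
static int same_word64(size_t off1, size_t off2)
{
    return off1 / 8 == off2 / 8;
}

void layout_self_check(void)
{
    /* The flags still share a 64-bit word with the lock, which is why
       the width of the store matters. */
    assert(same_word64(offsetof(struct with_bytes, lock),
                       offsetof(struct with_bytes, a)));
    /* Each flag now has its own byte. */
    assert(offsetof(struct with_bytes, a) != offsetof(struct with_bytes, b));
}
```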

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH net-next] af_unix: fix a fatal race with bit fields

2013-04-30 Thread Alan Modra
On Tue, Apr 30, 2013 at 07:24:20PM -0700, Eric Dumazet wrote:
>       li 11,1
>       ld 0,0(9)
>       rldimi 0,11,31,32
>       std 0,0(9)
>       blr
>       .ident  "GCC: (GNU) 4.6.3"
>
> You can see ld 0,0(9) is used : its a 64 bit load.

Yup.  This is not a powerpc64 specific problem.  See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
Fixed in 4.8.0 and 4.7.3.

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure after merge of the final tree

2012-07-06 Thread Alan Modra
On Fri, Jul 06, 2012 at 01:01:37PM +1000, Stephen Rothwell wrote:
> solos-pci.c:(.text+0x1ff923c): relocation truncated to fit: R_PPC64_REL24
> ^
>
> I assume at this point, we are just too large.

Yeah, but not in total.  I didn't see any of these in the allyes
kernel I built with our proof of concept hack to avoid ld -r.  I think
you'll find that these are all from ld -r output, as I assume no one
in kernel land writes drivers or whatever with 33M of text in a single
file.  Branches in that monstrous section can't even reach the
trampolines that ld inserts to extend branch reach.  Did I mention
that ld -r is a bad idea?

One workaround might be to compile with -ffunction-sections.

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure after merge of the final tree

2012-07-05 Thread Alan Modra
On Thu, Jul 05, 2012 at 06:33:45PM +1000, Stephen Rothwell wrote:
> powerpc64-linux-ld: drivers/built-in.o: In function `.gpiochip_is_requested':
> (.text+0x4): sibling call optimization to `_savegpr0_29' does not allow
> automatic multiple TOCs; recompile with -mminimal-toc or
> -fno-optimize-sibling-calls, or make `_savegpr0_29' extern
>
> I got more than 6 of these messages before I killed the link. :-(  I
> am not sure what has changed to do this, but it may have been masked for
> the past few releases due to other linking problems.

Let me guess.  You're using bleeding edge gcc but not binutils.

a) Recent gcc has fixed prologue and epilogue generation which now
   properly makes use of out-of-line register save and restore
   functions when compiling with -Os.
b) Recent ld doesn't emit out-of-line save/restore function for ld -r,
   but yours does.  You need my 2012-06-22 patch.
c) Kernel uses ld -r for packaging.

(b) and (c) together mean you get a definition for _savegpr0_29 munged
together with other functions.  That's bad.  If _savegpr0_29 wasn't
emitted until the final link stage then it would be in a code section
containing just save/restore functions.  ld will analyse that section
and notice the absence of toc relocations; functions therein don't
use the toc and can thus be called from any toc group without needing
a toc adjusting stub.  In your case _savegpr0_29 is in a section that
has toc relocations (from normal compiled code), so ld decides that
any function in that section must have a proper value for the toc
register.  But calls to _savegpr0_29 don't have a following nop to
overwrite with a toc restore insn, hence the ld error.

Score another black mark for ld -r.

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure after merge of the final tree

2012-07-05 Thread Alan Modra
On Fri, Jul 06, 2012 at 10:21:51AM +1000, Stephen Rothwell wrote:
> which have now been fixed.  So would a simple patch that puts the
> _savegpr etc functions in their own section (defined how?) fix this for
> us?

Ah, the kernel provides its own save/restore functions, and these get
mashed into a .text containing normal functions with toc references by
ld -r.  Well, you could stop using ld -r.  Otherwise, try

 .section .text.save.restore,"ax",@progbits

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure after merge of the final tree (powerpc related)

2012-06-21 Thread Alan Modra
On Thu, Jun 21, 2012 at 05:38:27PM +1000, Michael Ellerman wrote:
> On Thu, 2012-06-21 at 17:07 +1000, Michael Ellerman wrote:
> > On Thu, 2012-06-21 at 16:24 +1000, Benjamin Herrenschmidt wrote:
> > > On Thu, 2012-06-21 at 15:36 +1000, Michael Ellerman wrote:

powerpc64-linux-ld: 
/src/next/net/openvswitch/vport-netdev.c:189:(.text+0x89b990): 
sibling call optimization to `_restgpr0_28' does not allow 
automatic multiple TOCs;
recompile with -mminimal-toc or -fno-optimize-sibling-calls, or 
make `_restgpr0_28' extern


Linker bug.  That's not a sibling call, but a normal function return
via an out-of-line register restore function.  Will fix.  I'm a bit
surprised to see this with gcc-4.6 though.  Or does this gcc-4.6 have
some of my recent mainline gcc patches enabling out-of-line
save/restore functions for -Os?

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure after merge of the final tree (powerpc related)

2012-06-21 Thread Alan Modra
On Thu, Jun 21, 2012 at 08:18:39PM +0930, Alan Modra wrote:
> Linker bug.  That's not a sibling call, but a normal function return
> via an out-of-line register restore function.

I couldn't see how this might be occurring, then I remembered the
kernel has this horrible practise of using ld -r to package object
files.  So linker generated functions might be munged together with
other functions.  Does this help?  (It won't if the kernel is
providing its own save/restore functions.)

Index: bfd/elf64-ppc.c
===
RCS file: /cvs/src/src/bfd/elf64-ppc.c,v
retrieving revision 1.387
diff -u -p -r1.387 elf64-ppc.c
@@ -6494,9 +6494,10 @@ ppc64_elf_func_desc_adjust (bfd *obfd AT
 
   /* Provide any missing _save* and _rest* functions.  */
   htab->sfpr->size = 0;
-  for (i = 0; i < sizeof (funcs) / sizeof (funcs[0]); i++)
-    if (!sfpr_define (info, &funcs[i]))
-      return FALSE;
+  if (!info->relocatable)
+    for (i = 0; i < sizeof (funcs) / sizeof (funcs[0]); i++)
+      if (!sfpr_define (info, &funcs[i]))
+        return FALSE;
 
   elf_link_hash_traverse (&htab->elf, func_desc_adjust, info);
 


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] powerpc: Optimise per cpu accesses on 64bit

2010-06-01 Thread Alan Modra
On Tue, Jun 01, 2010 at 05:05:20PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2010-06-01 at 14:45 +1000, Anton Blanchard wrote:
> > Now we dynamically allocate the paca array, it takes an extra load
> > whenever we want to access another cpu's paca. One place we do that a lot
> > is per cpu variables. A simple example:
>
> Can't we dedicate a GPR instead ? Or it isn't worth it ? Something we
> almost never use in the kernel like r12 ?

Not r12.  It is used in function prologue and epilogue code.  If you
want a dedicated gpr I think you'll need to use (and lose) one of the
non-volatile regs.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC ftrace function trace optimisation

2010-04-28 Thread Alan Modra
On Thu, Apr 29, 2010 at 11:02:47AM +1000, Benjamin Herrenschmidt wrote:
> From a quick test it appears that this only works with -m64, not -m32.
> Alan is that correct ?

Yes.

> Any chance you can fix that in future gcc versions ?

No need really.  32-bit _mcount calls happen before the prologue
anyway.

> > In fact if we are careful when switching to the new mcount ABI and don't
> > rely on the store of r0, we could probably optimise this even further in a
> > future gcc and remove the store completely. mcount would be 2 instructions:
> >
> >       mflr    r0
> >       bl      8 <.foo+0x8>

Yeah.  Also, I should have used a different name for this mcount from
the standard 64-bit mcount.

-- 
Alan Modra
Australia Development Lab, IBM


Re: binutils 2.19 issue with kernel link

2009-07-10 Thread Alan Modra
On Fri, Jul 10, 2009 at 10:27:26AM -0500, Kumar Gala wrote:
> binutils-2.19         _end is what we expect
> binutils-2.19.1       _end is what we expect
> binutils-2.19.50.0.1  _end is what we expect
> binutils-2.19.51.0.1  _end is 1000
>
> From the release notes:
>
> binutils-2.19.50.0.1 is based on CVS binutils 2008 1007
> binutils-2.19.51.0.1 is based on CVS binutils 2009 0106

Yes, I already have good reason to suspect this patch

2008-10-22  Alan Modra  <amo...@bigpond.net.au>

        * ldlang.c (lang_output_section_find_by_flags): Handle non-alloc
        sections.
        * emultempl/elf32.em (enum orphan_save_index): Add orphan_nonalloc.
        (hold): Likewise.
        (gld${EMULATION_NAME}_place_orphan): Handle non-alloc orphans.

causes the change in linker behaviour.  Did you try the patch I posted?

-- 
Alan Modra
Australia Development Lab, IBM


Re: binutils 2.19 issue with kernel link

2009-07-10 Thread Alan Modra
On Sat, Jul 11, 2009 at 09:35:03AM +0930, Alan Modra wrote:
> Did you try the patch I posted?

/me reads other email.  I see you did.  Applying.

-- 
Alan Modra
Australia Development Lab, IBM


Re: binutils 2.19 issue with kernel link

2009-07-09 Thread Alan Modra
On Thu, Jul 09, 2009 at 02:31:53PM -0500, Edmar Wienskoski-RA8797 wrote:
 Kumar Gala wrote:

 On Jul 8, 2009, at 11:40 PM, Alan Modra wrote:

 On Wed, Jul 08, 2009 at 10:52:59PM -0500, Kumar Gala wrote:
 To further verify this if I switch the -me500 to -mspe and build things
 seem to be ok.  This further points at some APU section related bug.

 Like omitting .PPC.EMB.apuinfo from your kernel link script?  See the
 ld info doc on orphan sections.

 Ok, not terribly enlightening, but why would .PPC.EMB.apuinfo sections  
 be different than something like .debug sections which we also dont  
 list in the linker script.

Because .PPC.EMB.apuinfo is a note section rather than a debugging
section.  Orphan non-alloc note sections will be placed before
.comment or debug sections while orphan debug sections go right to the
end.  Now, I'll bet you don't have .comment in your script so it too
is an orphan.

 I understand your arguments, but there is something inconsistent about this.
 If I change the script to be:
_end3 = . ;
. = _end3;
. = ALIGN(PAGE_SIZE);
_end = . ;
PROVIDE32 (end = .);
 }
 The result is corrected:
 c067f678 A _end3
 c068 A _end

 Why the apuinfo section with zero VMA sometimes interfere with . and  
 sometimes not ?

That is weird.  You'll need to run ld under gdb to find out.  I'd
expect the orphan apuinfo section to be placed before the first
assignment to dot in both cases, or at the end of the script in both
cases, with placement depending on whether you hit an orphan .comment
or debug section before the orphan .PPC.EMB.apuinfo.


The underlying reason is that if you provide a link script that
doesn't mention a section, then ld is free to place that section
anywhere.  Quoting from the ld info doc:

Orphan sections are sections present in the input files which are not
explicitly placed into the output file by the linker script.  The
linker will still copy these sections into the output file, but it has
to guess as to where they should be placed.  The linker uses a simple
heuristic to do this.  It attempts to place orphan sections after
non-orphan sections of the same attribute, such as code vs data,
loadable vs non-loadable, etc.  If there is not enough room to do this
then it places at the end of the file.

For ELF targets, the attribute of the section includes section type as
well as section flag.


That's all as expected, and in your case you don't have a section with
the same attribute as .PPC.EMB.apuinfo so it should go to the end.
However, you have multiple orphan sections being added.  After the
first of these is added, you have sections after your end symbol
assignments, and when there are assignments it gets tricky.  The
relevant part of the ld info doc says:


Setting symbols to the value of the location counter outside of an
output section statement can result in unexpected values if the linker
needs to place orphan sections.  For example, given the following:

SECTIONS
{
    start_of_text = . ;
    .text: { *(.text) }
    end_of_text = . ;

    start_of_data = . ;
    .data: { *(.data) }
    end_of_data = . ;
}

If the linker needs to place some input section, e.g. .rodata, not
mentioned in the script, it might choose to place that section between
.text and .data.  You might think the linker should place .rodata on
the blank line in the above script, but blank lines are of no
particular significance to the linker.  As well, the linker doesn't
associate the above symbol names with their sections.  Instead, it
assumes that all assignments or other statements belong to the
previous output section, except for the special case of an assignment
to '.'.  I.e., the linker will place the orphan .rodata section as if
the script was written as follows:

SECTIONS
{
    start_of_text = . ;
    .text: { *(.text) }
    end_of_text = . ;

    start_of_data = . ;
    .rodata: { *(.rodata) }
    .data: { *(.data) }
    end_of_data = . ;
}

This may or may not be the script author's intention for the value of
start_of_data.  One way to influence the orphan section placement is
to assign the location counter to itself, as the linker assumes that
an assignment to '.' is setting the start address of a following
output section and thus should be grouped with that section.  So you
could write:

SECTIONS
{
    start_of_text = . ;
    .text: { *(.text) }
    end_of_text = . ;

    . = . ;
    start_of_data = . ;
    .data: { *(.data) }
    end_of_data = . ;
}

Now, the orphan .rodata section will be placed between end_of_text and
start_of_data.


Putting this all together:
a) ld places .comment or some debug section at end
b) ld places .PPC.EMB.apuinfo before the other orphan section, and
thinks your assignments to dot belong with the other orphan, so 
.PPC.EMB.apuinfo goes before them.

As no doubt you've already found, you can fix your link script by not
using . = ALIGN(PAGE_SIZE); instead use sym = ALIGN(PAGE_SIZE).

Hmm, having said all that, the following

Re: binutils 2.19 issue with kernel link

2009-07-09 Thread Alan Modra
On Thu, Jul 09, 2009 at 02:31:53PM -0500, Edmar Wienskoski-RA8797 wrote:
> I understand your arguments, but there is something inconsistent about this.
> If I change the script to be:
>         _end3 = . ;
>         . = _end3;
>         . = ALIGN(PAGE_SIZE);
>         _end = . ;
>         PROVIDE32 (end = .);
> }
> The result is corrected:
> c067f678 A _end3
> c068 A _end
>
> Why the apuinfo section with zero VMA sometimes interfere with . and
> sometimes not ?

I said it was weird in my last email.  Not so.  The orphan gets placed
between

   _end3 = . ;
   . = _end3;

So dot is restored after the orphan section sets it.

-- 
Alan Modra
Australia Development Lab, IBM


Re: binutils 2.19 issue with kernel link

2009-07-08 Thread Alan Modra
On Wed, Jul 08, 2009 at 05:41:39PM -0500, Kumar Gala wrote:
> If we modify the linker script:
>
>   _end2 = .;
>   _end3 = ALIGN(4096);
>   _end4 = ALIGN(PAGE_SIZE);
>   . = ALIGN(PAGE_SIZE);
>   _end = . ;
>   PROVIDE32 (end = .);
>
> and the result is:
>
> 1000 A _end
> c067f678 A _end2
> c068 A _end3
> c068 A _end4

Possibly some section with a zero vma is being placed before _end.
Generate a link map to see if this is so.

-- 
Alan Modra
Australia Development Lab, IBM


Re: binutils 2.19 issue with kernel link

2009-07-08 Thread Alan Modra
On Wed, Jul 08, 2009 at 10:52:59PM -0500, Kumar Gala wrote:
> To further verify this if I switch the -me500 to -mspe and build things
> seem to be ok.  This further points at some APU section related bug.

Like omitting .PPC.EMB.apuinfo from your kernel link script?  See the
ld info doc on orphan sections.

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: build failure

2009-04-07 Thread Alan Modra
On Wed, Apr 08, 2009 at 02:04:07PM +1000, Stephen Rothwell wrote:
>   LD  vmlinux.o
> powerpc-linux-ld: TOC section size exceeds 64k

I'm starting to sound like a cracked record, but I'll say it again:
ld -r does not merely package together object files, it transforms
them.   Try using thin archives instead.

-- 
Alan Modra
Australia Development Lab, IBM
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [Fwd: finding fuction names]

2008-10-17 Thread Alan Modra
On Fri, Oct 17, 2008 at 01:06:29PM -0500, Steven Munroe wrote:
> Alan can you respond to this. This XLC but I suspect it is common with
> GCC and the PPC64 ABI.

Indeed, this is due to the ABI.

Content-Description: Forwarded message - finding fuction names
 To: linuxppc-dev@ozlabs.org
 Subject: finding fuction names
 From: [EMAIL PROTECTED]
 Date: Fri, 17 Oct 2008 11:07:11 -0400
 Cc: [EMAIL PROTECTED]
 
 /*
 
 Hi,
 
We have code in our product that produces stacktraces.  Part of the
 implementation of this code runs nm to find all of the entry points in 
 our libraries.
 
Using the 7.0 version of this compiler to compile the simple code below
 
 xlC_r -c -q64 tt.cxx 
  nm -C tt.o
  U __IBMCPlusPlusExceptionV1
  D test()
  T .test()
 
Using the 8.0 version of this compiler to compile the simple code below
 
 xlC_r -c -q64 tt.cxx 
 nm -C tt.o
  U __IBMCPlusPlusExceptionV2
  D test()
 
 
 We grep the nm output for text symbols which are labeled with 'T'.
 Not all function names are not labeled as text in objects compiled with 
 8.0.
 Is there another way to find all of the fuction names in a library or
 a compiler switch that will put all fuctions into the text segment ?

The PowerPC64 ABI uses function descriptors, stored in the .opd
section, a data section.  The address of a function is that of its
descriptor, which nm correctly shows as a 'D' type symbol.  A symbol
marking the start of the function code is unnecessary since you can
find that from the descriptor, so later compilers omit the dot
symbol.

nm --synthetic will look up the descriptor for you and display fake
dot symbols marking the start of each function's code.
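
As a rough picture of what that lookup involves, here is a C sketch of
an .opd entry.  The struct and function names are invented for the
example; the three-doubleword layout is the one the PowerPC64 ABI
describes for function descriptors.

```c
#include <assert.h>
#include <stdint.h>

/* One function descriptor as stored in .opd (names are illustrative). */
struct func_desc {
    uint64_t entry;  /* address of the function's first instruction */
    uint64_t toc;    /* TOC base the function expects in r2 */
    uint64_t env;    /* environment pointer, unused by C */
};

/* The code address that a synthesized dot symbol would carry. */
static uint64_t code_address(const struct func_desc *fd)
{
    return fd->entry;
}

void func_desc_self_check(void)
{
    /* Made-up addresses, purely for the example. */
    struct func_desc fd = { 0x10000400, 0x10018000, 0 };
    assert(code_address(&fd) == 0x10000400);
}
```

So a tool that only has the 'D' symbol can still find the text address
by reading the first doubleword of the descriptor.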

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: powerpc build failure

2008-06-12 Thread Alan Modra
On Thu, Jun 12, 2008 at 05:05:05PM +1000, Stephen Rothwell wrote:
> /usr/bin/ld: --relax and -r may not be used together

Can't you arrange so that --relax is only used on the final link?  Of
course, if you have mashed all the input .text sections together with
ld -r, resulting in one monster .text then --relax quite likely
won't help.  ld can't break apart an input section to insert
trampolines in the middle.

-- 
Alan Modra
Australia Development Lab, IBM


Re: linux-next: powerpc build failure

2008-06-11 Thread Alan Modra
On Thu, Jun 12, 2008 at 11:38:19AM +1000, Paul Mackerras wrote:
> Direct unconditional branches, including procedure calls, can only
> reach +/- 32MB from the address of the branch on powerpc.  So if the
> image grows to more than 32MB of text there is a problem unless the
> linker is smart enough to insert trampolines.  I don't know whether
> GNU ld is that smart.

It is, but you need to pass --relax to enable generation of the
trampolines.
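
The reach limit Paul describes can be written as a small range check.
This is only a sketch with an invented function name, not what ld does
internally: the branch holds a 24-bit word offset, which gives a signed
26-bit byte displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Can a direct branch at 'from' reach 'to' without a trampoline?
   The displacement must be word aligned and lie in
   -0x2000000 .. 0x1fffffc (that is, +/- 32MB). */
static int branch_reaches(uint64_t from, uint64_t to)
{
    int64_t disp = (int64_t)(to - from);

    return (disp & 3) == 0
        && disp >= -0x2000000LL
        && disp <= 0x1fffffcLL;
}

void branch_self_check(void)
{
    assert(branch_reaches(0x1000, 0x1000 + 0x1fffffc));
    assert(!branch_reaches(0x1000, 0x1000 + 0x2000000));
}
```

Any call for which this fails is one where the linker has to insert a
trampoline within reach of the branch.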

-- 
Alan Modra
Australia Development Lab, IBM


Re: Problems with allyesconfig kernel build

2007-10-23 Thread Alan Modra
On Tue, Oct 23, 2007 at 02:02:31PM +1000, Stephen Rothwell wrote:
> Anyone have any ideas?

The segfault with --emit-relocs and complaints about .fixup are linker
bugs.  I'm about to commit fixes for both of these problems.

-- 
Alan Modra
Australia Development Lab, IBM