Hi Daniel,

On Monday, 19 June 2017 12:48:12 PDT Daniel Schwierzeck wrote:
> On 19.06.2017 at 20:53, Paul Burton wrote:
> > Hi Daniel,
> >
> > On Friday, 16 June 2017 15:48:06 PDT Daniel Schwierzeck wrote:
> >> On 16.06.2017 at 02:05, Paul Burton wrote:
> >>> U-Boot has up until now built with -fpic for the MIPS architecture,
> >>> producing position independent code which uses indirection through a
> >>> global offset table, making relocation fairly straightforward as it
> >>> simply involves patching up GOT entries.
> >>>
> >>> Using -fpic does however have some downsides. The biggest of these is
> >>> that generated code is bloated in various ways. For example, function
> >>> calls are indirected through the GOT & the t9 register:
> >>>
> >>>   8f998064 lw t9,-32668(gp)
> >>>   0320f809 jalr t9
> >>>
> >>> Without -fpic the call is simply:
> >>>
> >>>   0f803f01 jal be00fc04 <puts>
> >>>
> >>> This is more compact & faster (due to the lack of the load & the
> >>> dependency the jump has on its result). It is also easier to read &
> >>> debug because the disassembly shows what function is being called,
> >>> rather than just an offset from gp which would then have to be looked up
> >>> in the ELF to discover the target function.
> >>>
> >>> Another disadvantage of -fpic is that each function begins with a
> >>> sequence to calculate the value of the gp register, for example:
> >>>
> >>>   3c1c0004 lui gp,0x4
> >>>   279c3384 addiu gp,gp,13188
> >>>   0399e021 addu gp,gp,t9
> >>>
> >>> Without using -fpic this sequence no longer appears at the start of each
> >>> function, reducing code size considerably.
> >>>
> >>> This patch switches U-Boot from building with -fpic to building with
> >>> -fno-pic, in order to gain the benefits described above. The cost of
> >>> this is an extra step during the build process to extract relocation
> >>> data from the ELF & write it into a new .rel section in a compact
> >>> format, plus the added complexity of dealing with multiple types of
> >>> relocation rather than the single type that applied to the GOT. The
> >>> benefit is smaller, cleaner, more debuggable code. The relocate_code()
> >>> function is reimplemented in C to handle the new relocation scheme,
> >>> which also makes it easier to read & debug.
> >>>
> >>> Taking maltael_defconfig as an example the size of u-boot.bin built
> >>> using the Codescape MIPS 2016.05-06 toolchain (gcc 4.9.2, binutils
> >>> 2.24.90) shrinks from 254KiB to 224KiB.
> >>>
> >>> Signed-off-by: Paul Burton <paul.bur...@imgtec.com>
> >>> Cc: Daniel Schwierzeck <daniel.schwierz...@gmail.com>
> >>> Cc: u-boot@lists.denx.de
> >>> ---
> >>>
> >>>  arch/mips/Makefile.postlink    |  23 +++
> >>>  arch/mips/config.mk            |  19 +-
> >>>  arch/mips/cpu/start.S          | 130 -------------
> >>>  arch/mips/cpu/u-boot.lds       |  41 +---
> >>>  arch/mips/include/asm/relocs.h |  24 +++
> >>>  arch/mips/lib/Makefile         |   1 +
> >>>  arch/mips/lib/reloc.c          | 164 ++++++++++++++++
> >>>  common/board_f.c               |   2 +-
> >>>  tools/.gitignore               |   1 +
> >>>  tools/Makefile                 |   2 +
> >>>  tools/mips-relocs.c            | 426 +++++++++++++++++++++++++++++++++++++++++
> >>>  11 files changed, 656 insertions(+), 177 deletions(-)
> >>>  create mode 100644 arch/mips/Makefile.postlink
> >>>  create mode 100644 arch/mips/include/asm/relocs.h
> >>>  create mode 100644 arch/mips/lib/reloc.c
> >>>  create mode 100644 tools/mips-relocs.c
> >>
> >> there is a regression on qemu_mips when started with Qemu. The code
> >> execution hangs in an endless loop and doesn't reach the console prompt.
> >>
> >> I could debug it to following location:
> >>
> >> int bootm_find_images(int flag, int argc, char * const argv[])
> >> {
> >>     int ret;
> >>
> >>     /* find ramdisk */
> >>     ret = boot_get_ramdisk(argc, argv, &images, IH_INITRD_ARCH,
> >>                            &images.rd_start, &images.rd_end);
> >>     if (ret) {
> >>         puts("Ramdisk image is corrupt or invalid\n");
> >>         return 1;
> >>     }
> >>
> >>     ...
> >> }
> >>
> >> The code flow goes into the "if (ret)" branch. At the "return 1" the $ra
> >> register contains the address of bootm_find_images(). Thus the code is
> >> executed in an endless loop. I don't know yet if that's a miscalculated
> >> relocation or a stack overflow (maybe due to the changed 64KiB alignment
> >> of the U-Boot relocation address) because $ra in bootm_find_images() is
> >> loaded from stack:
> >>
> >> bfc08f1c <bootm_find_images>:
> >> bfc08f1c: 3c02bfc3 lui v0,0xbfc3
> >> bfc08f20: 27bdffe0 addiu sp,sp,-32
> >> ...
> >> bfc08f6c: 8fbf001c lw ra,28(sp)
> >> bfc08f70: 00601025 move v0,v1
> >> bfc08f74: 03e00008 jr ra
> >> bfc08f78: 27bd0020 addiu sp,sp,32
> >>
> >> This regression leads to broken pytest for qemu_mips and therefore to
> >> failing Travis CI [1]. I used the kernel.org MIPS toolchain with gcc-4.9
> >> (same as in Travis CI). Could you please have a look?
> >>
> >> [1] https://travis-ci.org/danielschwierzeck/u-boot/jobs/243667863
> >
> > This was due to an R_MIPS_26 relocation applied to a j instruction in
> > show_board_info, which overflowed into the opcode field & changed the j
> > into a jal.
> >
> > v3 should fix it.
>
> it works now, thanks for fixing.
>
> Are you aware of any changes regarding MIPS relocation in gcc-6.x and
> recent binutils or MIPS R6 which could cause other regressions? I still
> need to extend my Qemu test scripts with Boston and MIPS R6 and to
> get/build a gcc-6.x toolchain. Thus I can't verify this at the moment ;)

I'm not aware of any, though I haven't tested with gcc 6 either. MIPSr6
introduces a bunch of new PC-relative relocations, but since they're
PC-relative we don't need to care about them & the mips-relocs tool simply
ignores them.

I did test both r2 & r6 Malta builds, along with varying combinations of big &
little endian, 32 & 64 bit.

Thanks,
    Paul
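
For readers following the R_MIPS_26 overflow mentioned in the quoted exchange, here is a minimal sketch of how a carry out of the 26-bit jump target field can turn a j (opcode 2) into a jal (opcode 3), and how masking the field avoids it. This is not the code from the patch: the function names, the naive variant, and the example relocation address 0x87fb0000 are all assumptions made for illustration. The sketch also assumes the jump and its target stay within the same 256MiB segment after relocation.

```c
#include <assert.h>
#include <stdint.h>

/* j/jal layout: bits 31..26 hold the opcode, bits 25..0 the target word index. */
#define J_OPCODE_SHIFT	26
#define J_TARGET_MASK	0x03ffffffu

/* Extract the 6-bit major opcode (2 = j, 3 = jal). */
static unsigned int mips_opcode(uint32_t insn)
{
	return insn >> J_OPCODE_SHIFT;
}

/*
 * Hypothetical naive update: add the word offset to the whole instruction.
 * A carry out of bit 25 lands in the opcode field and increments it.
 */
static uint32_t reloc26_naive(uint32_t insn, uint32_t off)
{
	return insn + ((off >> 2) & J_TARGET_MASK);
}

/*
 * Hypothetical masked update: add the word offset to the target field only,
 * discard any carry, and leave the opcode bits untouched.
 */
static uint32_t reloc26_masked(uint32_t insn, uint32_t off)
{
	uint32_t target;

	/* Instructions are 4 bytes, so the relocation offset must be too. */
	assert((off & 3) == 0);

	target = (insn & J_TARGET_MASK) + (off >> 2);
	return (insn & ~J_TARGET_MASK) | (target & J_TARGET_MASK);
}
```

As a usage example, take `j 0xbfc08f1c`, which encodes as 0x0bf023c7, and an offset of 0xc83b0000 (moving an image linked at 0xbfc00000 to the assumed address 0x87fb0000). reloc26_naive() returns a word whose opcode is 3, so the jump has become a jal, whereas reloc26_masked() returns 0x09fee3c7, which is still a j and targets 0x87fb8f1c once combined with the upper bits of the new PC.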
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot