Graeme Russ <graeme.r...@gmail.com> wrote on 10/10/2009 13:21:10:
>
> On Sat, Oct 10, 2009 at 9:47 PM, Joakim Tjernlund
> <joakim.tjernl...@transmode.se> wrote:
> >
> > Graeme Russ <graeme.r...@gmail.com> wrote on 10/10/2009 12:38:19:
> >>
> >> On Sat, Oct 10, 2009 at 8:27 PM, Joakim Tjernlund
> >> <joakim.tjernl...@transmode.se> wrote:
> >> >
> >> > Graeme Russ <graeme.r...@gmail.com> wrote on 10/10/2009 10:46:52:
> >> >>
> >> >> On Sat, Oct 10, 2009 at 7:07 PM, Joakim Tjernlund
> >> >> <joakim.tjernl...@transmode.se> wrote:
> >> >> > Graeme Russ <graeme.r...@gmail.com> wrote on 10/10/2009 06:43:52:
> >> >> >>
> >> >> >> On Fri, Oct 9, 2009 at 10:12 AM, Joakim Tjernlund
> >> >> >> <joakim.tjernl...@transmode.se> wrote:
> >> >> >> >>
> >> >> >> >> On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell
> >> >> >> >> <jwilliamcampb...@comcast.net> wrote:
> >> >> >> >> > Graeme Russ wrote:
> >> >> >> >> >>
> >> >> >> >> >> On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell
> >> >> >> >> >> <jwilliamcampb...@comcast.net> wrote:
> >> >> >> >> >>
> >> >> >> >> >>> Graeme Russ wrote:
> >> >> >> >> >>>
> >> >> >> >> >>>> Out of curiosity, I wanted to see just how much of a size
> >> >> >> >> >>>> penalty I am incurring by using gcc -fpic / ld -pic on my
> >> >> >> >> >>>> x86 u-boot build. Here are the results (fixed width font
> >> >> >> >> >>>> will help - its space, not tab, formatted):
> >> >> >> >> >>>>
> >> >> >> >> >>>> Section              non-reloc   reloc
> >> >> >> >> >>>> ---------------------------------------
> >> >> >> >> >>>> .text                000118c4    000137fc  <- 0x1f38 bytes (~8kB) bigger
> >> >> >> >> >>>> .rodata              00005bad    000059d0
> >> >> >> >> >>>> .interp              n/a         00000013
> >> >> >> >> >>>> .dynstr              n/a         00000648
> >> >> >> >> >>>> .hash                n/a         00000428
> >> >> >> >> >>>> .eh_frame            00003268    000034fc
> >> >> >> >> >>>> .data                00000a6c    000001dc
> >> >> >> >> >>>> .data.rel            n/a         00000098
> >> >> >> >> >>>> .data.rel.ro.local   n/a         00000178
> >> >> >> >> >>>> .data.rel.local      n/a         000007e4
> >> >> >> >> >>>> .got                 00000000    000001f0
> >> >> >> >> >>>> .got.plt             n/a         0000000c
> >> >> >> >> >>>> .rel.got             n/a         000003e0
> >> >> >> >> >>>> .rel.dyn             n/a         00001228
> >> >> >> >> >>>> .dynsym              n/a         00000850
> >> >> >> >> >>>> .dynamic             n/a         00000080
> >> >> >> >> >>>> .u_boot_cmd          000003c0    000003c0
> >> >> >> >> >>>> .bss                 00001a34    00001a34
> >> >> >> >> >>>> .realmode            00000166    00000166
> >> >> >> >> >>>> .bios                0000053e    0000053e
> >> >> >> >> >>>> =======================================
> >> >> >> >> >>>> Total                0001d5dd    00022287  <- 0x4caa bytes (~19kB) bigger
> >> >> >> >> >>>>
> >> >> >> >> >>>> Its more than a 16% increase in size!!!
> >> >> >> >> >>>>
> >> >> >> >> >>>> .text accounts for a little under half of the total bloat,
> >> >> >> >> >>>> and of that, the crude dynamic loader accounts for only
> >> >> >> >> >>>> 341 bytes
> >> >> >> >> >>>>
> >> >> >> >> >>>
> >> >> >> >> >>> Hi Graeme,
> >> >> >> >> >>> I would be interested in a third option (column), the x86
> >> >> >> >> >>> build with just -mrelocateable but NOT -fpic. It will not be
> >> >> >> >> >>> definitive because there will be extra code that references
> >> >> >> >> >>> the GOT and missing code todo some of the relocation, but it
> >> >> >> >> >>> would still be interesting.
> >> >> >> >> >>>
> >> >> >> >> >>
> >> >> >> >> >> x86 does not have -mrelocatable. This is a PPC only option :(
> >> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > Hi Graeme,
> >> >> >> >> > You are unfortunately correct. However, I wonder if we can get
> >> >> >> >> > essentially the same result by executing the final ld step
> >> >> >> >> > with the --emit-relocs switch included. This may also include
> >> >> >> >> > some "extra" sections that we would want to strip out, but if
> >> >> >> >> > it works, it could give all ELF-based systems a way to a
> >> >> >> >> > relocatable u-boot.
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> I don't think --emit-relocs is necessary with -pic. I haven't
> >> >> >> >> gone through all the permutations to see if there is a smaller
> >> >> >> >> option, but gcc -fpic and ld -pie creates enough information to
> >> >> >> >> perform relocation on the x86 platform
> >> >> >> >
> >> >> >> > Try -fvisibility=hidden
> >> >> >>
> >> >> >> Thanks - Shaved another 2539 bytes off the binary
> >> >> >>
> >> >> >> Also found out how to get rid of .eh_frame (crept in when I
> >> >> >> upgraded to gcc 4.4.1) with -fno-dwarf2-cfi-asm, so that shaves
> >> >> >> another 13452 bytes
> >> >> >>
> >> >> >> Total saving of 15.6k
> >> >> >
> >> >> > Great, so now you are back at just a few percent added I guess?
> >> >> >
> >> >>
> >> >> Not really - The .eh_frame saving applies to both relocated and non
> >> >> relocated builds
> >> >
> >> > OK, so you didn't use PIC before at all?
> >> >
> >> > Anyway I think you can do more. Using -Bsymbolic you should get
> >> > away with RELATIVE relocs only and be able to skip a lot of segments
> >> > above.
> >> > Have a look at uClibc ldso/ldso/dl-startup.c
> >> >
> >>
> >> My build options thus far are:
> >>
> >> PLATFORM_RELFLAGS += -fpie -fvisibility=hidden
> >> PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
> >> PLATFORM_LDFLAGS += -pie
> >>
> >> -fpic / -pic make no difference
> >
> > not on x86, on ppc it is a big difference.
> >
> >>
> >> Interestingly, -Bsymbolic adds exactly 8 bytes to .dynamic, but doesn't
> >> change the size of any other section
> >>
> >> Pulling apart the relocation sections, it seems that all relocations are
> >> already RELATIVE even without -Bsymbolic
> >
> > Ah, that is because you built an exe with -pie
> > Then you should be able to drop everything but the RELATIVE
> > from the linking, or almost in any case.
> >
> > Jocke
> >
>
> Hmm, so its seems I may have hit the limit. I tried:
>
> PLATFORM_LDFLAGS += -r --emit-relocs
>
> but there is not enough information left to complete the relocation. It
> seems as though I need .rel.got, .got.plt, .dynsym and .rel.dyn in order
> to find the actual bytes that need modifying (it also seems to mess with
> the size of the stripped binary for some reason)
>
> Looks like I'll have to proceed with my original plan - a bit bloated,
> but it works
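
For reference, the "RELATIVE relocs only" case discussed above needs very
little loader code. The following is a minimal sketch in C, not the actual
u-boot x86 loader. The names rel32_t and apply_relative_relocs are made up
for illustration, the image is assumed to be linked at address 0 so that
r_offset can be used as a plain offset into the copy, and only
R_386_RELATIVE entries (type 8 in the i386 ELF ABI) are handled. Any other
relocation type would need a symbol lookup and is simply skipped here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_386_RELATIVE 8u   /* relocation type value from the i386 ELF psABI */

/* Same layout as Elf32_Rel: where to patch, then type/symbol info. */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
} rel32_t;

/*
 * Hypothetical helper: apply every R_386_RELATIVE entry of a .rel.dyn-style
 * table to an image that has been copied to a new base address.
 * Assumption: the image was linked at address 0, so r_offset is an offset
 * into `image`, and `delta` is simply the new base address.
 * Returns the number of 32-bit words that were patched.
 */
static size_t apply_relative_relocs(uint8_t *image, size_t image_size,
                                    const rel32_t *rel, size_t count,
                                    uint32_t delta)
{
    size_t patched = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t word;

        if ((rel[i].r_info & 0xffu) != R_386_RELATIVE)
            continue;   /* would need .dynsym; not handled in this sketch */
        if (image_size < sizeof word ||
            rel[i].r_offset > image_size - sizeof word)
            continue;   /* entry points outside the image, ignore it */

        /* REL format: the addend is stored in the word being patched. */
        memcpy(&word, image + rel[i].r_offset, sizeof word);
        word += delta;
        memcpy(image + rel[i].r_offset, &word, sizeof word);
        patched++;
    }
    return patched;
}
```

If every entry in the relocation tables really is RELATIVE, a loop of this
shape never touches a symbol table, which is why the symbol-related
sections look like candidates for dropping.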

Relocation costs :(

I am not sure why you need .got.plt; it should be empty. What is in it?
Same with .dynsym, what is in it? Memory fails me, but since u-boot is a
freestanding app, I think these two might not be needed. Perhaps there are
weak unresolved syms in there?

 Jocke
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot
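
One way to check the "weak unresolved syms" guess is to walk the .dynsym
table and count undefined weak entries. A rough sketch in C follows,
assuming the table has already been read into memory. sym32_t and
count_weak_undefined are hypothetical names; the struct mirrors the
Elf32_Sym layout, and the STB_WEAK and SHN_UNDEF values are the ones given
in the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

#define STB_WEAK  2u   /* symbol binding: weak */
#define SHN_UNDEF 0u   /* section index meaning "undefined" */

/* Same layout as Elf32_Sym. */
typedef struct {
    uint32_t st_name;
    uint32_t st_value;
    uint32_t st_size;
    uint8_t  st_info;
    uint8_t  st_other;
    uint16_t st_shndx;
} sym32_t;

/*
 * Hypothetical helper: count symbols that are both weak and undefined.
 * Entry 0 of an ELF symbol table is the reserved null symbol, so the
 * scan starts at index 1.
 */
static size_t count_weak_undefined(const sym32_t *sym, size_t count)
{
    size_t n = 0;

    for (size_t i = 1; i < count; i++) {
        unsigned bind = sym[i].st_info >> 4;   /* same as ELF32_ST_BIND */

        if (bind == STB_WEAK && sym[i].st_shndx == SHN_UNDEF)
            n++;
    }
    return n;
}
```

Running readelf -s on the linked ELF shows the same information without
writing any code.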