> After changing to _protected_ visibility, that kind of relocation is
> reduced, but I still don't know why more R_ARM_RELATIVE relocations are
> introduced.

To answer my own question: that is because the load address of the module has to be added to get the actual address of each virtual function.

So for Qt 5, should we change the visibility of all exported symbols to _protected_? Or is there still an existing use case for _default_ visibility?

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia....@qt-project.org [mailto:development-bounces+song.7.liu=nokia....@qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 6:02 PM
To: thiago.macie...@intel.com; development@qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

As far as I understand, R_ARM_ABS32 is about *referencing* the address of a function. For a C++ class with virtual functions, the virtual table cannot know the actual addresses of virtual functions that have _default_ visibility, so an R_ARM_ABS32 relocation is needed.

After changing to _protected_ visibility, that kind of relocation is reduced, but I still don't know why more R_ARM_RELATIVE relocations are introduced.

If anything is wrong, please correct me ;-)

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia....@qt-project.org [mailto:development-bounces+song.7.liu=nokia....@qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 4:13 PM
To: thiago.macie...@intel.com; development@qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

Hi,

I want to share some results about the relocations performed during loading (with RTLD_LAZY).

Relocation count for a single .so (libqt5), without optimization:
  R_ARM_GLOB_DAT:   1585
  R_ARM_RELATIVE:   9823
  R_ARM_ABS32:     19489
  R_ARM_JUMP_SLOT: 16998

Relocation count for a single .so (libqt5), with optimization:
  R_ARM_GLOB_DAT:   1578
  R_ARM_RELATIVE:  28227
  R_ARM_ABS32:       435
  R_ARM_JUMP_SLOT:   290

The only optimization done here is changing the visibility of exported symbols from "default" to "protected", thanks to Thiago's blog ;).

So:
- R_ARM_JUMP_SLOT relocations are reduced significantly, but these are only resolved at run time (because of RTLD_LAZY), so they are irrelevant to loading performance.
- R_ARM_RELATIVE relocations have increased, but this type of relocation is very fast.
- For loading time, the actual bottleneck is the R_ARM_ABS32 relocation, which is now reduced by around 97%!

In the end, the overall loading time is reduced from ~10-20s to ~1s...

But I still have some questions about the R_ARM_ABS32 relocation. It seems that if a function is virtual (with "default" visibility), it is added to .rel.dyn with the R_ARM_ABS32 type, for example:

  007b0124 0011a802 R_ARM_ABS32 00311e4b _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE

Could someone help with the following:
1. Why does a virtual function with "default" visibility need a relocation even if it is implemented inside the library?
2. When changed to "protected" visibility, I guess it is optimized to add a GOT.PLT entry as an R_ARM_RELATIVE relocation. Is that true?

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia....@qt-project.org [mailto:development-bounces+song.7.liu=nokia....@qt-project.org] On Behalf Of ext Thiago Macieira
Sent: Tuesday, July 24, 2012 10:29 PM
To: development@qt-project.org
Subject: Re: [Development] how to reduce the relocation <-- Use static qt libraries

On Tuesday, 24 July 2012 13.22.25, song.7....@nokia.com wrote:
> Yes, the bottleneck of the loading now is the local relocations
> instead of inter-library's.
>
> So what we want to do will be reducing the number of local relocation.
>
> Based on my understanding, this local relocation should be caused by
> the "symbol inter-positioning".

That's not exactly the case. Some types of relocations do permit symbol interpositioning. But some types of code require relocations even if they're not interposable. In my listing, all the "local" relocations are non-interposable.

More information:
http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
http://www.macieira.org/blog/2012/01/update-and-benchmark-on-the-dynamic-library-proposals/

> And from gcc option -Bsymbolic:
> "
> When creating a shared library, bind references to global symbols to
> the definition within the shared library, if any. Normally, it is
> possible for a program linked against a shared library to override the
> definition within the shared library. This option is only meaningful
> on ELF platforms which support shared libraries. "
>
> But for my case, it's not needed to override the definition within the
> libqt5.so.

Yes, it is. But you didn't realise that your code requires relocations even if the symbols can't be overridden.

In order to do that, you need a fully position *dependent* code that can't be moved. Executables on Linux are like that, but all libraries are movable in memory, even those compiled without -fPIC.

Since you're not running Linux, check if your OS supports that. Note that you'll need to know the exact load address at build time and that it must match the loaded address for the ROM if you want to do XIP.

> So, besides the prelink solution, I think the compiler (I mean
> armlink) should provide the ability to disable this symbol
> inter-positioning, just like the -Bsymbolic in gcc.
>
> Does anyone have idea from the compiler point of view ?

Sorry, you're barking up the wrong tree.

Your only option to reduce the number of relocations is to prelink to the exact load address.
There are two ways of doing that:

1) the ELF prelinker, which prelinks all relocations to a given address, but does still allow relocating if the shared object is loaded at a different address. The code is PIC, so XIP should work just fine.

2) compile without PIC and prelink at a specific address at link time, which means that the code must be loaded there or it will fail to run. This is the Windows DLL model.

>
> Also I see that Qt also uses the "-Bsymbolic-functions" to do some
> optimization, is that similar case to reduce the relocation ?

Yes. Read my blogs for more detail.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden

_______________________________________________
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development
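
For anyone who wants to reproduce the per-type relocation tallies reported earlier in this thread, here is a minimal Python sketch. It assumes the listing comes from `readelf -r` run on a 32-bit ARM shared object; the thread does not say which tool produced the counts, and the function names and the regular expression below are illustrative only. It also recomputes the R_ARM_ABS32 reduction from the reported numbers, which comes out consistent with the "around 97%" mentioned above.

```python
import re
from collections import Counter

# Assumption: input looks like `readelf -r` output for a 32-bit ELF, i.e.
# "<offset> <info> <type> [<sym value> <sym name>]" with 8-digit hex fields.
_RELOC_LINE = re.compile(r"^[0-9a-fA-F]{8}\s+[0-9a-fA-F]{8}\s+(R_[A-Z0-9_]+)\b")


def count_relocations(readelf_output: str) -> Counter:
    """Tally relocation entries by type (hypothetical helper)."""
    counts = Counter()
    for line in readelf_output.splitlines():
        match = _RELOC_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def reduction_percent(before: int, after: int) -> float:
    """Percentage by which a relocation count dropped."""
    return 100.0 * (before - after) / before


# Figures copied from the thread (libqt5, "default" vs "protected" visibility).
BEFORE = {"R_ARM_GLOB_DAT": 1585, "R_ARM_RELATIVE": 9823,
          "R_ARM_ABS32": 19489, "R_ARM_JUMP_SLOT": 16998}
AFTER = {"R_ARM_GLOB_DAT": 1578, "R_ARM_RELATIVE": 28227,
         "R_ARM_ABS32": 435, "R_ARM_JUMP_SLOT": 290}

# The example entry quoted in the thread, used here as sample input.
SAMPLE = (
    "007b0124  0011a802 R_ARM_ABS32       00311e4b   "
    "_ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE\n"
)

if __name__ == "__main__":
    print(count_relocations(SAMPLE))
    for reloc_type in BEFORE:
        change = reduction_percent(BEFORE[reloc_type], AFTER[reloc_type])
        print(f"{reloc_type}: {BEFORE[reloc_type]} -> {AFTER[reloc_type]} "
              f"({change:+.1f}% reduction)")
```

To tally a real library, the text printed by `readelf -r` on the shared object could be passed to `count_relocations`; a negative reduction (as for R_ARM_RELATIVE) simply means that the count went up.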