Re: [Development] how to reduce the relocation -- Use static qt libraries
Hi,

I want to share some results about relocation during loading (with RTLD_LAZY).

Relocation count for a single .so (libqt5), without optimization:

R_ARM_GLOB_DAT: 1585
R_ARM_RELATIVE: 9823
R_ARM_ABS32: 19489
R_ARM_JUMP_SLOT: 16998

Relocation count for a single .so (libqt5), with optimization:

R_ARM_GLOB_DAT: 1578
R_ARM_RELATIVE: 28227
R_ARM_ABS32: 435
R_ARM_JUMP_SLOT: 290

The only optimization done here is changing the visibility of exported symbols from default to protected, thanks to Thiago's blog ;).

So:
- The number of R_ARM_JUMP_SLOT relocations is reduced significantly, but these are only resolved at run time (because of RTLD_LAZY), so they are irrelevant to loading performance.
- The number of R_ARM_RELATIVE relocations increased, but this type of relocation is very fast.
- For loading time, the actual bottleneck is the R_ARM_ABS32 relocation, which is reduced by around 97% now!

Finally, the overall loading time is reduced from ~10-20s to ~1s...

But I still have some questions about the R_ARM_ABS32 relocation. It seems that if a function is virtual (with default visibility), it is added to .rel.dyn with type R_ARM_ABS32, for example:

007b0124 0011a802 R_ARM_ABS32 00311e4b _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE

Could someone help with the following:
1. Why does a virtual function with default visibility need a relocation even if it's implemented inside the library?
2. When changed to protected visibility, I guess it's optimized to add a GOT.PLT entry as an R_ARM_RELATIVE relocation, is that true?
Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia@qt-project.org [mailto:development-bounces+song.7.liu=nokia@qt-project.org] On Behalf Of ext Thiago Macieira
Sent: Tuesday, July 24, 2012 10:29 PM
To: development@qt-project.org
Subject: Re: [Development] how to reduce the relocation -- Use static qt libraries

On Tuesday, 24 July 2012 13.22.25, song.7@nokia.com wrote:
> Yes, the bottleneck of the loading now is the local relocations instead of
> inter-library's. So what we want to do will be reducing the number of local
> relocation. Based on my understanding, this local relocation should be
> caused by the symbol inter-positioning.

That's not exactly the case. Some types of relocations do permit symbol interpositioning. But some types of code require relocations even if they're not interposable. In my listing, all the local relocations are non-interposable.

More information:
http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
http://www.macieira.org/blog/2012/01/update-and-benchmark-on-the-dynamic-library-proposals/

> And from gcc option -Bsymbolic:
> When creating a shared library, bind references to global symbols to the
> definition within the shared library, if any. Normally, it is possible for a
> program linked against a shared library to override the definition within
> the shared library. This option is only meaningful on ELF platforms which
> support shared libraries.
>
> But for my case, it's not needed to override the definition within the
> libqt5.so.

Yes, it is. But you didn't realise that your code requires relocations even if the symbols can't be overridden.

In order to do that, you need a fully position *dependent* code that can't be moved. Executables on Linux are like that, but all libraries are movable in memory, even those compiled without -fPIC. Since you're not running Linux, check if your OS supports that. Note that you'll need to know the exact load address at build time and that it must match the loaded address for the ROM if you want to do XIP.

> So, besides the prelink solution, I think the compiler (I mean armlink)
> should provide the ability to disable this symbol inter-positioning, just
> like the -Bsymbolic in gcc. Does anyone have idea from the compiler point of
> view ?

Sorry, you're barking up the wrong tree.

Your only option to reduce the number of relocations is to prelink to the exact load address. There are two ways of doing that:

1) the ELF prelinker, which prelinks all relocations to a given address, but does still allow relocating if the shared object is loaded at a different address. The code is PIC, so XIP should work just fine.

2) compile without PIC and prelink at a specific address at link time, which means that the code must be loaded there or it will fail to run. This is the Windows DLL model.

> Also I see that Qt also uses the -Bsymbolic-functions to do some
> optimization, is that similar case to reduce the relocation ?

Yes. Read my blogs for more detail.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
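As a quick check of the "around 97%" figure, the relocation counts reported at the top of this message can be run through a small calculation. This is only a sketch: the numbers are copied from the message, while the names `RelocCount`, `kCounts`, `percentChange` and `total` are made up for illustration.

```cpp
#include <cassert>

// Relocation counts copied from the message above, before and after switching
// exported symbols from default to protected visibility.
struct RelocCount {
    const char *type;
    int before;
    int after;
};

const RelocCount kCounts[] = {
    {"R_ARM_GLOB_DAT",   1585,  1578},
    {"R_ARM_RELATIVE",   9823, 28227},
    {"R_ARM_ABS32",     19489,   435},
    {"R_ARM_JUMP_SLOT", 16998,   290},
};

// Percentage change from `before` to `after`; a negative value is a reduction.
double percentChange(int before, int after)
{
    assert(before != 0);
    return (after - before) * 100.0 / before;
}

// Sum of one column of the table: the "after" column if afterColumn is true,
// otherwise the "before" column.
int total(bool afterColumn)
{
    int sum = 0;
    for (const RelocCount &c : kCounts)
        sum += afterColumn ? c.after : c.before;
    return sum;
}
```

With these inputs, R_ARM_ABS32 drops by a bit under 98% and R_ARM_JUMP_SLOT by about 98%, while R_ARM_RELATIVE nearly triples. The total number of relocations only falls from 47895 to 30530, so the speed-up comes from the change in relocation type rather than from the raw count.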
Re: [Development] how to reduce the relocation -- Use static qt libraries
I think I know now: R_ARM_ABS32 is used when *referencing* the address of a function. For a C++ class with virtual functions, the virtual table does not know the actual address of virtual functions that have _default_ visibility, so an R_ARM_ABS32 relocation is needed.

After changing to _protected_ visibility, that kind of relocation is reduced, but I still don't know why more R_ARM_RELATIVE relocations are introduced.

If anything is wrong, please correct me ;-)

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia@qt-project.org [mailto:development-bounces+song.7.liu=nokia@qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 4:13 PM
To: thiago.macie...@intel.com; development@qt-project.org
Subject: Re: [Development] how to reduce the relocation -- Use static qt libraries

[...]
Re: [Development] how to reduce the relocation -- Use static qt libraries
> After changing to _protected_ visibility, that kind of relocation is reduced,
> but I still don't know why more R_ARM_RELATIVE relocations are introduced.

To answer my own question: that is because the load address of the module needs to be added to get the actual address of each virtual function.

So for Qt 5, should we change the visibility of all exported symbols to _protected_? Or are there still existing use cases that need _default_ visibility?

Thanks,
Song

-----Original Message-----
From: development-bounces+song.7.liu=nokia@qt-project.org [mailto:development-bounces+song.7.liu=nokia@qt-project.org] On Behalf Of Liu Song.7 (Nokia-MP/Beijing)
Sent: Sunday, July 29, 2012 6:02 PM
To: thiago.macie...@intel.com; development@qt-project.org
Subject: Re: [Development] how to reduce the relocation -- Use static qt libraries

[...]
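The difference between the two relocation types discussed in this thread can be sketched with a toy loader. This is a simplified model and not the code of any real dynamic linker: the types `Reloc`, `RelocType`, `SymbolTable` and the function `applyRelocations` are invented for illustration, the image is just a vector of 32-bit words, and the Thumb bit that the ARM ABI folds into R_ARM_ABS32 is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model of REL-style relocations: the addend is already stored in the
// word that gets patched.
enum class RelocType { Relative, Abs32 };

struct Reloc {
    RelocType type;
    std::size_t wordIndex; // which word of the image to patch
    std::string symbol;    // only meaningful for Abs32
};

// Symbol name -> run-time address, as resolved across all loaded modules.
using SymbolTable = std::map<std::string, std::uint32_t>;

// Applies every relocation to `image` and returns how many symbol lookups
// were needed. Throws std::out_of_range if an Abs32 symbol is unknown.
std::size_t applyRelocations(std::vector<std::uint32_t> &image,
                             const std::vector<Reloc> &relocs,
                             std::uint32_t loadBase,
                             const SymbolTable &globalSymbols)
{
    std::size_t lookups = 0;
    for (const Reloc &r : relocs) {
        assert(r.wordIndex < image.size());
        std::uint32_t &word = image[r.wordIndex];
        switch (r.type) {
        case RelocType::Relative:
            // Relative: just add the module's load address, no lookup.
            word += loadBase;
            break;
        case RelocType::Abs32:
            // Abs32: the symbol must first be found by name, because
            // another module may have interposed it.
            word += globalSymbols.at(r.symbol);
            ++lookups;
            break;
        }
    }
    return lookups;
}
```

In this model a vtable slot for a protected-visibility function would carry a `Relative` entry (one addition), whereas the same slot for a default-visibility function would carry an `Abs32` entry and pay for a name lookup. That matches the observation that the R_ARM_RELATIVE count went up while loading got faster.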
Re: [Development] QTextBoundaryFinder behavior change in Qt-5.0
On Saturday 28 July 2012 14:34:25 Konstantin Ritt wrote:
> I have a patch that changes QTBF's behavior so that "." won't be treated
> like a word at all.
> This patch hardly depends on some other patches that are in review stage,
> though.

The "hardly" and the "though" seem contradictory to me. Did you mean "This patch has a hard dependency on", or "This patch doesn't really depend on" (== hardly depends)?

> I would appreciate if you extend the QTBF autotests with some of Sonnet's
> testcases.

https://codereview.qt-project.org/#change,31717

--
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development
Re: [Development] QTextBoundaryFinder behavior change in Qt-5.0
This patch has a hard dependency on ... ;)

Konstantin

2012/7/29 David Faure fa...@kde.org:

[...]
Re: [Development] how to reduce the relocation -- Use static qt libraries
On Sunday, 29 July 2012 08.13.20, song.7@nokia.com wrote:
> - actually for loading time, the bottleneck is the R_ARM_ABS32 relocation,
> which is reduced around 97% now ! Finally the overall loading time is
> reduced from ~10-20s to ~1s...

Wow! Any chance you can blog about this somewhere? If you don't, I will based on your data.

> But I still have some question about the R_ARM_ABS32 relocation. It seems if
> the function is virtual (with default visibility), then it will be added
> into .rel.dyn as the R_ARM_ABS32 type, for example:
> 007b0124 0011a802 R_ARM_ABS32 00311e4b _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE
>
> Could someone help with below:
> 1. why the virtual function with default visibility needs relocation even if
> it's implemented inside ?

That's what default means: the symbol can be interposed by another, from a different library. It is possible that another library implements a different version of this function.

If it weren't like that, if it resolved to the local library, it would be behaviour of protected visibility. That is:

> 2. when changed to protected visibility, I guess it's optimized to add a
> GOT.PLT entry as a R_ARM_RELATIVE relocation, is that true ?

The virtual table gets a relative relocation, true. Virtual calls don't need PLT entries since they're always indirect. The PLT for a virtual function only shows up if that function is called non-virtually: that is, from the constructor, when the full class name was specified or when the compiler could prove what class the object is.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
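The distinction between virtual and non-virtual calls described above can be sketched in C++. This is an illustrative fragment and not Qt code: the classes `Dispatcher` and `CustomDispatcher`, the macro `DEMO_PROTECTED` and both helper functions are hypothetical, and the visibility attribute is only applied when building with a GCC-compatible compiler for an ELF target.

```cpp
#include <cassert>

// Hypothetical export macro: protected visibility on GCC-compatible ELF
// toolchains, nothing elsewhere.
#if defined(__GNUC__) && defined(__ELF__)
#  define DEMO_PROTECTED __attribute__((visibility("protected")))
#else
#  define DEMO_PROTECTED
#endif

class DEMO_PROTECTED Dispatcher {
public:
    virtual ~Dispatcher() = default;
    virtual int processEvents() { return 1; }
};

class DEMO_PROTECTED CustomDispatcher : public Dispatcher {
public:
    int processEvents() override { return 2; }
};

// Virtual call: dispatched indirectly through the object's virtual table,
// so no PLT entry is involved at the call site.
int callVirtually(Dispatcher &d)
{
    return d.processEvents();
}

// Non-virtual call: naming the class explicitly bypasses the virtual table
// and calls Dispatcher::processEvents directly. In a shared library this is
// the kind of call that can go through the PLT when the symbol has default
// visibility.
int callBaseDirectly(Dispatcher &d)
{
    return d.Dispatcher::processEvents();
}
```

Passing a `CustomDispatcher` to `callVirtually` reaches the override, while `callBaseDirectly` always reaches the base implementation. To see the relocations themselves, one would have to build such code into a shared library and inspect it with a tool such as `readelf -r`; the exact output depends on the toolchain and target, so none is shown here.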
[Development] Fwd:Proposal - QtSerialPort graduation from the Playground
-------- Forwarded message --------

Lars,

Thank you very much for your positive feedback. We are very pleased and look forward to the Qt5 beta version, so that we can continue the constructive development of QtSerialPort as an add-on.

Best regards,
Denis
Re: [Development] how to reduce the relocation -- Use static qt libraries
> Wow! Any chance you can blog about this somewhere? If you don't, I will
> based on your data.

No, I don't have a public blog. So please go ahead, thanks!

-----Original Message-----
From: development-bounces+song.7.liu=nokia@qt-project.org [mailto:development-bounces+song.7.liu=nokia@qt-project.org] On Behalf Of ext Thiago Macieira
Sent: Sunday, July 29, 2012 9:10 PM
To: development@qt-project.org
Subject: Re: [Development] how to reduce the relocation -- Use static qt libraries

[...]
Re: [Development] QFileSystemWatcher and Recursive Monitoring
BH

Alright, I'll begin trying to implement some of this, possibly inside the QFileSystemWatcher class itself. But don't expect much, this is opensauce after all, so I might come back in a few months and yawn: Oh right, QFileSystemWatcher... forgot about that :D

Though if some volunteers can help implement the Linux/X11 inotify and MacOSX FSEvents/Kqueue parts, that would help motivate a little (and get everything coded sooner), since I'm mainly familiar with the Windows ReadDirectoryChangesW end.

Cheers
-regedit