(Hit send too soon on my last mail and appear to have removed linaro-toolchain. Apologies to those who get duplicates.)
On Tue, Mar 06, 2012 at 04:00:36PM +0000, Andrew Stubbs wrote:
> Hi Alexandros,
>
> Could you use the linaro-toolchain list for stuff like this please?
> You're more likely to find somebody who knows the answer that way.
>
> I'm pretty sure the problem is not the compiler because, as far as I
> can see, both architectures' compilers emit ".weak" directives. If
> there is a problem, I'd say it's in the linker.
>
> Your test case gives two different addresses on Lucid x86, and on
> ARM (so you say, I've not tested it), but the same address twice on
> Precise. This is a surprising result. *I* would have expected that
> static values in different dlopen'd libraries would not be unified,
> but apparently they are ... sometimes.

I suspect this is a compiler bug around the handling of STB_GNU_UNIQUE
(gnu_unique_object in the assembler), which I suspect was invented to
solve the problem in this space, so it should all just work in the
GNU/Linux world. The assembler files on x86_64 from the small test case
have

    .type _ZN1AIiE1aE, @gnu_unique_object

while the ones for ARM don't.

My suspicion is that the problem comes from the fact that GCC, in its
build process, emits

    .type x, @gnu_unique_object

to check whether this feature is supported by the GNU assembler.
Historically `@' has been a comment character on ARM, so the check
fails there. The compiler therefore doesn't know that gnu_unique_object
is supported by the assembler and never generates such code, and it all
falls apart very quickly after that.

...

The quickest workaround IMHO is a new compiler build configured with
--enable-gnu-unique-object. Given that this feature went into a not very
recent version of binutils, I would expect most recent assemblers to
support it and for this to just work (TM). I would expect this configure
option to be turned on for cross-compilers as well. It might also be the
fastest way of testing this feature.

Thoughts? I would like another set of eyes on this.
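For readers without the tarball at hand, here is a minimal sketch of the kind of code under discussion. It is not the actual test case: the only grounded details are the symbol _ZN1AIiE1aE (which demangles to A<int>::a, 4 bytes in size) and the f1/f2 names in the program output. The struct layout and the idea that each library exports one function returning the address are assumptions. Here both functions live in a single file, so the linker always merges the two references; the behaviour discussed in this thread only shows up when f1 and f2 are built into separate shared objects and loaded with dlopen.

```cpp
// Hypothetical single-file reduction of the test case (assumed layout).
// In the real test case the template would live in a header (a.h) and
// f1/f2 would be built into separate libraries (libf1.so, libf2.so).

template <typename T>
struct A {
    // Static data member of a class template: every translation unit
    // that uses A<int>::a emits its own copy of the definition.
    static T a;
};

// Out-of-class definition. GCC emits it as a weak COMDAT symbol, or as
// a gnu_unique_object symbol when the compiler was built with that
// support enabled.
template <typename T>
T A<T>::a;

// Stand-ins for the functions the two libraries would export.
void *f1() { return &A<int>::a; }
void *f2() { return &A<int>::a; }

// True when both functions see the same object. Within one executable
// or one shared object this always holds.
bool same_object() { return f1() == f2(); }
```

Whether same_object() would still hold across two dlopen'd libraries is exactly the open question here: with weak symbols and a dlopen using local scope each library can end up with its own copy, while a unique-global symbol is meant to be resolved to a single definition process-wide by the dynamic linker.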
I verified this works on an armel box by:

(natty)lp-ramana@jenipapo:~/cpp_unique_global$ diff -au f12.s f1.s | less
--- f12.s       2012-03-07 00:47:32.000000000 +0000
+++ f1.s        2012-03-06 23:25:54.000000000 +0000
@@ -130,7 +130,7 @@
        .weak   _ZN1AIiE1aE
        .section        .bss._ZN1AIiE1aE,"awG",%nobits,_ZN1AIiE1aE,comdat
        .align  2
-       .type   _ZN1AIiE1aE, %object
+       .type   _ZN1AIiE1aE, %gnu_unique_object
        .size   _ZN1AIiE1aE, 4
 _ZN1AIiE1aE:
        .space  4

and the same for f2.s, regenerating by hand libf1.so and libf2.so, and
the output generated is:

(natty)lp-ramana@jenipapo:~/cpp_unique_global$ LD_LIBRARY_PATH=. ./main
f1 0x40028034
f2 0x40028034

regards,
Ramana

On 6 March 2012 16:00, Andrew Stubbs <andrew.stu...@linaro.org> wrote:
> Hi Alexandros,
>
> Could you use the linaro-toolchain list for stuff like this please? You're
> more likely to find somebody who knows the answer that way.
>
> I'm pretty sure the problem is not the compiler because, as far as I can
> see, both architectures' compilers emit ".weak" directives. If there is a
> problem, I'd say it's in the linker.
>
> Your test case gives two different addresses on Lucid x86, and on ARM (so
> you say, I've not tested it), but the same address twice on Precise. This is
> a surprising result. *I* would have expected that static values in different
> dlopen'd libraries would not be unified, but apparently they are ...
> sometimes.
>
> I'm afraid I don't really have any insight here. :(
>
> Anyway, regardless of whether one is correct, or not, I'd suggest *not*
> relying on this behaviour - it's clearly not portable. I say leave it at
> arm's length in production software for a few years.
>
> Andrew
>
> On 06/03/12 14:27, Alexandros Frantzis wrote:
>>
>> On Tue, Mar 06, 2012 at 09:51:01AM +0800, Sam Spilsbury wrote:
>>>
>>> On Mon, Mar 5, 2012 at 11:50 PM, Alexandros Frantzis
>>> <alexandros.frant...@linaro.org> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> this is an update on my progress with the updated compiz branches.
>>>>
>>>> I have been trying to run our update compiz branches
>>>> (compiz-*/linaro-gles2-update) on ARM (precise armhf), but I have
>>>> stumbled onto the same issue Marc reported some days ago. In
>>>> particular, I get:
>>>>
>>>> /usr/bin/compiz (core) - Fatal: Private index value "15CompositeWindow_index_4" already stored in screen.
>>>> /usr/bin/compiz (core) - Fatal: Private index value "15CompositeScreen_index_4" already stored in screen.
>>>>
>>>> and then a segfault when I try to run compiz.
>>>>
>>>> Note that I *don't* have this problem when running on x86_64 precise.
>>>>
>>>> The issue can be recreated with:
>>>>
>>>> $ compiz composite opengl
>>>>
>>>> I added some debugging messages to pluginclasshandler.h to get a better
>>>> feeling of what is going on, and ran on both my desktop and on ARM.
>>>> This is the output near the point when GLScreen get initialized:
>>>>
>>>> ...
>>>>
>>>> compiz (core) - Info: get(): mIndex.initiated for "8GLScreen_index_4" : 0
>>>> compiz (core) - Info: initializeIndex(): Initializining index value "8GLScreen_index_4"
>>>> compiz (core) - Info: initializeIndex(): Private index value added for "8GLScreen_index_4"
>>>> compiz (core) - Info: getInstance(): Get instance for "8GLScreen_index_4"
>>>> compiz (core) - Info: getInstance(): Spawning new class for "8GLScreen_index_4"
>>>> compiz (core) - Info: ctor(): mIndex.initiated for "8GLScreen_index_4" : 1
>>>> compiz (core) - Info: ctor(): Increasing reference count for "8GLScreen_index_4": 1
>>>>
>>>> --- x86_64 ---
>>>> compiz (core) - Info: get(): mIndex.initiated for "15CompositeScreen_index_4" : 1
>>>> --- armhf ---
>>>> compiz (core) - Info: get(): mIndex.initiated for "15CompositeScreen_index_4" : 0
>>>> compiz (core) - Info: initializeIndex(): Initializining index value "15CompositeScreen_index_4"
>>>> compiz (core) - Fatal: initializeIndex(): Private index value "15CompositeScreen_index_4" already
>>>> stored in screen.
>>>
>>>
>>> After the composite plugin loads and mIndex.initiated is set to 1,
>>> place a watchpoint on mIndex.initiated (it should be a separate
>>> template instantiation for each different class) and check if it
>>> changes, or check if we are reading mIndex.initiated from a different
>>> address, and if so, check the addresses of this for each constructor
>>> and destructor being called. (could be a compiler bug, I've hit these
>>> on this part of the code before).
>>>
>>>> -------------
>>>>
>>>> In the armhf case, CompositeScreen is erroneously considered not
>>>> initialized, and is initialized again, therefore messing up the plugin
>>>> system.
>>>>
>>>> I am trying to figure out if this is a manifestation of some kind of
>>>> memory corruption that doesn't affect us on x86_64 for whatever reason
>>>> (alignment, integer size etc), or something completely different.
>>>>
>>>> Thoughts?
>>>>
>>>> Thanks,
>>>> Alexandros
>>>
>>>
>>>
>>>
>>> --
>>> Sam Spilsbury
>>>
>>
>> Hi all,
>>
>> (I have also added Michael, Andrew and Ulrich from the Linaro toolchain
>> group to the recipients. Hi!)
>>
>> Checking the addresses, as Sam suggested, I found that there are two
>> different PluginClassHandler<CompositeScreen, CompScreen, 4>::mIndex and
>> PluginClassHandler<CompositeWindow, CompWindow, 4>::mIndex objects.
>>
>> After a bit of investigation, objdump gave an explanation:
>>
>> objdump -t /usr/lib/compiz/libcomposite.so | c++filt | grep mIndex
>>
>> -- x86_64 --
>> 0000000000277a80 u O .bss 0000000000000010 PluginClassHandler<CompositeWindow, CompWindow, 4>::mIndex
>> 0000000000277a70 u O .bss 0000000000000010 PluginClassHandler<CompositeScreen, CompScreen, 4>::mIndex
>> -- armhf --
>> 00065648 w O .bss 00000010 PluginClassHandler<CompositeWindow, CompWindow, 4>::mIndex
>> 00065658 w O .bss 00000010 PluginClassHandler<CompositeScreen, CompScreen, 4>::mIndex
>> ------------
>>
>> And the same kind of output for libopengl.so
>>
>> On x86_64 the symbols are marked 'u': 'unique global', whereas on armhf
>> they are marked 'w': 'weak'. This seems to be causing our troubles.
>>
>> I have produced a small test case for this:
>>
>> http://people.linaro.org/~afrantzis/cpp_unique_global.tar.gz
>>
>> Building and running 'LD_LIBRARY_PATH=. ./main' on x86_64 prints out f1
>> and f2 with the same address, whereas on armhf the addresses are different
>> (i.e. two different objects). On x86_64 the symbol A<int>::a is 'u', on
>> armhf it is 'w'.
>>
>> For completeness, when running without templates (edit a.h to change) the
>> two printed addresses are different on both x86_64 and armhf. Also A::a is
>> 'g': 'normal global' for both.
>>
>> Michael, Andrew, Ulrich can you please give us some insight into the
>> situation? Does this seem like a compiler or linker bug on ARM, or is the
>> code depending on undefined behavior, or something different? I have pasted
>> the used g++ versions at the end of the email.
>>
>> Thanks,
>> Alexandros
>>
>> --- g++ x86_64 --
>> Using built-in specs.
>> COLLECT_GCC=g++
>> COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6/lto-wrapper
>> Target: x86_64-linux-gnu
>> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
>> 4.6.3-1ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
>> --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
>> --program-suffix=-4.6 --enable-shared --enable-linker-build-id
>> --with-system-zlib --libexecdir=/usr/lib --without-included-gettext
>> --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
>> --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
>> --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
>> --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic
>> --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
>> --target=x86_64-linux-gnu
>> Thread model: posix
>> gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu2)
>>
>> --- g++ armhf --
>> Using built-in specs.
>> COLLECT_GCC=g++
>> COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/4.6/lto-wrapper
>> Target: arm-linux-gnueabihf
>> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
>> 4.6.3-1ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
>> --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
>> --program-suffix=-4.6 --enable-shared --enable-linker-build-id
>> --with-system-zlib --libexecdir=/usr/lib --without-included-gettext
>> --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
>> --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
>> --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin
>> --enable-objc-gc --enable-multilib --disable-sjlj-exceptions
>> --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb
>> --disable-werror --enable-checking=release --build=arm-linux-gnueabihf
>> --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
>> Thread model: posix
>> gcc version 4.6.3
>> (Ubuntu/Linaro 4.6.3-1ubuntu2)
>>
>
>
> _______________________________________________
> linaro-toolchain mailing list
> linaro-toolchain@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-toolchain