On 27 June 2012 15:58, Dmitry Melnik <d...@ispras.ru> wrote:
> Hi,
>
> We'd like to note about CodeSourcery's patch for ARM backend, from which GCC
> mainline can gain 4% on SPEC2K INT:
> http://cgit.openembedded.org/openembedded/plain/recipes/gcc/gcc-4.5/linaro/gcc-4.5-linaro-r99369.patch
> (also the patch is attached).
>
> Originally, we noticed that GNU Go works 6% faster on cortex-a8 with
> -fno-gcse. After profiling we found that this is most likely caused by
> cache misses when accessing global variables. GCC generates ldr
> instructions for them, while this can be avoided by emitting movt/movw pair
> for such cases. RTL expressions for these instructions is high_ and lo_sum.
> Currently, symbol_ref expands as high_ and lo_sum but then cprop1 decides
> that this is redundant and merges them into one load insn.
>
> The problem was also found by Linaro community:
> https://bugs.launchpad.net/gcc-linaro/+bug/886124 .

The reason this isn't in our later releases, IIRC, is that it wasn't thought
beneficial enough to upstream. Now you've got some evidence to the contrary.

> Also there is a patch from codesourcery (attached), which was ported to
> linaro gcc 4.5, but is missing in later linaro releases.
> This patch makes split of symbol_refs at the later stage (after cprop),
> instead of generating movt/movw at expand.

I must admit that I had been suggesting to Zhenqiang that we turn this off
by tightening the movsi_insn predicates rather than adding a split, but
given that it appears to produce enough benefit in this case I don't have
any reason to object ...

However, it's interesting that this doesn't seem to help vpr ...

Ramana
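
For anyone following the thread without the ARM manual to hand, here is a
rough C model of what the movw/movt pair computes. It is only a sketch: the
function names (movw, movt, materialize_address) are made up for this
illustration and are not GCC internals, and the address is just an example
constant. movw writes a 16-bit immediate into the low half of the register
and zeroes the top half; movt then overwrites the top half and leaves the
low half alone. The address is therefore built from immediates in the
instruction stream, whereas the ldr form fetches it with a data load, which
is where the cache misses discussed above are suspected to come from.

```c
#include <stdint.h>

/* Model of "movw rd, #imm16": rd = zero-extended 16-bit immediate. */
static uint32_t movw(uint16_t imm16)
{
    return (uint32_t)imm16;
}

/* Model of "movt rd, #imm16": replace the upper 16 bits of rd,
 * keep the lower 16 bits unchanged. */
static uint32_t movt(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}

/* Hypothetical helper: build a 32-bit symbol address from its two
 * halves, the way the assembler operands #:lower16:sym and
 * #:upper16:sym would supply them. */
static uint32_t materialize_address(uint32_t sym_addr)
{
    uint32_t r = movw((uint16_t)(sym_addr & 0xFFFFu)); /* lower half */
    r = movt(r, (uint16_t)(sym_addr >> 16));           /* upper half */
    return r;
}
```

Calling materialize_address with any 32-bit value returns that same value,
which is the whole point: two instructions, no memory access for the
address itself.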