https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320
--- Comment #18 from Thomas Preud'homme <thomas.preudhomme at arm dot com> ---
(In reply to Eric Botcazou from comment #16)
> > unsigned int foo (unsigned short *x)
> > {
> >   return x[0] << 16 | x[1];
> > }
> >
> > [...]
> > gets you
> >
> > foo:
> >         lduh    [%o0], %g1
> >         lduh    [%o0+2], %o0
> >         sll     %g1, 16, %g1
> >         jmp     %o7+8
> >         or      %o0, %g1, %o0
> >
> > which looks perfect to me.
>
> Indeed, but after having gone through a perfectly useless transformation and
> wasted cycles. This reminds me of the ipa-split + inlining round trip.
>
> Really SPARC machines aren't fast enough to allow such a silliness...

Fair enough, but the information about alignment is only available late in the
pass, so most of the code has already been executed by then. Only when the whole
OR expression has been processed do we know the lowest address and the range of
the memory access, and therefore whether that access is aligned. Also, if the
expression were loading a 32-bit value byte by byte, then the transformation
would be useful.

I'm already working on a patch to add a cost model, but this will just add more
code to execute before taking the decision. It will, however, prevent rewriting
statements if the result would execute slower on the target. Maybe a better
solution for SPARC would be to add a switch for this pass and disable it by
default on SPARC. What do you think about that?
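
To make the byte-by-byte case concrete, here is a minimal sketch of the two source shapes being compared. The function names are hypothetical and are not taken from the GCC testsuite. The casts to `uint32_t` are my addition so that the shifts are well defined in portable C; `foo` in comment #16 has none. The first function is the kind of expression where collapsing four byte loads into one wider load could pay off, while the second mirrors `foo`, where two halfword loads are already cheap. Whether a single wide load is legal or faster depends on the target's alignment rules, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a big-endian 32-bit value assembled from four
   separate byte loads. */
uint32_t load_be32_bytewise(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Same shape as foo() from comment #16: two halfword loads, a shift and
   an OR. The explicit casts are an assumption added here to avoid
   shifting a promoted signed int. */
uint32_t load_be32_halfwords(const uint16_t *x)
{
    return ((uint32_t)x[0] << 16) | (uint32_t)x[1];
}

/* Small self-check: both shapes produce the same value for matching
   inputs, independent of host endianness. */
void check_loads(void)
{
    const unsigned char bytes[4] = { 0x12, 0x34, 0x56, 0x78 };
    const uint16_t halves[2] = { 0x1234, 0x5678 };

    assert(load_be32_bytewise(bytes) == 0x12345678u);
    assert(load_be32_halfwords(halves) == 0x12345678u);
}
```

Both functions compute their result purely from element values, so the result does not depend on the byte order of the machine running them. Only the question of whether the loads can be merged is target-specific.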