On Thu, Apr 24, 2014 at 3:37 AM, Thomas Preud'homme <thomas.preudho...@arm.com> wrote: > See updated part 3 of the patch in attachment. New ChangeLog are as follows > > *** gcc/ChangeLog *** > > 2014-04-23 Thomas Preud'homme <thomas.preudho...@arm.com> > > PR tree-optimization/54733 > * tree-ssa-math-opts.c (find_bswap_load): Renamed to ... > (find_bswap_or_nop_load): This. > (find_bswap_1): Renamed to ... > (find_bswap_or_nop_1): This. > (find_bswap): Renamed to ... > (find_bswap_or_nop): This. Also add detection of bitwise operations > equivalent to load in host endianness. > (execute_optimize_bswap): Likewise. > > *** gcc/testsuite/ChangeLog *** > > 2014-04-23 Thomas Preud'homme <thomas.preudho...@arm.com> > > PR tree-optimization/54733 > * gcc.dg/optimize-bswapdi-3.c: Extend test to check detection of > bitwise operation equivalent to load in host endianness. > * gcc.dg/optimize-bswaphi-1.c: Likewise. > * gcc.dg/optimize-bswapsi-2.c: Likewise. > > Bootstrapped on x86_64-linux-gnu with no testsuite regression. Also did a > arm-none-eabi cross build with no regression after running testsuite via qemu
execute_optimize_bswap now looks a bit convoluted - maybe time to split out the actual transform to a separate function? + /* Convert the result of load if necessary. */ + if (!useless_type_conversion_p (TREE_TYPE (tgt), + TREE_TYPE (val_tmp))) + { why's that? Shouldn't the load already be emitted of the correct type? You seem to replace the stmt computing the target value by directly loading into the target. IMHO that's premature optimization and it would be easier to just replace its rhs (that way the stmt still has a proper location for example). And now that I am looking and the patch series again - I think you need to verify that you load the correct value. Consider int foo (char *x, char *y) { char c0 = x[0]; char c1 = x[1]; *y = 1; char c2 = x[2]; char c3 = x[3]; return c0 | c1 << 8 | c2 << 16 | c3 << 24; } where do you insert the single load? The easiest way out (without doing alias walks and whatnot) is to verify that all loads have the same gimple_vuse () attached (also please set that gimple_vuse () on the load you build - that avoids costly computation of virtual operands). This of course applies to patch #2 already. Thanks, Richard. > Best regards, > > Thomas