> On May 16, 2014, at 3:48 AM, Richard Biener <richard.guent...@gmail.com> > wrote: > > On Fri, May 16, 2014 at 12:07 PM, Thomas Preud'homme > <thomas.preudho...@arm.com> wrote: >> Ping? > > Sorry ... > >> Best regards, >> >> Thomas Preud'homme >> >>> -----Original Message----- >>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- >>> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme >>> Sent: Friday, May 09, 2014 6:26 PM >>> To: GCC Patches >>> Subject: RE: [PATCH] Fix PR54733 Optimize endian independent load/store >>> >>> Sorry, took longer than expected as I got distracted by some other patch. >>> I merged the whole patchset in a single patch as I was told the current >>> setup >>> is actually more difficult to read. >>> >>> Here are the updated ChangeLogs: >>> >>> *** gcc/ChangeLog *** >>> >>> 2014-05-09 Thomas Preud'homme <thomas.preudho...@arm.com> >>> >>> PR tree-optimization/54733 >>> * expr.c (get_inner_reference): Add a parameter to control whether >>> a >>> MEM_REF should be split into base + offset. >>> * tree.h (get_inner_reference): Default new parameter to false. >>> * tree-ssa-math-opts.c (nop_stats): New "bswap_stats" structure. >>> (CMPNOP): Define. >>> (find_bswap_or_nop_load): New. >>> (find_bswap_1): Renamed to ... >>> (find_bswap_or_nop_1): This. Also add support for memory source. >>> (find_bswap): Renamed to ... >>> (find_bswap_or_nop): This. Also add support for memory source and >>> detection of bitwise operations equivalent to load in host endianness. >>> (execute_optimize_bswap): Likewise. Also move its leading >>> comment back >>> in place and split statement transformation into ... >>> (bswap_replace): This. Add assert when updating bswap_stats. >>> >>> *** gcc/testsuite/ChangeLog *** >>> >>> 2014-05-09 Thomas Preud'homme <thomas.preudho...@arm.com> >>> >>> PR tree-optimization/54733 >>> * gcc.dg/optimize-bswapdi-3.c: New test to check extension of >>> bswap >>> optimization to support memory sources and bitwise operations >>> equivalent to load in host endianness. >>> * gcc.dg/optimize-bswaphi-1.c: Likewise. >>> * gcc.dg/optimize-bswapsi-2.c: Likewise. >>> * gcc.c-torture/execute/bswap-2.c: Likewise. >>> >>> Ok for trunk? > > Ok, I now decided otherwise and dislike the new parameter to > get_inner_reference. Can you please revert that part and just > deal with a MEM_REF result in your only caller? > > And (of course) I found another possible issue. The way you > compute load_type and use it here: > > + /* Perform the load. */ > + load_offset_ptr = build_int_cst (n->alias_set, 0); > + val_expr = fold_build2 (MEM_REF, load_type, addr_tmp, > + load_offset_ptr); > > makes the load always appear aligned according to the mode of > load_type. On strict-alignment targets this may cause faults. > > So what you have to do is either (simpler) > > unsigned int align = get_pointer_alignment (addr_tmp); > tree al_load_type = load_type; > if (align < TYPE_ALIGN (load_type)) > al_load_type = build_aligned_type (load_type, align); > ... > val_expr = fold_build2 (MEM_REF, al_load_type, addr_tmp, > load_offset_ptr); > > or keep track of the "first" actual load and use > > unsigned int align = get_object_alignment (that_first_load); > > "first" in the one that corresponds to addr_tmp. From that there > is a much better chance to derive good alignment values. > > Of course on STRICT_ALIGNMENT targets a not aligned load > will be decomposed again, so eventually doing the transformation > may no longer be profitable(?).
Not always decomposed. On MIPS, it should using the load/store left/right instructions for unaligned load/stores which is normally better than decomposed load/stores. So having a cost model would be nice. Thanks, Andrew > > Thanks and sorry again for the delay. > > Otherwise the patch looks good to me. > > Richard. > >>> Best regards, >>> >>> Thomas >>> >>>> -----Original Message----- >>>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- >>>> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme >>>> Sent: Monday, May 05, 2014 7:30 PM >>>> To: GCC Patches >>>> Subject: RE: [PATCH][2/3] Fix PR54733 Optimize endian independent >>>> load/store >>>> >>>> I found a way to improve the function find_bswap/find_bswap_or_nop >>>> and reduce its size. Please hold for the review, I will post an updated >>>> version as soon as I finish testing. >>>> >>>> Best regards, >>>> >>>> Thomas Preud'homme >> >>