On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <vmaka...@redhat.com> wrote: > The following patch fixes PR57293 > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293 > > It is actually an implementation of missed LRA functionality in reg > elimination. Before the patch any explicit change of stack pointer in > RTL resulted in necessity to use the frame pointer. > > The patch has practically no effect on generic tuning of x86/x86-64. > But it has a dramatic effect on code performance for other tunings > like corei7 which don't use incoming args accumulation. The maximum > SPEC2000 improvement 2.5% is achieved on x86 SPECInt2000. But > SPECFP2000 rate also has improvement about 1% on x86 and x86-64. Too > bad that I did not implement it at the first place. The results would > have been even much better ones reported on 2012 GNU Cauldron as I > also used -mtune=corei7 that time. > > The patch was bootstrapped and tested on x86-64/x86 and ppc. > > Committed as rev. 205498. > > 2013-11-28 Vladimir Makarov<vmaka...@redhat.com> > > PR target/57293 > * ira.h (ira_setup_eliminable_regset): Remove parameter. > * ira.c (ira_setup_eliminable_regset): Ditto. Add > SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed. > Don't call lra_init_elimination. > (ira): Call ira_setup_eliminable_regset without arguments. > * loop-invariant.c (calculate_loop_reg_pressure): Remove argument > from ira_setup_eliminable_regset call. > * gcse.c (calculate_bb_reg_pressure): Ditto. > * haifa-sched.c (sched_init): Ditto. > * lra.h (lra_init_elimination): Remove the prototype. > * lra-int.h (lra_insn_recog_data): New member sp_offset. Move > used_insn_alternative upper. > (lra_eliminate_regs_1): Add one more parameter. > (lra-eliminate): Ditto. > * lra.c (lra_invalidate_insn_data): Set sp_offset. > (setup_sp_offset): New. > (lra_process_new_insns): Call setup_sp_offset. > (lra): Add argument to lra_eliminate calls. > * lra-constraints.c (get_equiv_substitution): Rename to get_equiv. > (get_equiv_with_elimination): New. > (process_addr_reg): Call get_equiv_with_elimination instead of > get_equiv_substitution. > (equiv_address_substitution): Ditto. > (loc_equivalence_change_p): Ditto. > (loc_equivalence_callback, lra_constraints): Ditto. > (curr_insn_transform): Ditto. Print the sp offset > (process_alt_operands): Prevent stack pointer reloads. > (lra_constraints): Remove one argument from lra_eliminate call. > Move it up. Mark used hard regs bfore it. Use > get_equiv_with_elimination instead of get_equiv_substitution. > * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and > assert for param values combination. Use sp offset. Add argument > to lra_eliminate_regs_1 calls. > (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call. > (curr_sp_change): New static var. > (mark_not_eliminable): Add parameter. Update curr_sp_change. > Don't prevent elimination to sp if we can calculate its change. > Pass the argument to mark_not_eliminable calls. > (eliminate_regs_in_insn): Add a parameter. Use sp offset. Add > argument to lra_eliminate_regs_1 call. > (update_reg_eliminate): Move calculation of hard regs for spill > lower. Switch off lra_in_progress temporarily to generate regs > involved into elimination. > (lra_init_elimination): Rename to init_elimination. Make it > static. Set up insn sp offset, check the offsets at the end of > BBs. > (process_insn_for_elimination): Add parameter. Pass its value to > eliminate_regs_in_insn. > (lra_eliminate): : Add parameter. Pass its value to > process_insn_for_elimination. Add assert for param values > combination. Call init_elimination. Don't update offsets in > equivalence substitutions. > * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1 > for created stack slot. > (remove_pseudos): Call lra_eliminate_regs_1 before changing memory > onto stack slot. >
Hi Vladimir, Thanks for your hard work. I noticed a few regressions on x86-64: FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 y == 25 FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 z == 6 FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 y == 117 FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 z == 8 FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 x == 36 FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 y == 25 FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 z == 6 FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 x == 36 FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 y == 25 FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 z == 6 FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 20 x == 36 FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin -flto-partition=none line 23 x == 98 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 x == 36 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 y == 25 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 z == 6 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 x == 98 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 y == 117 FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 z == 8 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 x == 36 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 y == 25 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 z == 6 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 x == 98 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 y == 117 FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 z == 8 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 x == 36 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 y == 25 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 z == 6 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 x == 98 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 y == 117 FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 z == 8 FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions line 21 i == v + 1 FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer -funroll-loops line 21 i == v + 1 FAIL: gcc.target/i386/pr9771-1.c (internal compiler error) Besides guality failures, there is [hjl@gnu-6 gcc]$ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c" make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc' test -d plugin || mkdir plugin test -d testsuite || mkdir testsuite test -d testsuite/gcc || mkdir testsuite/gcc (rootme=`${PWDCMD-pwd}`; export rootme; \ srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \ cd testsuite/gcc; \ rm -f tmp-site.exp; \ sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \ < ../../site.exp > tmp-site.exp; \ /bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \ EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo ${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ; \ if [ -f ${rootme}/../expect/expect ] ; then \ TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \ export TCL_LIBRARY ; fi ; \ runtestflags= ; \ if [ -n "" ] ; then \ runtestflags=""; \ elif [ -n "" ] ; then \ parts="`echo ' ' \ | sed 's/=[^ ]* / /g'`"; \ for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \ part=`basename $part` ; \ case " $parts $runtestflags " in \ *" $part "*) ;; \ *) runtestflags="$runtestflags $part" ;; \ esac ; \ done ; \ fi ; \ `if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo ${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc --target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c $runtestflags) WARNING: Couldn't find the global config file. Test Run By hjl on Thu Nov 28 16:50:05 2013 Native configuration is x86_64-unknown-linux-gnu === gcc tests === Schedule of variations: unix/-m32 -march=corei7 Running target unix/-m32 -march=corei7 Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as tool-and-target-specific interface file. Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ... FAIL: gcc.target/i386/pr9771-1.c (internal compiler error) FAIL: gcc.target/i386/pr9771-1.c (test for excess errors) === gcc Summary === # of expected passes 10 # of unexpected failures 2 # of unresolved testcases 1 "-m32 -march=corei7" triggers the ICE. -- H.J.