On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
>   The following patch fixes PR57293
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293
>
>   It actually implements LRA functionality that was missing in reg
> elimination.  Before the patch, any explicit change of the stack pointer
> in RTL made it necessary to use the frame pointer.
>
>   The patch has practically no effect on generic tuning of x86/x86-64.
> But it has a dramatic effect on code performance for other tunings
> like corei7 which don't use incoming args accumulation.  The maximum
> SPEC2000 improvement, 2.5%, is achieved on x86 SPECInt2000.  The
> SPECFP2000 rate also improves by about 1% on x86 and x86-64.  Too
> bad that I did not implement it in the first place.  The results would
> have been even better than the ones reported at the 2012 GNU Cauldron,
> as I also used -mtune=corei7 at that time.
>
> The patch was bootstrapped and tested on x86-64/x86 and ppc.
>
> Committed as rev. 205498.
>
>  2013-11-28  Vladimir Makarov  <vmaka...@redhat.com>
>
>         PR target/57293
>         * ira.h (ira_setup_eliminable_regset): Remove parameter.
>         * ira.c (ira_setup_eliminable_regset): Ditto.  Add
>         SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
>         Don't call lra_init_elimination.
>         (ira): Call ira_setup_eliminable_regset without arguments.
>         * loop-invariant.c (calculate_loop_reg_pressure): Remove argument
>         from ira_setup_eliminable_regset call.
>         * gcse.c (calculate_bb_reg_pressure): Ditto.
>         * haifa-sched.c (sched_init): Ditto.
>         * lra.h (lra_init_elimination): Remove the prototype.
>         * lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
>         used_insn_alternative upper.
>         (lra_eliminate_regs_1): Add one more parameter.
>         (lra_eliminate): Ditto.
>         * lra.c (lra_invalidate_insn_data): Set sp_offset.
>         (setup_sp_offset): New.
>         (lra_process_new_insns): Call setup_sp_offset.
>         (lra): Add argument to lra_eliminate calls.
>         * lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
>         (get_equiv_with_elimination): New.
>         (process_addr_reg): Call get_equiv_with_elimination instead of
>         get_equiv_substitution.
>         (equiv_address_substitution): Ditto.
>         (loc_equivalence_change_p): Ditto.
>         (loc_equivalence_callback, lra_constraints): Ditto.
>         (curr_insn_transform): Ditto.  Print the sp offset.
>         (process_alt_operands): Prevent stack pointer reloads.
>         (lra_constraints): Remove one argument from lra_eliminate call.
>         Move it up.  Mark used hard regs before it.  Use
>         get_equiv_with_elimination instead of get_equiv_substitution.
>         * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
>         assert for param values combination.  Use sp offset.  Add argument
>         to lra_eliminate_regs_1 calls.
>         (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
>         (curr_sp_change): New static var.
>         (mark_not_eliminable): Add parameter.  Update curr_sp_change.
>         Don't prevent elimination to sp if we can calculate its change.
>         Pass the argument to mark_not_eliminable calls.
>         (eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
>         argument to lra_eliminate_regs_1 call.
>         (update_reg_eliminate): Move calculation of hard regs for spill
>         lower.  Switch off lra_in_progress temporarily to generate regs
>         involved into elimination.
>         (lra_init_elimination): Rename to init_elimination.  Make it
>         static.  Set up insn sp offset, check the offsets at the end of
>         BBs.
>         (process_insn_for_elimination): Add parameter.  Pass its value to
>         eliminate_regs_in_insn.
>         (lra_eliminate): Add parameter.  Pass its value to
>         process_insn_for_elimination.  Add assert for param values
>         combination.  Call init_elimination.  Don't update offsets in
>         equivalence substitutions.
>         * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
>         for created stack slot.
>         (remove_pseudos): Call lra_eliminate_regs_1 before changing memory
>         onto stack slot.
>

Hi Vladimir,

Thanks for your hard work.  I noticed a few regressions on x86-64:

FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 z == 8
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  line 21 i == v + 1
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer -funroll-loops  line 21 i == v + 1
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)

Besides the guality failures, there is an internal compiler error in
gcc.target/i386/pr9771-1.c:

[hjl@gnu-6 gcc]$ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c"
make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
test -d plugin || mkdir plugin
test -d testsuite || mkdir testsuite
test -d testsuite/gcc || mkdir testsuite/gcc
(rootme=`${PWDCMD-pwd}`; export rootme; \
srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \
cd testsuite/gcc; \
rm -f tmp-site.exp; \
sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \
    < ../../site.exp > tmp-site.exp; \
/bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \
EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo ${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ; \
if [ -f ${rootme}/../expect/expect ] ; then  \
   TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \
    export TCL_LIBRARY ; fi ; \
runtestflags= ; \
if [ -n "" ] ; then \
  runtestflags=""; \
elif [ -n "" ] ; then \
  parts="`echo '  ' \
      | sed 's/=[^ ]* / /g'`"; \
  for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \
    part=`basename $part` ; \
    case " $parts $runtestflags " in \
      *" $part "*) ;; \
      *) runtestflags="$runtestflags $part" ;; \
    esac ; \
  done ; \
fi ; \
`if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo ${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc --target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c $runtestflags)
WARNING: Couldn't find the global config file.
Test Run By hjl on Thu Nov 28 16:50:05 2013
Native configuration is x86_64-unknown-linux-gnu

        === gcc tests ===

Schedule of variations:
    unix/-m32 -march=corei7

Running target unix/-m32 -march=corei7
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ...
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
FAIL: gcc.target/i386/pr9771-1.c (test for excess errors)

        === gcc Summary ===

# of expected passes        10
# of unexpected failures    2
# of unresolved testcases    1

"-m32 -march=corei7" triggers the ICE.


-- 
H.J.
