On Sun, Oct 28, 2012 at 2:37 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>> As suggested by Richard S. [1], after the patch that converts subreg:M
>>> (op:N (...)(...)) to op:M (subreg:M (...) subreg:M (...)), we can
>>> remove several peephole2 patterns that handle subregs of PLUS, MINUS
>>> and MULT operators. I have attached RFC prototype patch that will
>>> trigger an ICE when to-be-removed pattern triggers, with the intention
>>> that these patterns will be removed entirely (An "invalid" pattern was
>>> indeed generated elsewhere, see patch).

I have committed the following version, which avoids all failures
reported by H.J.:

2012-10-29  Uros Bizjak  <ubiz...@gmail.com>

        * config/i386/i386.c (ix86_decompose_address): Use
        simplify_gen_subreg to generate SImode equivalent of address,
        zero-extended with AND RTX.
        * config/i386/i386.md (ashift to lea splitter): Split to SImode mult.
        (simple lea to add/shift peephole2s): Remove peephole2s that operate
        on subregs of DImode operations.

Re-tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.

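For readers following along: the subreg conversion mentioned above relies on the low 32 bits of a 64-bit PLUS, MINUS or MULT depending only on the low 32 bits of the operands. The C sketch below is not part of the patch; it only illustrates that arithmetic identity with unsigned wraparound, assuming a target where unsigned int is 32 bits wide. The helper names low32, narrowing_commutes and shift_is_mult are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a 64-bit value: a C analogue of (subreg:SI (x:DI) 0)
   on a little-endian target such as x86-64.  Hypothetical helper.  */
static uint32_t low32 (uint64_t x)
{
  return (uint32_t) x;
}

/* Returns 1 when narrowing the 64-bit result gives the same value as
   operating on the narrowed 32-bit operands, for PLUS, MINUS and MULT.  */
static int narrowing_commutes (uint64_t a, uint64_t b)
{
  return low32 (a + b) == (uint32_t) (low32 (a) + low32 (b))
         && low32 (a - b) == (uint32_t) (low32 (a) - low32 (b))
         && low32 (a * b) == (uint32_t) (low32 (a) * low32 (b));
}

/* Returns 1 when a left shift by N (N < 32) of the 64-bit value, then
   narrowed, equals a 32-bit multiply by 1 << N.  This mirrors the
   "Split to SImode mult" entry in the ChangeLog.  */
static int shift_is_mult (uint64_t a, unsigned n)
{
  return low32 (a << n) == (uint32_t) (low32 (a) * ((uint32_t) 1 << n));
}
```

Calling narrowing_commutes or shift_is_mult with any operands (and a shift count below 32) should return 1, which is why the patterns matching subreg:SI of a DImode operation become redundant once the operation itself is expressed in SImode.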
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 192906)
+++ config/i386/i386.c  (working copy)
@@ -11821,7 +11821,11 @@ ix86_decompose_address (rtx addr, struct ix86_addr
             return 0;
         }
       else if (GET_MODE (addr) == DImode)
-        addr = gen_rtx_SUBREG (SImode, addr, 0);
+        {
+          addr = simplify_gen_subreg (SImode, addr, DImode, 0);
+          if (addr == NULL_RTX)
+            return 0;
+        }
       else if (GET_MODE (addr) != VOIDmode)
         return 0;
     }
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 192906)
+++ config/i386/i386.md (working copy)
@@ -9600,10 +9600,10 @@
   "TARGET_64BIT && reload_completed
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(set (match_dup 0)
-        (zero_extend:DI (subreg:SI (mult:DI (match_dup 1) (match_dup 2)) 0)))]
+        (zero_extend:DI (mult:SI (match_dup 1) (match_dup 2))))]
 {
-  operands[1] = gen_lowpart (DImode, operands[1]);
-  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), DImode);
+  operands[1] = gen_lowpart (SImode, operands[1]);
+  operands[2] = gen_int_mode (1 << INTVAL (operands[2]), SImode);
 })
 
 ;; This pattern can't accept a variable shift count, since shifts by
@@ -17358,28 +17358,6 @@
               (clobber (reg:CC FLAGS_REG))])])
 
 (define_peephole2
-  [(set (match_operand:SI 0 "register_operand")
-        (subreg:SI (plus:DI (match_operand:DI 1 "register_operand")
-                            (match_operand:DI 2 "nonmemory_operand")) 0))]
-  "TARGET_64BIT && !TARGET_OPT_AGU
-   && REGNO (operands[0]) == REGNO (operands[1])
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))
-              (clobber (reg:CC FLAGS_REG))])]
-  "operands[2] = gen_lowpart (SImode, operands[2]);")
-
-(define_peephole2
-  [(set (match_operand:SI 0 "register_operand")
-        (subreg:SI (plus:DI (match_operand:DI 1 "nonmemory_operand")
-                            (match_operand:DI 2 "register_operand")) 0))]
-  "TARGET_64BIT && !TARGET_OPT_AGU
-   && REGNO (operands[0]) == REGNO (operands[2])
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 1)))
-              (clobber (reg:CC FLAGS_REG))])]
-  "operands[1] = gen_lowpart (SImode, operands[1]);")
-
-(define_peephole2
   [(set (match_operand:DI 0 "register_operand")
         (zero_extend:DI
           (plus:SI (match_operand:SI 1 "register_operand")
@@ -17404,36 +17382,6 @@
               (clobber (reg:CC FLAGS_REG))])])
 
 (define_peephole2
-  [(set (match_operand:DI 0 "register_operand")
-        (zero_extend:DI
-          (subreg:SI (plus:DI (match_dup 0)
-                              (match_operand:DI 1 "nonmemory_operand")) 0)))]
-  "TARGET_64BIT && !TARGET_OPT_AGU
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0)
-                   (zero_extend:DI (plus:SI (match_dup 2) (match_dup 1))))
-              (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[2] = gen_lowpart (SImode, operands[0]);
-})
-
-(define_peephole2
-  [(set (match_operand:DI 0 "register_operand")
-        (zero_extend:DI
-          (subreg:SI (plus:DI (match_operand:DI 1 "nonmemory_operand")
-                              (match_dup 0)) 0)))]
-  "TARGET_64BIT && !TARGET_OPT_AGU
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0)
-                   (zero_extend:DI (plus:SI (match_dup 2) (match_dup 1))))
-              (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[2] = gen_lowpart (SImode, operands[0]);
-})
-
-(define_peephole2
   [(set (match_operand:SWI48 0 "register_operand")
         (mult:SWI48 (match_dup 0)
                     (match_operand:SWI48 1 "const_int_operand")))]
@@ -17444,18 +17392,6 @@
   "operands[1] = GEN_INT (exact_log2 (INTVAL (operands[1])));")
 
 (define_peephole2
-  [(set (match_operand:SI 0 "register_operand")
-        (subreg:SI (mult:DI (match_operand:DI 1 "register_operand")
-                            (match_operand:DI 2 "const_int_operand")) 0))]
-  "TARGET_64BIT
-   && exact_log2 (INTVAL (operands[2])) >= 0
-   && REGNO (operands[0]) == REGNO (operands[1])
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0) (ashift:SI (match_dup 0) (match_dup 2)))
-              (clobber (reg:CC FLAGS_REG))])]
-  "operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2])));")
-
-(define_peephole2
   [(set (match_operand:DI 0 "register_operand")
         (zero_extend:DI
           (mult:SI (match_operand:SI 1 "register_operand")
@@ -17469,22 +17405,6 @@
               (clobber (reg:CC FLAGS_REG))])]
   "operands[2] = GEN_INT (exact_log2 (INTVAL (operands[2])));")
 
-(define_peephole2
-  [(set (match_operand:DI 0 "register_operand")
-        (zero_extend:DI
-          (subreg:SI (mult:DI (match_dup 0)
-                              (match_operand:DI 1 "const_int_operand")) 0)))]
-  "TARGET_64BIT
-   && exact_log2 (INTVAL (operands[2])) >= 0
-   && peep2_regno_dead_p (0, FLAGS_REG)"
-  [(parallel [(set (match_dup 0)
-                   (zero_extend:DI (ashift:SI (match_dup 2) (match_dup 1))))
-              (clobber (reg:CC FLAGS_REG))])]
-{
-  operands[1] = GEN_INT (exact_log2 (INTVAL (operands[1])));
-  operands[2] = gen_lowpart (SImode, operands[0]);
-})
-
 ;; The ESP adjustments can be done by the push and pop instructions.  Resulting
 ;; code is shorter, since push is only 1 byte, while add imm, %esp is 3 bytes.
 ;; On many CPUs it is also faster, since special hardware to avoid esp