[Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> ---
It seems that converting unsigned values to signed values, i.e. replacing
zero-extensions with sign-extensions and recombining sign-extensions with
loads could make sense in general.  For example, the following code from CSiBE
mpeg2dec-0.3.1/libmpeg2/motion_comp.s contains sequences like:

_MC_put_x_8_c:
        .align 2
.L24:
        mov.b   @r5,r0
        sett
        extu.b  r0,r1
        mov.b   @(1,r5),r0
        extu.b  r0,r0
        addc    r1,r0
        shar    r0
        mov.b   r0,@r4
        sett
        mov.b   @(1,r5),r0
        extu.b  r0,r3
        mov.b   @(2,r5),r0
        extu.b  r0,r1
        mov     r3,r0
        addc    r1,r0
        shar    r0
        mov.b   r0,@(1,r4)

Here effectively only 8 bit values are calculated.  The zero-extensions can be
omitted, since the higher bits do not influence the result of the lowest 8 bits
and the higher bits are discarded after the 8 bit stores.
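
The low-byte argument can be checked exhaustively in C.  The following is only
a sketch: zext8, sext8, check_add_low_byte and avg_low_byte are made-up names,
with zext8 standing in for extu.b and sext8 for the sign-extending mov.b load.
It confirms that the low byte of the addc result (a + b + 1, since sett sets
the carry) is the same for both kinds of extension.  A right shift such as shar
moves bit 8 of the sum into bit 7, so the shifted result is computed by a
separate helper and can be compared for both extensions.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend an 8-bit value to 32 bits, as extu.b does. */
uint32_t zext8(uint8_t x) { return x; }

/* Sign-extend an 8-bit value to 32 bits, as a mov.b load does.  The result
   is kept in an unsigned type so that the arithmetic below wraps. */
uint32_t sext8(uint8_t x) { return (x & 0x80u) ? (x | 0xFFFFFF00u) : x; }

/* Exhaustively check that the low byte of a + b + 1 does not depend on how
   the inputs were extended. */
void check_add_low_byte(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++) {
            uint32_t z = zext8((uint8_t)a) + zext8((uint8_t)b) + 1u;
            uint32_t s = sext8((uint8_t)a) + sext8((uint8_t)b) + 1u;
            assert((uint8_t)z == (uint8_t)s);
        }
}

/* Low byte of (a + b + 1) >> 1 for inputs that are already extended. */
uint8_t avg_low_byte(uint32_t a, uint32_t b)
{
    return (uint8_t)((a + b + 1u) >> 1);
}
```

For the inputs 0x80 and 0x00, avg_low_byte gives 0x40 with zero-extended
inputs and 0xC0 with sign-extended ones, so whether an extension can be dropped
depends on whether a right shift follows the addition.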
[Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> As of rev 204180 (4.9) this problem still exists.
> As far as I understand, the actual root of the problem is that the 'unsigned
> char' mem loads into regs are neither sign nor zero extended.

I've tried doing the following to enforce sign extension of memory loads
< SImode:

Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md (revision 205971)
+++ gcc/config/sh/sh.md (working copy)
@@ -5958,7 +5958,18 @@

 (define_expand "zero_extend<mode>si2"
   [(set (match_operand:SI 0 "arith_reg_dest")
-       (zero_extend:SI (match_operand:QIHI 1 "zero_extend_operand")))])
+       (zero_extend:SI (match_operand:QIHI 1 "general_extend_operand")))]
+  ""
+{
+  if (!zero_extend_operand (operands[1], <MODE>mode))
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      emit_insn (gen_extend<mode>si2 (tmp, operands[1]));
+      emit_insn (gen_zero_extend<mode>si2 (operands[0],
+                                           gen_lowpart (<MODE>mode, tmp)));
+      DONE;
+    }
+})

 (define_insn_and_split "*zero_extend<mode>si2_compact"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")

However, this doesn't fix the problem.  According to CSiBE (-m4 -ml -O2
-mpretend-cmove) there are a few cases where register allocation is a bit
better, but there are also some code size increases (e.g. interference with the
tst #imm,r0 patterns).  There's a code size decrease of 228 bytes on the whole
set.

Nevertheless, having the explicit sign_extend mem loads could be useful.  For
example, knowing that a mem load sign extends, the cmpeq insn could be hoisted
above the extension insns before register allocation.

On SH2A it's probably better to not allow zero extending mem loads in the
expander and defer the movu.{b|w} insn selection until the combine pass.
Otherwise the original test case will always use zero extending mem loads, even
though sign extending ones would suffice (16 bit insns vs 32 bit insns).
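
As a rough C model of the two ideas in this comment (not GCC code; all function
names here are invented for illustration): the expander sequence of a sign
extending load followed by a zero-extension of the low part yields the same
value as a plain zero-extension, and an equality test on two sign-extended
bytes gives the same answer as on two zero-extended bytes, which is what would
allow the compare to move above the extensions.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit to 32-bit zero and sign extension.  The sign-extended value is kept
   in an unsigned type so that comparisons stay well defined. */
uint32_t zero_extend_qi(uint8_t x) { return x; }
uint32_t sign_extend_qi(uint8_t x) { return (x & 0x80u) ? (x | 0xFFFFFF00u) : x; }

/* Model of the expander sequence: sign-extend into a temporary, then
   zero-extend the low part of that temporary. */
uint32_t expand_zero_extend_qi(uint8_t mem)
{
    uint32_t tmp = sign_extend_qi(mem);
    return zero_extend_qi((uint8_t)tmp);
}

/* Exhaustively check both properties for all byte values. */
void check_eq_extension_independent(void)
{
    for (unsigned a = 0; a < 256; a++) {
        assert(expand_zero_extend_qi((uint8_t)a) == zero_extend_qi((uint8_t)a));
        for (unsigned b = 0; b < 256; b++) {
            int eq_zext = zero_extend_qi((uint8_t)a) == zero_extend_qi((uint8_t)b);
            int eq_sext = sign_extend_qi((uint8_t)a) == sign_extend_qi((uint8_t)b);
            assert(eq_zext == eq_sext);
        }
    }
}
```

The check only covers equality on values straight from memory; ordered
comparisons or values produced by arithmetic are a different matter.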
[Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
As of rev 204180 (4.9) this problem still exists.
As far as I understand, the actual root of the problem is that the 'unsigned
char' mem loads into regs are neither sign nor zero extended.  So combine sees
the following:

(insn 11 6 12 2 (set (reg:QI 169 [ *a_3(D) ])
        (mem:QI (reg:SI 4 r4 [ a ]) [0 *a_3(D)+0 S1 A8])) {*movqi}
     (expr_list:REG_DEAD (reg:SI 4 r4 [ a ])
        (nil)))
(insn 12 11 13 2 (set (reg:SI 168 [ *a_3(D)+-3 ])
        (zero_extend:SI (reg:QI 169 [ *a_3(D) ]))) {*zero_extendqisi2_compact}
     (expr_list:REG_DEAD (reg:QI 169 [ *a_3(D) ])
        (nil)))
(insn 13 12 14 2 (set (reg:QI 171 [ *b_5(D) ])
        (mem:QI (reg:SI 5 r5 [ b ]) [0 *b_5(D)+0 S1 A8])) {*movqi}
     (expr_list:REG_DEAD (reg:SI 5 r5 [ b ])
        (nil)))
(insn 14 13 15 2 (set (reg:SI 170 [ *b_5(D)+-3 ])
        (zero_extend:SI (reg:QI 171 [ *b_5(D) ]))) {*zero_extendqisi2_compact}
     (expr_list:REG_DEAD (reg:QI 171 [ *b_5(D) ])
        (nil)))
(insn 15 14 16 2 (set (reg:SI 147 t)
        (eq:SI (reg:SI 168 [ *a_3(D)+-3 ])
            (reg:SI 170 [ *b_5(D)+-3 ]))) {cmpeqsi_t}
     (expr_list:REG_DEAD (reg:SI 170 [ *b_5(D)+-3 ])
        (expr_list:REG_DEAD (reg:SI 168 [ *a_3(D)+-3 ])
            (nil))))

On the other hand, signed char mem loads are expanded as sign extending.
Since LOAD_EXTEND_OP says that all mem loads except SImode ones are sign
extending, one could expect that all mem loads will be automatically expanded
as such, which is not the case.
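
The test case itself is not quoted in this comment.  Judging only from the
dump above (two byte loads through the pointers a and b in r4 and r5, followed
by an equality compare), C functions of roughly the following shape would be
expected to show the difference between unsigned char and signed char loads.
The function names and exact signatures are assumptions made for illustration.

```c
/* Hypothetical reproducer: byte loads through a and b are each followed by
   a separate zero_extend before the compare. */
int test_unsigned(unsigned char* a, unsigned char* b)
{
    return *a == *b;
}

/* Hypothetical counterpart: with signed char the loads are expanded as
   sign extending, so no separate extension is expected. */
int test_signed(signed char* a, signed char* b)
{
    return *a == *b;
}
```

Both functions return the same result for the same byte patterns, which is why
the zero-extensions in the unsigned variant look unnecessary.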
[Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-31 18:14:28 UTC ---
On second thought, it might not be safe to omit zero/sign extensions if values
are compared after calculations, where the regs hold values < SImode.
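
One way to read this concern, sketched in C under the assumption that the
bytes were brought in by sign extending loads and no extension is applied
before the compare (sext8, cmp_reference and cmp_without_zext are made-up
names):

```c
#include <stdint.h>

/* Sign-extend an 8-bit value to 32 bits, kept in an unsigned type so that
   the addition below wraps like a 32-bit register. */
uint32_t sext8(uint8_t x) { return (x & 0x80u) ? (x | 0xFFFFFF00u) : x; }

/* What the C source asks for: unsigned char operands are promoted to int,
   so a + 1 is compared as a full-width value. */
int cmp_reference(unsigned char a, unsigned char b)
{
    return a + 1 == b;
}

/* The same comparison done on 32-bit registers filled by sign extending
   loads, with the zero-extensions left out. */
int cmp_without_zext(unsigned char a, unsigned char b)
{
    return sext8(a) + 1u == sext8(b);
}
```

With a = 0x7F and b = 0x80 the reference comparison is true (128 == 128),
while the version without zero-extensions compares 0x00000080 against
0xFFFFFF80 and is false.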