http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987
--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> As of rev 204180 (4.9) this problem still exists.
> As far as I understand, the actual root of the problem is that the 'unsigned
> char' mem loads into regs are neither sign nor zero extended.

I've tried doing the following to enforce sign extension of memory loads <
SImode:

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md    (revision 205971)
+++ gcc/config/sh/sh.md    (working copy)
@@ -5958,7 +5958,18 @@

 (define_expand "zero_extend<mode>si2"
   [(set (match_operand:SI 0 "arith_reg_dest")
-    (zero_extend:SI (match_operand:QIHI 1 "zero_extend_operand")))])
+    (zero_extend:SI (match_operand:QIHI 1 "general_extend_operand")))]
+  ""
+{
+  if (!zero_extend_operand (operands[1], <MODE>mode))
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      emit_insn (gen_extend<mode>si2 (tmp, operands[1]));
+      emit_insn (gen_zero_extend<mode>si2 (operands[0],
+                         gen_lowpart (<MODE>mode, tmp)));
+      DONE;
+    }
+})

 (define_insn_and_split "*zero_extend<mode>si2_compact"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")

However, this doesn't fix the problem.  According to CSiBE (-m4 -ml -O2
-mpretend-cmove) there are a few cases where register allocation is a bit
better, but there are also some code size increases (e.g. interference with
the tst #imm,r0 patterns).  Overall, there's a code size decrease of 228 bytes
on the whole set.

Nevertheless, having the explicit sign_extend mem loads could be useful.  For
example, knowing that a mem load sign extends, the cmpeq insn could be hoisted
above the extension insns before register allocation.

On SH2A it's probably better to not allow zero extending mem loads in the
expander and to defer the movu.{b|w} insn selection until the combine pass.
Otherwise the original test case will always use zero extending mem loads,
even though sign extending ones would suffice (16 bit insns vs 32 bit insns).
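
To illustrate the two points above outside of GCC, here is a minimal sketch in plain C.  It is not GCC code and all helper names are hypothetical.  It assumes that load_sext8 models a sign extending byte load (as mov.b does on SH) and that load_zext8 models a zero extending one (extu.b after the load, or movu.b on SH2A).  It checks that the expander fallback (sign extend, take the low part, zero extend) gives the same value as a direct zero extension, and that an equality compare gives the same answer on sign extended and zero extended bytes, which is why sign extending loads would suffice for such a compare.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a sign extending byte load (like SH mov.b).  */
static int32_t load_sext8 (const uint8_t *p)
{
  /* Portable sign extension of an 8 bit value: 0x80..0xFF -> -128..-1.  */
  return (int32_t)((*p ^ 0x80) - 0x80);
}

/* Hypothetical model of a zero extending byte load
   (like mov.b + extu.b, or movu.b on SH2A).  */
static uint32_t load_zext8 (const uint8_t *p)
{
  return (uint32_t)*p;
}

/* Models the expander fallback from the patch: sign extend into a
   temporary, then zero extend the low 8 bits of that temporary.  */
static uint32_t zext8_via_sext (const uint8_t *p)
{
  int32_t tmp = load_sext8 (p);
  return (uint32_t)(uint8_t)tmp;
}

/* Exhaustively checks both properties for all byte values.  */
static void check_extension_identities (void)
{
  for (unsigned a = 0; a < 256; a++)
    {
      uint8_t x = (uint8_t)a;

      /* The fallback is equivalent to a direct zero extension.  */
      assert (zext8_via_sext (&x) == load_zext8 (&x));

      for (unsigned b = 0; b < 256; b++)
        {
          uint8_t y = (uint8_t)b;

          /* Equality does not depend on the kind of extension used,
             as long as both operands are extended the same way.  */
          assert ((load_sext8 (&x) == load_sext8 (&y))
                  == (load_zext8 (&x) == load_zext8 (&y)));
        }
    }
}
```

Calling check_extension_identities () runs through all 256 * 256 byte pairs.  Note that this only holds for equality; ordered unsigned compares would still need the zero extended values.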