https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101200

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |law at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Doing everything in DImode or SImode has some advantages, but lots of
disadvantages too. x86 can do 8/16/32/64-bit arithmetic just fine (although
16-bit can cause some performance problems).

The second & 15 from the source is optimized away during GIMPLE operations.
But later on it reappears from just a zero extension from 8 bits to 64 bits,
with a few bits known to be zero (so not & 255 but just & 15).

I think the right pass to optimize this back is REE.
If I manually undo the optimization that combine did and replace

(insn 14 13 15 2 (parallel [
            (set (reg:DI 95 [ a ])
                (and:DI (subreg:DI (reg:QI 93 [ a ]) 0)
                    (const_int 15 [0xf])))
            (clobber (reg:CC 17 flags))
        ]) "pr101200.c":9:10 494 {*anddi_1}

with

(insn 14 13 15 2 (set (reg:DI 95 [ a ])
        (zero_extend:DI (reg:QI 93 [ a ]))) "pr101200.c":9:10 137 {zero_extendqidi2}

then REE tries something but gives up anyway:

Trying to eliminate extension:
(insn 14 13 15 2 (set (reg:DI 0 ax [orig:95 a ] [95])
        (zero_extend:DI (reg:QI 0 ax [orig:93 a ] [93]))) "pr101200.c":9:10 137 {zero_extendqidi2}
     (nil))
Tentatively merged extension with definition :
(insn 12 10 13 2 (parallel [
            (set (reg:DI 0 ax)
                (zero_extend:DI (lshiftrt:QI (reg:QI 0 ax [orig:82 d.0_1 ] [82])
                        (const_int 4 [0x4]))))
            (clobber (reg:CC 17 flags))
        ]) "pr101200.c":8:17 -1
     (nil))
Merge cancelled, non-mergeable definitions:
(insn 12 10 13 2 (parallel [
            (set (reg:QI 0 ax [orig:93 a ] [93])
                (lshiftrt:QI (reg:QI 0 ax [orig:82 d.0_1 ] [82])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr101200.c":8:17 717 {*lshrqi3_1}
     (nil))
Elimination opportunities = 1 realized = 0

So, to handle this, REE would need to figure out that some ANDs following
logical right shifts can be treated as zero extensions too. It would also need
to be taught to handle lshiftrts: replace the QImode MEM read with a
zero-extending one, widen the right shift from QImode to DImode (or ideally to
SImode, given the behavior of x86), and optimize away the zero extension
(emitted as AND 15).
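
To illustrate why an AND with 15 after a logical right shift by 4 in QImode is equivalent to a zero extension, here is a minimal C sketch. It is not the testcase from pr101200.c. The helper names (and_form, zext_form, count_mismatches) and the garbage constant are hypothetical, and the paradoxical subreg is only modelled by OR-ing arbitrary bits above the low byte.

```c
#include <stdint.h>

/* Hypothetical model of (and:DI (subreg:DI (reg:QI a) 0) (const_int 15)).
   The bits above the low byte of a paradoxical subreg are undefined, so the
   caller supplies arbitrary garbage for them. */
static uint64_t and_form(uint8_t a, uint64_t upper_garbage)
{
    uint64_t wide = (upper_garbage << 8) | a;
    return wide & 15;
}

/* Hypothetical model of (zero_extend:DI (reg:QI a)). */
static uint64_t zext_form(uint8_t a)
{
    return (uint64_t)a;
}

/* Count the 8-bit inputs d for which the two forms disagree when
   a = d >> 4, i.e. when a comes from (lshiftrt:QI d 4) and therefore has
   its top four bits known to be zero. */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned d = 0; d < 256; d++) {
        uint8_t a = (uint8_t)(d >> 4);
        if (and_form(a, 0xdeadbeefULL) != zext_form(a))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches() checks all 256 possible byte values. The two forms agree on every one because the shift already clears bits 4..7, which is the known-bits fact REE would have to use to treat the AND as a zero extension.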