https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81502
Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-07-21
     Ever confirmed|0                           |1

--- Comment #1 from Marc Glisse <glisse at gcc dot gnu.org> ---
.optimized dump:

int bar(void*) (void * ptr)
{
  int res;
  __m128i word;
  long unsigned int _2;
  vector(2) long long int word.3_3;
  unsigned int _4;

  <bb 2> [100.00%] [count: INV]:
  _2 = (long unsigned int) ptr_9(D);
  word = { 0, 0 };
  MEM[(char * {ref-all})&word] = _2;
  word.3_3 = word;
  word ={v} {CLOBBER};
  _4 = BIT_FIELD_REF <word.3_3, 32, 0>;
  res_5 = (int) _4;
  return res_5;

}

We missed turning the memory write into a BIT_INSERT_EXPR, and passes like PRE
missed following the bit_field_expr all the way to _2.

.combine dump:

[...]
(insn 8 3 10 2 (set (reg/v:V2DI 90 [ word ])
        (vec_concat:V2DI (reg/v/f:DI 92 [ ptr ])
            (const_int 0 [0]))) "b.c":16 3712 {vec_concatv2di}
     (expr_list:REG_DEAD (reg/v/f:DI 92 [ ptr ])
        (nil)))
(insn 10 8 15 2 (set (reg:SI 94 [ res ])
        (vec_select:SI (subreg:V4SI (reg/v:V2DI 90 [ word ]) 0)
            (parallel [
                    (const_int 0 [0])
                ]))) "b.c":20 3697 {*vec_extractv4si_0}
     (expr_list:REG_DEAD (reg/v:V2DI 90 [ word ])
        (nil)))
[...]

combine tries

(set (reg:SI 94 [ res ])
    (vec_select:SI (subreg:V4SI (vec_concat:V2DI (reg/v/f:DI 92 [ ptr ])
                (const_int 0 [0])) 0)
        (parallel [
                (const_int 0 [0])
            ])))

which we fail to simplify.

The xmm1-xmm0 mov is not considered a mov by the compiler but concatenation
with 0, so not a RA problem. The change of mode (64-bit pointer to 32-bit int)
seems to play a big role in confusing things here.
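For readers who want to poke at this, here is a rough C sketch of source that should lower to the same shape as the .optimized dump: a zeroed 128-bit vector, a store of the pointer bits into its start, then a 32-bit extract of lane 0. This is not the reporter's test case, which is not quoted in this comment. The function name low_lane_of_pointer and the typedefs v2di and v4si are made up for illustration, and GCC's generic vector extensions are used in place of __m128i and the SSE intrinsics so that the sketch does not depend on x86 headers. Whether a given GCC version produces exactly the dump above from this is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for __m128i, built on GCC's vector extension. */
typedef long long v2di __attribute__((vector_size(16)));
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical name; mirrors the structure of bar() in the dump. */
int low_lane_of_pointer(void *ptr)
{
    uintptr_t bits = (uintptr_t)ptr;     /* _2 = (long unsigned int) ptr */
    v2di word = { 0, 0 };                /* word = { 0, 0 } */

    /* Overwrite the first bytes of the vector with the pointer bits.
       This corresponds to the MEM[(char * {ref-all})&word] = _2 store. */
    memcpy(&word, &bits, sizeof bits);

    /* Reinterpret as four 32-bit lanes and take the lane at offset 0,
       which corresponds to BIT_FIELD_REF <word, 32, 0>. */
    v4si lanes = (v4si)word;
    return lanes[0];
}
```

On a little-endian 64-bit target the result is just the low 32 bits of the pointer, so ideally the whole function would reduce to a truncation of ptr with no vector register involved. Compiling with -O2 -fdump-tree-optimized -fdump-rtl-combine and reading the dumps is one way to check whether a given compiler still goes through the vector.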