https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89965
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, we have:

(insn 41 40 42 6 (parallel [
            (set (reg/v:DI 101 [ i ])
                (lshiftrt:DI (reg/v:DI 118 [ i ])
                    (const_int 7 [0x7])))
            (clobber (reg:CC 17 flags))
        ]) "pr89965.c":9 574 {*lshrdi3_doubleword}
     (expr_list:REG_DEAD (reg/v:DI 118 [ i ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_EQUIV (mem:DI (reg/f:SI 7 sp) [0 S8 A32])
                (nil)))))
(insn 42 41 43 6 (set (reg:QI 89 [ _3 ])
        (subreg:QI (reg/v:DI 101 [ i ]) 0)) "pr89965.c":10 88 {*movqi_internal}
     (nil))
(insn 43 42 44 6 (parallel [
            (set (reg/v:QI 102 [ c ])
                (mult:QI (reg:QI 89 [ _3 ])
                    (subreg:QI (reg:SI 111 [ c ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) "pr89965.c":10 351 {*mulqi3_1}
     (expr_list:REG_DEAD (reg:SI 111 [ c ])
        (expr_list:REG_DEAD (reg:QI 89 [ _3 ])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
(insn 44 43 45 6 (set (mem:DI (plus:SI (reg/f:SI 7 sp)
                (const_int 8 [0x8])) [0 S8 A32])
        (const_int 12 [0xc])) "pr89965.c":11 85 {*movdi_internal}
     (nil))
(insn 45 44 46 6 (set (mem:DI (reg/f:SI 7 sp) [0 S8 A32])
        (reg/v:DI 101 [ i ])) "pr89965.c":11 85 {*movdi_internal}
     (expr_list:REG_DEAD (reg/v:DI 101 [ i ])
        (nil)))
(call_insn/u 46 45 49 6 (set (reg:DI 0 ax)
        (call (mem:QI (symbol_ref:SI ("__udivdi3") [flags 0x41]) [0 S1 A8])
            (const_int 16 [0x10]))) "pr89965.c":11 699 {*call_value}
     (expr_list:REG_UNUSED (reg:DI 0 ax)
        (expr_list:REG_EH_REGION (const_int -2147483648 [0xffffffff80000000])
            (nil)))
    (expr_list (use (mem:DI (reg/f:SI 7 sp) [0 S8 A8]))
        (expr_list (use (mem:DI (plus:SI (reg/f:SI 7 sp)
                        (const_int 8 [0x8])) [0 S8 A8]))
            (nil))))

before RA (and the __udivdi3 call is actually dead - ax after it is not used).
Note the result of i >> 7 is first used in the multiplication and later stored
into the argument slot of the call.
Now, RA decides for some reason to first push the i >> 7 into the stack slot
and then load the single byte from it for the purpose of the multiplication:

(insn 41 116 118 6 (parallel [
            (set (reg/v:DI 0 ax [orig:101 i ] [101])
                (lshiftrt:DI (reg/v:DI 0 ax [orig:101 i ] [101])
                    (const_int 7 [0x7])))
            (clobber (reg:CC 17 flags))
        ]) "pr89965.c":9 574 {*lshrdi3_doubleword}
     (expr_list:REG_EQUIV (mem:DI (reg/f:SI 7 sp) [0 S8 A32])
        (nil)))
(insn 118 41 42 6 (set (mem:DI (reg/f:SI 7 sp) [0 S8 A32])
        (reg/v:DI 0 ax [orig:101 i ] [101])) "pr89965.c":9 85 {*movdi_internal}
     (nil))
(insn 42 118 119 6 (set (reg:QI 6 bp [orig:89 _3 ] [89])
        (mem:QI (reg/f:SI 7 sp) [0 S1 A32])) "pr89965.c":10 88 {*movqi_internal}
     (nil))
...

Finally, the rtl_dce pass has code to DCE not just dead const/pure calls, but
also their arguments. Unfortunately, that code (find_call_stack_args) doesn't
seem to take into account that some other code might read again from those
argument slots (rather than only the call reading from them).

So I guess the question is: is what the RA did above ok? If yes, I think
find_call_stack_args needs to be changed to do a FOR_EACH_SUBRTX NONCONST walk
of the SET_SRC (set), and if it finds a MEM that is an argument slot, either
punt immediately, or remove those bits from sp_bytes, so that the
corresponding store won't be set in arg_stores and we'll punt on that store.
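
To make the second option (removing the read bytes from sp_bytes) concrete, here is a rough standalone sketch in C. It is a toy model only, not GCC code and not a patch: the toy_* types and functions, the check_reads switch, and the 32-bit masks standing in for the bitmaps are all made up for illustration. Only the names sp_bytes and arg_stores are borrowed from the description above. The toy_post_ra array is a hand-written approximation of the post-RA insns 118, 42, 43 and 44.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these types or functions exist in GCC.  */
enum toy_kind { TOY_STORE, TOY_LOAD, TOY_OTHER };

struct toy_insn {
  enum toy_kind kind;
  unsigned off;   /* byte offset from sp (stores and loads only) */
  unsigned size;  /* access size in bytes (stores and loads only) */
};

/* Mask with bits [off, off + size) set, one bit per stack byte.  */
static uint32_t toy_byte_mask(unsigned off, unsigned size)
{
  assert(size >= 1 && off + size <= 32);
  uint32_t low = size == 32 ? UINT32_MAX : (UINT32_C(1) << size) - 1;
  return low << off;
}

/* Walk backwards from the call over insns[n-1] .. insns[0] (n <= 32).
   sp_bytes holds the stack bytes the dead call reads as arguments.
   Returns true if every insn recorded in *arg_stores (one bit per insn
   index) may be deleted together with the call, false to punt.
   check_reads models the proposed change: bytes read by some other
   insn are dropped from sp_bytes, so the store feeding them no longer
   matches and we punt on it.  */
static bool toy_find_call_stack_args(const struct toy_insn *insns, int n,
                                     uint32_t sp_bytes, bool check_reads,
                                     uint32_t *arg_stores)
{
  *arg_stores = 0;
  for (int i = n - 1; i >= 0 && sp_bytes != 0; i--) {
    if (insns[i].kind == TOY_OTHER)
      continue;
    uint32_t m = toy_byte_mask(insns[i].off, insns[i].size);
    if (insns[i].kind == TOY_LOAD) {
      if (check_reads)
        sp_bytes &= ~m;         /* somebody else reads these bytes */
      continue;
    }
    if (m & ~sp_bytes) {        /* store touches bytes not in sp_bytes */
      *arg_stores = 0;
      return false;             /* punt */
    }
    sp_bytes &= ~m;
    *arg_stores |= UINT32_C(1) << i;
  }
  if (sp_bytes != 0) {          /* some argument bytes were never found */
    *arg_stores = 0;
    return false;
  }
  return true;
}

/* Approximation of the post-RA sequence before the __udivdi3 call.  */
static const struct toy_insn toy_post_ra[4] = {
  { TOY_STORE, 0, 8 },  /* insn 118: i >> 7 stored to the sp+0 slot */
  { TOY_LOAD,  0, 1 },  /* insn 42: low byte read back for the multiply */
  { TOY_OTHER, 0, 0 },  /* insn 43: the multiplication itself */
  { TOY_STORE, 8, 8 },  /* insn 44: constant 12 stored to the sp+8 slot */
};
```

Calling toy_find_call_stack_args (toy_post_ra, 4, toy_byte_mask (0, 16), false, &stores) returns true and records both stores, which corresponds to deleting the store that insn 42 still reads from. With check_reads set to true the same call returns false, i.e. the model punts when it reaches the store to the sp+0 slot.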