https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71374
Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-06-02
                 CC|                            |vmakarov at gcc dot gnu.org
          Component|target                      |rtl-optimization
     Ever confirmed|0                           |1

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
This is a register allocator failure, as is evident from an even more
simplified testcase:

int a, b, c;

extern inline void
fn1 (void *p1, void *p2)
{
  __asm__ ("#": "=&c" (a), "=&D" (b), "=&S" (c): "r" (p2), "2" (p2));
}

LRA gets the following RTX:

(insn 10 4 7 2 (parallel [
            (set (reg:SI 89)
                (asm_operands:SI ("#") ("=&c") 0 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 90)
                (asm_operands:SI ("#") ("=&D") 1 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 91)
                (asm_operands:SI ("#") ("=&S") 2 [
                        (reg/v/f:DI 88 [ p2 ])
                        (reg/v/f:DI 88 [ p2 ])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (expr_list:REG_DEAD (reg/v/f:DI 88 [ p2 ])
        (expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))

Please note how both asm inputs are tied together through the p2 variable:
they are the same pseudo, reg 88. LRA ties the "2" matching constraint with
the "=&S" earlyclobber output constraint, which is operand 2 (BTW: matching
an earlyclobber output is allowed), but it can't resolve the tie through p2.
This results in:

(insn 10 4 15 2 (parallel [
            (set (reg:SI 2 cx [89])
                (asm_operands:SI ("#") ("=&c") 0 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 5 di [90])
                (asm_operands:SI ("#") ("=&D") 1 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (set (reg:SI 4 si [orig:88 p2 ] [88])
                (asm_operands:SI ("#") ("=&S") 2 [
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                        (reg/v/f:DI 4 si [orig:88 p2 ] [88])
                    ]
                     [
                        (asm_input:DI ("r") t.c:4)
                        (asm_input:DI ("2") t.c:4)
                    ]
                     [] t.c:4))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (nil))

Here register si is allocated to asm input 0 (the "r" operand). Output
operand 2 is also in register si, so this violates the earlyclobber
requirement that "this operand may not lie in a register that is read by the
instruction or as part of any memory address". In the above case, LRA should
copy asm input 0 to a temporary register of an appropriate class.

Confirmed as rtl-optimization problem.
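
The rule being violated can be written down as a small standalone check.
What follows is only an illustrative sketch, not LRA code: struct asm_operand,
earlyclobber_ok and the two allocation_* helpers are hypothetical names made
up for this example. The hard register numbers 2 (cx), 4 (si) and 5 (di) are
taken from the dump above; register 0 for the temporary is an assumption that
stands for "any register other than cx, di, si".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one asm operand after register allocation.
   This is not a GCC data structure.  */
struct asm_operand {
    int hard_reg;       /* hard register number, e.g. 2 = cx, 4 = si, 5 = di */
    bool earlyclobber;  /* outputs only: constraint contains '&' */
    int matches;        /* inputs only: index of the tied output, or -1 */
};

/* Return true if no earlyclobber output shares a hard register with an
   input, except for an input tied to that same output by a matching
   constraint.  */
bool earlyclobber_ok(const struct asm_operand *outs, size_t n_outs,
                     const struct asm_operand *ins, size_t n_ins)
{
    for (size_t i = 0; i < n_ins; i++) {
        /* A matching constraint must name an existing output.  */
        assert(ins[i].matches < (int)n_outs);
        for (size_t o = 0; o < n_outs; o++) {
            if (!outs[o].earlyclobber
                || outs[o].hard_reg != ins[i].hard_reg)
                continue;
            if (ins[i].matches == (int)o)
                continue;   /* tied operand: sharing is intended */
            return false;   /* untied input overlaps an earlyclobber */
        }
    }
    return true;
}

/* The allocation from the second dump: outputs in cx, di, si and both
   inputs in si.  */
bool allocation_from_report_ok(void)
{
    const struct asm_operand outs[] = {
        { 2, true, -1 },    /* "=&c" */
        { 5, true, -1 },    /* "=&D" */
        { 4, true, -1 },    /* "=&S" */
    };
    const struct asm_operand ins[] = {
        { 4, false, -1 },   /* "r": placed in si, not tied */
        { 4, false, 2 },    /* "2": tied to output 2 (si) */
    };
    return earlyclobber_ok(outs, 3, ins, 2);
}

/* The same allocation after copying the "r" input to a temporary.
   Register 0 is an assumed free register, chosen for illustration.  */
bool allocation_with_temp_ok(void)
{
    const struct asm_operand outs[] = {
        { 2, true, -1 },
        { 5, true, -1 },
        { 4, true, -1 },
    };
    const struct asm_operand ins[] = {
        { 0, false, -1 },   /* "r": now in the temporary */
        { 4, false, 2 },    /* "2": still tied to output 2 (si) */
    };
    return earlyclobber_ok(outs, 3, ins, 2);
}
```

Under this model, allocation_from_report_ok() reports a violation, because
the untied "r" input overlaps output operand 2 in si, while
allocation_with_temp_ok() passes. The "2" input may stay in si in both cases,
since sharing a register with its matched output is the point of a matching
constraint.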