https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71475
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Target|                            |i?86-*-*
             Status|ASSIGNED                    |NEW
          Component|c                           |target
           Assignee|rguenth at gcc dot gnu.org  |unassigned at gcc dot gnu.org
   Target Milestone|---                         |6.2
            Summary|Optimization of copying     |[6/7 Regression]
                   |into long double looses     |Optimization of copying
                   |bytes                       |into long double looses
                   |                            |bytes

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ok, so we arrive (slowly) at

  MEM[(char * {ref-all})&d0] = 0x41414141414141414141414141414141;
  d0.3_6 = d0;
  d = d0.3_6;
  d.4_7 = d;
  printf ("after memcpy: %Lf\n", d.4_7);

which is then optimized by CSE to

  MEM[(char * {ref-all})&d0] = 0x41414141414141414141414141414141;
  d = 4.35573826932891467758901725805789285479666446831741854231e+96;
  printf ("after memcpy: %Lf\n",
4.35573826932891467758901725805789285479666446831741854231e+96);

and thus the issue must be that FP literal passing to printf is broken
somehow.
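The testcase source (t.c) is not reproduced in this comment, so the following is only a sketch of the pattern the dump implies, not the reporter's actual file. The names d0 and d come from the GIMPLE above; fill_and_copy, count_lost_bytes and hex_dump are hypothetical helpers. It compares the object representation byte by byte, which avoids passing the value through %Lf at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Globals named after the variables visible in the GIMPLE dump. */
long double d0, d;

/* Assumed shape of the testcase: fill d0 with 0x41 bytes, then copy it to d. */
void fill_and_copy(void)
{
    memset(&d0, 0x41, sizeof d0);
    memcpy(&d, &d0, sizeof d);
}

/* Count bytes of d that differ from d0 (0 means the copy kept every byte). */
size_t count_lost_bytes(void)
{
    const unsigned char *src = (const unsigned char *)&d0;
    const unsigned char *dst = (const unsigned char *)&d;
    size_t lost = 0;
    for (size_t i = 0; i < sizeof d; i++)
        if (src[i] != dst[i])
            lost++;
    return lost;
}

/* Write the bytes of *p as hex digits into out; out needs 2*n+1 chars. */
void hex_dump(const void *p, size_t n, char *out, size_t cap)
{
    const unsigned char *b = (const unsigned char *)p;
    assert(cap >= 2 * n + 1);
    for (size_t i = 0; i < n; i++)
        snprintf(out + 2 * i, 3, "%02x", b[i]);
    out[2 * n] = '\0';
}
```

Calling fill_and_copy() and then printing hex_dump() of d0 and d side by side shows whether any bytes went missing in the copy, independent of how the value is later passed to printf.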
The first call is expanded to:

;; printf ("after memset: %Lf\n", d0.0_1);

(insn 28 27 29 (set (mem:XF (pre_dec:DI (reg/f:DI 7 sp)) [1 S16 A128])
        (reg:XF 99 [ d0.0_1 ])) t.c:11 -1
     (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
        (nil)))

(insn 29 28 30 (set (reg:DI 5 di)
        (symbol_ref/f:DI ("*.LC0") [flags 0x2] <var_decl 0x7fa583c676c0
*.LC0>)) t.c:11 -1
     (nil))

(insn 30 29 31 (set (reg:QI 0 ax)
        (const_int 0 [0])) t.c:11 -1
     (nil))

(call_insn 31 30 0 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("printf") [flags 0x41] <function_decl
0x7fa583b25000 printf>) [0 __builtin_printf S1 A8])
            (const_int 16 [0x10]))) t.c:11 -1
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("printf") [flags 0x41]
<function_decl 0x7fa583b25000 printf>)
        (nil))

while the second:

;; printf ("after memcpy: %Lf\n",
4.35573826932891467758901725805789285479666446831741854231e+96);

(insn 53 52 54 (set (reg:XF 116)
        (mem/u/c:XF (symbol_ref/u:DI ("*.LC2") [flags 0x2]) [1 S16 A128]))
t.c:20 -1
     (nil))

(insn 54 53 55 (set (mem:XF (pre_dec:DI (reg/f:DI 7 sp)) [1 S16 A128])
        (reg:XF 116)) t.c:20 -1
     (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
        (nil)))

(insn 55 54 56 (set (reg:DI 5 di)
        (symbol_ref/f:DI ("*.LC3") [flags 0x2] <var_decl 0x7fa583c677e0
*.LC3>)) t.c:20 -1
     (nil))

(insn 56 55 57 (set (reg:QI 0 ax)
        (const_int 0 [0])) t.c:20 -1
     (nil))

(call_insn 57 56 0 (set (reg:SI 0 ax)
        (call (mem:QI (symbol_ref:DI ("printf") [flags 0x41] <function_decl
0x7fa583b25000 printf>) [0 __builtin_printf S1 A8])
            (const_int 16 [0x10]))) t.c:20 -1
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("printf") [flags 0x41]
<function_decl 0x7fa583b25000 printf>)
        (nil))

with assembly:

        movabsq $4702111234474983745, %rax
...
        movq    %rax, 16(%rsp)
        movq    %rax, 24(%rsp)
        xorl    %eax, %eax
        fldt    16(%rsp)
        fld     %st(0)
        fstpt   32(%rsp)
        fstpt   (%rsp)
        call    printf

vs.
        fldt    .LC2(%rip)
        movabsq $4702111234474983745, %rdx
        movabsq $4702111234474983745, %rax
        subq    $16, %rsp
        .cfi_def_cfa_offset 80
        movq    %rax, 16(%rsp)
        movq    %rdx, 24(%rsp)
        movl    $.LC3, %edi
        xorl    %eax, %eax
        fld     %st(0)
        fstpt   32(%rsp)
        fstpt   (%rsp)
        call    printf

So confirmed, but it looks like an RTL optimization / target (regstack
related) issue.  The testcase itself is a GCC 6 regression but I guess that
using a FP literal (if one can express that value...) would produce the same
issue?
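One way to probe the closing question is to pass the folded constant to a printf-family function directly and compare the result with the same value loaded at run time. This is a sketch only: the literal is copied from the CSE dump with an L suffix added, the helper names are hypothetical, and whether this decimal literal yields the same bit pattern as the memset'd object is not established in this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Literal copied from the CSE-folded dump, with an L suffix added. */
#define FOLDED_VALUE \
    4.35573826932891467758901725805789285479666446831741854231e+96L

/* Format the literal directly, so the call site has a constant argument. */
void format_literal(char *out, size_t cap)
{
    int len = snprintf(out, cap, "%Lf", FOLDED_VALUE);
    assert(len > 0 && (size_t)len < cap);
}

/* Format the same value read back through a volatile object, so the
   argument has to be loaded at run time instead of being folded. */
void format_runtime(char *out, size_t cap)
{
    volatile long double v = FOLDED_VALUE;
    long double x = v;
    int len = snprintf(out, cap, "%Lf", x);
    assert(len > 0 && (size_t)len < cap);
}

/* Returns 1 when both call shapes produce the same text. */
int literal_matches_runtime(void)
{
    char a[256], b[256];
    format_literal(a, sizeof a);
    format_runtime(b, sizeof b);
    return strcmp(a, b) == 0;
}
```

If literal_matches_runtime() returned 0 on an affected compiler, that would suggest the literal path alone is enough to show the problem; a result of 1 would not rule the bug out, since the literal may not encode the same bytes as the original object.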