[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #7 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-12 11:37:27 UTC ---
> This patch adds expand_args to track library calls to expand arguments. We add
> vzeroupper when expand_args == 1, which indicates we are expanding the actual
> function.

This looks complicated. A simpler approach could be to record the AVX state in the CUMULATIVE_ARGS structure and transfer it to cfun->machine only at the end of the argument processing. Comments and code in calls.c appear to guarantee that

  targetm.calls.function_arg (args_so_far, VOIDmode, void_type_node, true)

is invoked immediately before the call instruction is emitted.
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

H.J. Lu hjl.tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #29645|0                           |1
        is obsolete|                            |

--- Comment #8 from H.J. Lu hjl.tools at gmail dot com 2013-03-12 16:48:45 UTC ---
Created attachment 29655
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29655
A patch

This patch adds callee_pass_avx256_p and callee_return_avx256_p to ix86_args. ix86_function_arg copies them to cfun->machine when it is called with VOIDmode, which happens just before the call is emitted. cfun->machine->callee_return_avx256_p is set in init_cumulative_args for ix86_function_ok_for_sibcall.
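
The mechanism described in comment #8 can be sketched as a small toy model. This is not the code from attachment 29655: every name prefixed with toy_ or TOY_ is hypothetical, the structures only stand in for ix86_args and cfun->machine, and the ordering of calls is an assumption based on the description above. The point is only that per-call state kept in the cumulative-args structure survives a nested argument scan, and that the value published on the VOIDmode call is the last one written before the call is emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for machine modes; TOY_V8SFmode models a
   256-bit AVX argument, TOY_VOIDmode the end-of-arguments marker.  */
enum toy_mode { TOY_VOIDmode, TOY_V8SFmode, TOY_OTHERmode };

/* Toy stand-in for cfun->machine (per-function state).  */
struct toy_machine { bool callee_pass_avx256_p; };

/* Toy stand-in for ix86_args (per-call state).  */
struct toy_cum {
  int caller;
  bool callee_pass_avx256_p;
};

static struct toy_machine toy_cfun_machine;

/* Resets only the per-call state, never the per-function state.  */
static void toy_init_cumulative_args(struct toy_cum *cum, int caller)
{
  assert(cum != NULL);
  cum->caller = caller;
  cum->callee_pass_avx256_p = false;
}

/* Records AVX use in the per-call state and publishes it to the
   per-function state only when called with the VOIDmode marker.  */
static void toy_function_arg(struct toy_cum *cum, enum toy_mode mode)
{
  assert(cum != NULL);
  if (cum->caller && mode == TOY_VOIDmode) {
    toy_cfun_machine.callee_pass_avx256_p = cum->callee_pass_avx256_p;
    return;
  }
  if (mode == TOY_V8SFmode)
    cum->callee_pass_avx256_p = true;
}

/* Models expanding a call with one AVX argument and one struct
   argument, optionally with a nested libcall for the struct copy.  */
bool toy_expand_call_fixed(bool struct_needs_block_move)
{
  struct toy_cum outer, nested;

  toy_init_cumulative_args(&outer, 1);
  toy_function_arg(&outer, TOY_V8SFmode);   /* AVX argument */
  toy_function_arg(&outer, TOY_OTHERmode);  /* struct argument */

  if (struct_needs_block_move) {
    /* Nested libcall: it has its own per-call state and may even
       publish its own (non-AVX) result first.  */
    toy_init_cumulative_args(&nested, 1);
    toy_function_arg(&nested, TOY_OTHERmode);
    toy_function_arg(&nested, TOY_VOIDmode);
  }

  /* Last step before the outer call is emitted.  */
  toy_function_arg(&outer, TOY_VOIDmode);
  return toy_cfun_machine.callee_pass_avx256_p;
}
```

In this model toy_expand_call_fixed returns true with or without the nested libcall, because the outer call's state is never touched by the nested scan.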
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #9 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-12 17:07:40 UTC ---
> This patch adds callee_pass_avx256_p and callee_return_avx256_p to ix86_args.
> ix86_function_arg copies them to cfun->machine when ix86_function_arg is
> called with VOIDmode, which is called just before emitting call.
> cfun->machine->callee_return_avx256_p is set in init_cumulative_args for
> ix86_function_ok_for_sibcall.

This looks good to me, but I don't know the i386 back-end well (and of course cannot approve anything).

Btw, you should add a comment before the new

  if (cum->caller && mode == VOIDmode)

block explaining why you need to do this dance on the caller side.

Thanks for working on this.
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #6 from H.J. Lu hjl.tools at gmail dot com 2013-03-11 19:34:29 UTC ---
Created attachment 29645
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29645
A patch

This patch adds expand_args to track library calls made while expanding arguments. We add vzeroupper when expand_args == 1, which indicates we are expanding the actual function.
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #5 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-10 15:51:03 UTC ---
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index c1f6c88..8005207 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -5562,7 +5562,7 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, /* Argument info to initialize */
>    memset (cum, 0, sizeof (*cum));
>
>    /* Initialize for the current callee. */
> -  if (caller)
> +  if (caller && fndecl)
>      {
>        cfun->machine->callee_pass_avx256_p = false;
>        cfun->machine->callee_return_avx256_p = false;
>
> fixes it.

I don't think it's correct: fndecl is NULL_TREE for indirect calls as well.
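
To illustrate the objection in comment #5, here is a toy model of what a "caller && fndecl" test could do for an indirect call. It is a sketch under assumptions, not GCC code: the toy_ names are hypothetical, a string pointer stands in for the fndecl tree, and the stale-flag outcome is one plausible consequence inferred from the comment rather than something stated in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for cfun->machine (per-function state).  */
struct toy_machine2 { bool callee_pass_avx256_p; };

static struct toy_machine2 toy_state;

/* Models init_cumulative_args.  With require_fndecl set, the reset is
   skipped whenever no declaration is known, which is the case for
   libcalls but also for calls through a function pointer.  */
static void toy_init_args(const char *fndecl, int caller,
                          bool require_fndecl)
{
  if (caller && (!require_fndecl || fndecl != NULL))
    toy_state.callee_pass_avx256_p = false;
}

/* Expands a direct call that passes an AVX argument, then starts an
   indirect call with no AVX arguments, and reports the flag that the
   second call would see.  */
bool toy_flag_after_indirect_call(bool require_fndecl)
{
  /* Direct call: declaration known, AVX argument recorded.  */
  toy_init_args("takes_avx", 1, require_fndecl);
  toy_state.callee_pass_avx256_p = true;
  assert(toy_state.callee_pass_avx256_p);

  /* Indirect call: no declaration, no AVX arguments.  */
  toy_init_args(NULL, 1, require_fndecl);
  return toy_state.callee_pass_avx256_p;
}
```

With require_fndecl set, the flag left over from the first call is still set when the indirect call is expanded; without it, the flag is cleared as intended.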
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com
      Known to work|                            |4.8.0

--- Comment #3 from Uros Bizjak ubizjak at gmail dot com 2013-03-08 09:24:27 UTC ---
Adding the author of the 4.6/4.7 vzeroupper pass to CC.
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #4 from H.J. Lu hjl.tools at gmail dot com 2013-03-08 17:21:18 UTC ---
The caller info is lost by

(gdb) bt
#0  init_cumulative_args (cum=0x7fffc3f0, fntype=0x71472e70, libname=0x0,
    fndecl=0x0, caller=1)
    at /export/gnu/import/git/gcc-release/gcc/config/i386/i386.c:5562
#1  0x00640011 in block_move_libcall_safe_for_call_parm ()
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1244
#2  0x0063fc07 in emit_block_move_hints (x=0x71470780, y=0x71470750,
    size=0x7133a530, method=BLOCK_OP_CALL_PARM, expected_align=0,
    expected_size=-1)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1139
#3  0x0063ff3c in emit_block_move (x=0x71470780, y=0x71470750,
    size=0x7133a530, method=BLOCK_OP_CALL_PARM)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1206
#4  0x0064693a in emit_push_insn (x=0x71470750, mode=BLKmode,
    type=0x71472690, size=0x7133a530, align=64, partial=0, reg=0x0, extra=4,
    args_addr=0x71334560, args_so_far=0x7133a470, reg_parm_stack_space=0,
    alignment_pad=0x7133a470)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:4116
#5  0x0056d1ad in store_one_arg (arg=0x7fffc760, argblock=0x71334560,
    flags=0, variable_size=0, reg_parm_stack_space=0)
    at /export/gnu/import/git/gcc-release/gcc/calls.c:4646
#6  0x00568b51 in expand_call (exp=0x71333cb0, target=0x71461f00, ignore=0)
    at /export/gnu/import/git/gcc-release/gcc/calls.c:3023

when storing struct S on stack. This patch:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c1f6c88..8005207 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5562,7 +5562,7 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, /* Argument info to initialize */
   memset (cum, 0, sizeof (*cum));

   /* Initialize for the current callee. */
-  if (caller)
+  if (caller && fndecl)
     {
       cfun->machine->callee_pass_avx256_p = false;
       cfun->machine->callee_return_avx256_p = false;

fixes it.
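
The backtrace in comment #4 can be read as the following toy model. It is a simplified sketch, not GCC code: all toy_ names are hypothetical, a single global stands in for cfun->machine, and the order of events (arguments scanned first, struct pushed afterwards) is an assumption based on the frames shown above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for cfun->machine (per-function state).  */
struct toy_machine1 { bool callee_pass_avx256_p; };

static struct toy_machine1 toy_m;

/* Models init_cumulative_args on the 4.6/4.7 branches: in caller mode
   it resets the per-function flag unconditionally.  */
static void toy_init_cumulative_args_old(int caller)
{
  if (caller)
    toy_m.callee_pass_avx256_p = false;
}

/* Models function_arg seeing a 256-bit AVX argument.  */
static void toy_record_avx256_arg(void)
{
  toy_m.callee_pass_avx256_p = true;
}

/* Models expand_call for a call with an AVX argument plus a struct
   pushed on the stack.  Pushing the struct may go through a block
   move, whose libcall safety check re-enters the init routine with
   caller=1 and fndecl=0, as in frames #0 and #1.  */
bool toy_expand_call_old(bool struct_needs_block_move)
{
  toy_init_cumulative_args_old(1);        /* outer call starts clean */
  assert(!toy_m.callee_pass_avx256_p);

  toy_record_avx256_arg();                /* AVX argument recorded */

  if (struct_needs_block_move)
    toy_init_cumulative_args_old(1);      /* nested scan wipes the flag */

  /* Value the vzeroupper logic sees when the call is emitted.  */
  return toy_m.callee_pass_avx256_p;
}
```

In this model the flag is still set when no block move is needed, but it is cleared when the struct copy triggers the nested scan, so a vzeroupper would be considered safe even though an AVX argument is live.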
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-03-07
      Known to work|                            |4.5.3
   Target Milestone|---                         |4.6.4
            Summary|[4.7 regression] vzeroupper |[4.6/4.7 regression]
                   |clobbers argument with AVX  |vzeroupper clobbers
                   |                            |argument with AVX
     Ever Confirmed|0                           |1
      Known to fail|                            |4.6.3, 4.7.2

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org 2013-03-07 10:09:43 UTC ---
Confirmed, same code generated on the 4.6 branch. Works on the 4.5 branch where no vzeroupper is inserted. Likewise no vzeroupper on trunk.
[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560

--- Comment #2 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-07 10:17:56 UTC ---
> Confirmed, same code generated on the 4.6 branch. Works on the 4.5 branch
> where no vzeroupper is inserted. Likewise no vzeroupper on trunk.

Thanks for confirming. The vzeroupper pass didn't exist on the 4.5 branch and has been rewritten to use the mode-switching machinery on mainline.