[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-04-03 Thread rguenth at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



Richard Biener rguenth at gcc dot gnu.org changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-12 Thread ebotcazou at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #7 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-12 
11:37:27 UTC ---

> This patch adds expand_args to track library calls to
> expand arguments.  We add vzeroupper when expand_args == 1,
> which indicates we are expanding the actual function.



This looks complicated.  A simpler approach could be to record the AVX state in
the CUMULATIVE_ARGS structure and transfer it to cfun->machine only at the end
of the argument processing.  Comments and code in calls.c appear to guarantee
that

  targetm.calls.function_arg (args_so_far, VOIDmode, void_type_node, true)

is invoked immediately before the call instruction is emitted.
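
For illustration only, here is a minimal standalone sketch of that deferred-commit idea. It is not GCC code: struct machine_state, struct cum_args, enum arg_mode and every function name below are hypothetical stand-ins for cfun->machine, CUMULATIVE_ARGS, machine modes and the target hooks. The flag is accumulated in the per-call structure and only published when the end-of-arguments marker (VOIDmode in the real hook) is seen, so a nested scratch initialization in between cannot disturb it.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are the real GCC types.  */
enum arg_mode { MODE_VOID, MODE_V8SF, MODE_BLK };

struct machine_state { bool callee_pass_avx256_p; };        /* ~ cfun->machine */
struct cum_args { int caller; bool pass_avx256_p; };        /* ~ CUMULATIVE_ARGS */

static struct machine_state machine;

/* Initialization touches only the per-call structure, never the
   per-function state.  */
static void init_args(struct cum_args *cum, int caller)
{
  memset(cum, 0, sizeof *cum);
  cum->caller = caller;
}

/* Record AVX use locally; publish it only on the end-of-arguments
   marker, which is assumed to arrive right before the call is emitted.  */
static void function_arg(struct cum_args *cum, enum arg_mode mode)
{
  if (cum->caller && mode == MODE_VOID)
    {
      machine.callee_pass_avx256_p = cum->pass_avx256_p;
      return;
    }
  if (mode == MODE_V8SF)
    cum->pass_avx256_p = true;
}

/* One call passing a 256-bit vector and a stack struct.  The nested
   init models the libcall safety probe; it only resets its own scratch
   structure.  Returns the flag as seen when the call is emitted.  */
bool expand_call_deferred(void)
{
  struct cum_args cum, scratch;

  init_args(&cum, 1);
  function_arg(&cum, MODE_V8SF);   /* 256-bit vector argument */
  init_args(&scratch, 1);          /* nested probe, harmless here */
  function_arg(&cum, MODE_BLK);    /* struct passed on the stack */
  function_arg(&cum, MODE_VOID);   /* end marker: commit */
  return machine.callee_pass_avx256_p;
}
```

In this model expand_call_deferred() reports the flag as set, because nothing between the first argument and the end marker writes to the shared state.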


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-12 Thread hjl.tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



H.J. Lu hjl.tools at gmail dot com changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #29645|0                           |1
        is obsolete|                            |



--- Comment #8 from H.J. Lu hjl.tools at gmail dot com 2013-03-12 16:48:45 
UTC ---

Created attachment 29655
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29655
A patch



This patch adds callee_pass_avx256_p and callee_return_avx256_p
to ix86_args.  ix86_function_arg copies them to cfun->machine
when it is called with VOIDmode, which happens just before the
call is emitted.  cfun->machine->callee_return_avx256_p is set
in init_cumulative_args for ix86_function_ok_for_sibcall.


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-12 Thread ebotcazou at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #9 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-12 
17:07:40 UTC ---

> This patch adds callee_pass_avx256_p and callee_return_avx256_p
> to ix86_args.  ix86_function_arg copies them to cfun->machine
> when it is called with VOIDmode, which happens just before the
> call is emitted.  cfun->machine->callee_return_avx256_p is set
> in init_cumulative_args for ix86_function_ok_for_sibcall.



This looks good to me, but I don't know the i386 back-end much (and of course
cannot approve anything).  Btw, you should add a comment before the new

  if (cum->caller && mode == VOIDmode)

block explaining why you need to do this dance on the caller side.



Thanks for working on this.


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-11 Thread hjl.tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #6 from H.J. Lu hjl.tools at gmail dot com 2013-03-11 19:34:29 
UTC ---

Created attachment 29645
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29645
A patch



This patch adds expand_args to track library calls to
expand arguments.  We add vzeroupper when expand_args == 1,
which indicates we are expanding the actual function.


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-10 Thread ebotcazou at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #5 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-10 
15:51:03 UTC ---

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index c1f6c88..8005207 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -5562,7 +5562,7 @@ init_cumulative_args (CUMULATIVE_ARGS *cum,  /* Argument info to initialize */
>    memset (cum, 0, sizeof (*cum));
>
>    /* Initialize for the current callee.  */
> -  if (caller)
> +  if (caller && fndecl)
>      {
>        cfun->machine->callee_pass_avx256_p = false;
>        cfun->machine->callee_return_avx256_p = false;
>
> fixes it.



I don't think it's correct: fndecl is also NULL_TREE for indirect calls.
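
A small standalone model of this objection, under stated assumptions: struct machine_state and both function names are hypothetical and only mimic the shape of the guard in the proposed patch. With the reset guarded by a non-null fndecl, an indirect call (fndecl NULL) no longer resets the flag, so it can inherit stale state from an earlier call in the same function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of cfun->machine.  */
struct machine_state { bool callee_pass_avx256_p; };

static struct machine_state machine;

/* Models the proposed guard: reset only when a declaration is known.  */
static void init_args_guarded(int caller, const void *fndecl)
{
  if (caller && fndecl)
    machine.callee_pass_avx256_p = false;
}

/* A direct call with a 256-bit argument followed by an indirect call
   with no AVX arguments.  Returns the flag seen for the indirect call.  */
bool flag_after_indirect_call(void)
{
  static const int some_decl = 0;       /* placeholder for a real decl */

  init_args_guarded(1, &some_decl);     /* direct call: flag reset */
  machine.callee_pass_avx256_p = true;  /* it passes a 256-bit vector */

  init_args_guarded(1, NULL);           /* indirect call: no reset */
  return machine.callee_pass_avx256_p;  /* stale value from the first call */
}
```

In this model the indirect call still sees the flag set by the preceding direct call, which is the kind of leftover state the unguarded reset was there to prevent.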


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-08 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



Uros Bizjak ubizjak at gmail dot com changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com
      Known to work|                            |4.8.0



--- Comment #3 from Uros Bizjak ubizjak at gmail dot com 2013-03-08 09:24:27 
UTC ---

Adding the author of the 4.6/4.7 vzeroupper pass to CC.


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-08 Thread hjl.tools at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #4 from H.J. Lu hjl.tools at gmail dot com 2013-03-08 17:21:18 
UTC ---

The caller info is lost by

(gdb) bt
#0  init_cumulative_args (cum=0x7fffc3f0, fntype=0x71472e70,
    libname=0x0, fndecl=0x0, caller=1)
    at /export/gnu/import/git/gcc-release/gcc/config/i386/i386.c:5562
#1  0x00640011 in block_move_libcall_safe_for_call_parm ()
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1244
#2  0x0063fc07 in emit_block_move_hints (x=0x71470780,
    y=0x71470750, size=0x7133a530, method=BLOCK_OP_CALL_PARM,
    expected_align=0, expected_size=-1)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1139
#3  0x0063ff3c in emit_block_move (x=0x71470780, y=0x71470750,
    size=0x7133a530, method=BLOCK_OP_CALL_PARM)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:1206
#4  0x0064693a in emit_push_insn (x=0x71470750, mode=BLKmode,
    type=0x71472690, size=0x7133a530, align=64, partial=0, reg=0x0,
    extra=4, args_addr=0x71334560, args_so_far=0x7133a470,
    reg_parm_stack_space=0, alignment_pad=0x7133a470)
    at /export/gnu/import/git/gcc-release/gcc/expr.c:4116
#5  0x0056d1ad in store_one_arg (arg=0x7fffc760,
    argblock=0x71334560, flags=0, variable_size=0, reg_parm_stack_space=0)
    at /export/gnu/import/git/gcc-release/gcc/calls.c:4646
#6  0x00568b51 in expand_call (exp=0x71333cb0,
    target=0x71461f00, ignore=0)
    at /export/gnu/import/git/gcc-release/gcc/calls.c:3023

when storing struct S on stack.  This patch:



diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c1f6c88..8005207 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5562,7 +5562,7 @@ init_cumulative_args (CUMULATIVE_ARGS *cum,  /* Argument info to initialize */
   memset (cum, 0, sizeof (*cum));

   /* Initialize for the current callee.  */
-  if (caller)
+  if (caller && fndecl)
     {
       cfun->machine->callee_pass_avx256_p = false;
       cfun->machine->callee_return_avx256_p = false;

fixes it.
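
As a rough standalone model of the failure shown in the backtrace (not GCC code; struct machine_state, struct cum_args and all function names are hypothetical): the outer call records that a 256-bit argument is passed, then pushing the struct triggers a nested caller-side initialization for the libcall safety check, and that nested initialization resets the shared flags before the call is emitted.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for cfun->machine and CUMULATIVE_ARGS.  */
struct machine_state
{
  bool callee_pass_avx256_p;
  bool callee_return_avx256_p;
};

struct cum_args { int caller; };

static struct machine_state machine;

/* Models the unpatched code: every caller-side init resets the flags.  */
static void init_args_unpatched(struct cum_args *cum, int caller)
{
  memset(cum, 0, sizeof *cum);
  cum->caller = caller;
  if (caller)
    {
      machine.callee_pass_avx256_p = false;
      machine.callee_return_avx256_p = false;
    }
}

/* Models the nested initialization reached via the block-move libcall
   check in the backtrace (fndecl is NULL there, caller is 1).  */
static void probe_block_move_libcall(void)
{
  struct cum_args scratch;
  init_args_unpatched(&scratch, 1);
}

/* One call passing a 256-bit vector and a struct stored on the stack.
   Returns the flag as seen when the call would be emitted.  */
bool expand_call_unpatched(void)
{
  struct cum_args cum;

  init_args_unpatched(&cum, 1);
  machine.callee_pass_avx256_p = true;  /* 256-bit argument recorded */
  probe_block_move_libcall();           /* struct push: flags reset */
  return machine.callee_pass_avx256_p;
}
```

In this model the flag is already cleared by the time the call is emitted, which matches the symptom in the bug title: a pass consulting the flag would see no 256-bit argument and could insert vzeroupper in front of the call.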


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-07 Thread rguenth at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



Richard Biener rguenth at gcc dot gnu.org changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-03-07
      Known to work|                            |4.5.3
   Target Milestone|---                         |4.6.4
            Summary|[4.7 regression] vzeroupper |[4.6/4.7 regression]
                   |clobbers argument with AVX  |vzeroupper clobbers
                   |                            |argument with AVX
     Ever Confirmed|0                           |1
      Known to fail|                            |4.6.3, 4.7.2



--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org 2013-03-07 
10:09:43 UTC ---

Confirmed, same code generated on the 4.6 branch.  Works on the 4.5 branch
where no vzeroupper is inserted.  Likewise no vzeroupper on trunk.


[Bug target/56560] [4.6/4.7 regression] vzeroupper clobbers argument with AVX

2013-03-07 Thread ebotcazou at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56560



--- Comment #2 from Eric Botcazou ebotcazou at gcc dot gnu.org 2013-03-07 
10:17:56 UTC ---

> Confirmed, same code generated on the 4.6 branch.  Works on the 4.5 branch
> where no vzeroupper is inserted.  Likewise no vzeroupper on trunk.

Thanks for confirming.  The vzeroupper pass didn't exist on the 4.5 branch and
has been rewritten to use the mode-switching machinery on mainline.