Hello!

The problem was the ordering of the vzeroupper removal pass and the
pad-return pass, both of which run from the mach (machine-dependent
reorg) pass. The attached patch changes the pass ordering so that
vzeroupper removal runs before the pad-return pass. The pad-return pass
then (correctly) finds an empty function and emits a long return.
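
To illustrate why the order matters, here is a minimal toy model in C. It
is only a sketch and does not use GCC's RTL or internal APIs: the enum,
the struct and all toy_* functions are hypothetical names made up for
this example, and the toy vzeroupper pass simply assumes every
vzeroupper is redundant. It models only the empty-function case
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds (hypothetical, not GCC's RTL).  */
enum toy_insn { TOY_VZEROUPPER, TOY_RET, TOY_LONG_RET };

struct toy_func
{
  enum toy_insn insns[4];
  size_t n;
};

/* Stand-in for the vzeroupper removal pass.  Assumption: every
   vzeroupper in the function is redundant and can be deleted.  */
static void
toy_delete_vzeroupper (struct toy_func *f)
{
  size_t out = 0;
  for (size_t i = 0; i < f->n; i++)
    if (f->insns[i] != TOY_VZEROUPPER)
      f->insns[out++] = f->insns[i];
  f->n = out;
}

/* Stand-in for the pad-return pass.  Only the empty-function case is
   modelled: a body consisting of a lone return gets a long return.  */
static void
toy_pad_return (struct toy_func *f)
{
  assert (f->n > 0);
  if (f->n == 1 && f->insns[0] == TOY_RET)
    f->insns[0] = TOY_LONG_RET;
}

/* Run both toy passes in the requested order on a function whose body
   is "vzeroupper; ret" and return the final instruction.  */
enum toy_insn
toy_reorg (bool vzeroupper_first)
{
  struct toy_func f = { { TOY_VZEROUPPER, TOY_RET }, 2 };

  if (vzeroupper_first)
    {
      toy_delete_vzeroupper (&f);
      toy_pad_return (&f);
    }
  else
    {
      toy_pad_return (&f);
      toy_delete_vzeroupper (&f);
    }

  return f.insns[f.n - 1];
}
```

With the old order (toy_reorg (false)) the pad-return stand-in still sees
the vzeroupper, so the function does not look empty and keeps its short
return even after the vzeroupper is deleted. With the new order
(toy_reorg (true)) the function is already empty when the pad-return
stand-in runs, so it gets the long return.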

2011-05-04  Uros Bizjak  <ubiz...@gmail.com>

        * config/i386/i386.c (ix86_reorg): Run move_or_delete_vzeroupper first.

Tested on x86_64-pc-linux-gnu {,-m32} AVX target, committed to mainline SVN.

Uros.
Index: i386.c
===================================================================
--- i386.c      (revision 173376)
+++ i386.c      (working copy)
@@ -30444,6 +30444,10 @@ ix86_reorg (void)
      with old MDEP_REORGS that are not CFG based.  Recompute it now.  */
   compute_bb_for_insn ();
 
+  /* Run the vzeroupper optimization if needed.  */
+  if (TARGET_VZEROUPPER)
+    move_or_delete_vzeroupper ();
+
   if (optimize && optimize_function_for_speed_p (cfun))
     {
       if (TARGET_PAD_SHORT_FUNCTION)
@@ -30455,10 +30459,6 @@ ix86_reorg (void)
        ix86_avoid_jump_mispredicts ();
 #endif
     }
-
-  /* Run the vzeroupper optimization if needed.  */
-  if (TARGET_VZEROUPPER)
-    move_or_delete_vzeroupper ();
 }
 
 /* Return nonzero when QImode register that must be represented via REX prefix
