It's nice to use the processors vector arithmetic to good effect, but it's all for naught when there are too many moves from/to general registers cluttering up the loop. With a double-vector reduction variable, the standard final reduction code got so awkward that the register allocator decided that the reduction variable must live in general purpose
registers, not only after the loop, but across the loop patch.
Splitting the reduction to force the first step to be done as a vector operation
seemed the obvious solution. The hook was called, but the vectorizer still
generated the vanilla final reduction code. It turns out that the reduction splitting
was calculated, but the result not used, and the calculation started anew.

The attached patch fixes this.

bootstrapped and regression tested on x86_64-pc-linux-gnu .
2018-10-31  Joern Rennecke  <joern.renne...@riscy-ip.com>

        * tree-vect-loop.c (vect_create_epilog_for_reduction):
        If we split the reduction, use the result in Case 3 too.

Index: tree-vect-loop.c
===================================================================
--- tree-vect-loop.c    (revision 266008)
+++ tree-vect-loop.c    (working copy)
@@ -5139,6 +5139,7 @@ vect_create_epilog_for_reduction (vec<tree> vect_d
       while (sz > sz1)
        {
          gcc_assert (!slp_reduc);
+         gcc_assert (new_phis.length () == 1);
          sz /= 2;
          vectype1 = get_vectype_for_scalar_type_and_size (scalar_type, sz);
 
@@ -5301,7 +5302,12 @@ vect_create_epilog_for_reduction (vec<tree> vect_d
           FOR_EACH_VEC_ELT (new_phis, i, new_phi)
             {
               int bit_offset;
-              if (gimple_code (new_phi) == GIMPLE_PHI)
+
+             /* If we did reduction splitting, make sure to use the result.  */
+             if (!slp_reduc && new_phis.length () == 1
+                 && new_temp != new_phi_result)
+               vec_temp = new_temp;
+             else if (gimple_code (new_phi) == GIMPLE_PHI)
                 vec_temp = PHI_RESULT (new_phi);
               else
                 vec_temp = gimple_assign_lhs (new_phi);

Reply via email to