Based on PR 34683 and related to PR 34705.

Richard writes there (text is slightly edited by me):

The problem is we have loads of unpartitionable SFTs. The FE generated IL
causes the first alias pass to emit too conservative
alias info as well:

  # VUSE <MPT.5965_208230>
  D.6049_4652 = atmp.1110.data;
  D.6050_4653 = (complex(kind=4)[0:] *) D.6049_4652;
  D.6051_4654 = S.1113_289 + D.3340_4634;
  # VUSE <SFT.......>
  D.6052_4655 = (*D.6050_4653)[D.6051_4654];

I suppose the atmp.1110.data type is something like (void *), so the
cast is required.  But this really pessimizes the middle-end IL and
it looks like the FE knows it will be complex(kind=4)[4] at the point
of creation.  Note fixing this will also improve optimization and thus
runtime performance.

I also see the FE creates lots of array temporaries:

  struct array2_complex(kind=4) atmp.1093;
  complex(kind=4) A.1094[4];
  struct array2_complex(kind=4) atmp.1095;
  complex(kind=4) A.1096[4];
  struct array2_complex(kind=4) atmp.1100;
  complex(kind=4) A.1101[4];
  struct array2_complex(kind=4) atmp.1102;
  complex(kind=4) A.1103[4];
  struct array2_complex(kind=4) atmp.1106;
  complex(kind=4) A.1107[4];
  real(kind=4) D.3326;
...

instead of re-using a single one.  This also causes internal representation
to blow up.

So, to sum up, the situation could be significantly improved by improving
the FE.


For array temporaries the pinning of SFTs happens because we have the address
of the actual array _data_ in the IL:

  # SFT.10_52 = VDEF <SFT.10_51(D)>
  atmp.0.data = &A.1;

the array descriptor itself is not the problem (the redundant descriptors still
consume memory, but should not cause compile-time problems where observed).

As of optimization, the conversion sequence

  atmp.0.data = &A.1;
  D.540_5 = atmp.0.data;
  D.541_6 = (real(kind=4)[0:] *) D.540_5;
...
  (*D.541_6)[S.5_1] = D.545_14;

can be optimized to

  A.1[S.5_1] = D.545_14;

with the following patch that makes sure we use (real(kind=4)[4] *) instead
of the unknown size array type.

Index: trans-types.c
===================================================================
--- trans-types.c       (revision 131336)
+++ trans-types.c       (working copy)
@@ -1511,10 +1511,12 @@ gfc_get_array_type_bounds (tree etype, i
   /* TODO: known offsets for descriptors.  */
   GFC_TYPE_ARRAY_OFFSET (fat_type) = NULL_TREE;

-  /* We define data as an unknown size array. Much better than doing
+  /* We define data as an array with the correct size. Much better than doing
      pointer arithmetic.  */
   arraytype =
-    build_array_type (etype, gfc_array_range_type);
+    build_array_type (etype, build_range_type (gfc_array_index_type,
+       gfc_index_zero_node, int_const_binop (MINUS_EXPR, stride,
+       integer_one_node, 0)));
   arraytype = build_pointer_type (arraytype);
   GFC_TYPE_ARRAY_DATAPTR_TYPE (fat_type) = arraytype;

(the patch needs to be adjusted for the cases stride is not the actual array
size, but you should get the idea)


-- 
           Summary: FE should reuse array temporaries, reduce temporaties
                    and tell ME the array-size type
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34706

Reply via email to