Based on PR 34683 and related to PR 34705. Richard writes there (text is slightly edited by me):
The problem is we have loads of unpartitionable SFTs. The FE-generated IL causes the first alias pass to emit too conservative alias info as well:

  # VUSE <MPT.5965_208230>
  D.6049_4652 = atmp.1110.data;
  D.6050_4653 = (complex(kind=4)[0:] *) D.6049_4652;
  D.6051_4654 = S.1113_289 + D.3340_4634;
  # VUSE <SFT.......>
  D.6052_4655 = (*D.6050_4653)[D.6051_4654];

I suppose the atmp.1110.data type is something like (void *), so the cast is required. But this really pessimizes the middle-end IL and it looks like the FE knows it will be complex(kind=4)[4] at the point of creation. Note fixing this will also improve optimization and thus runtime performance.

I also see the FE creates lots of array temporaries:

  struct array2_complex(kind=4) atmp.1093;
  complex(kind=4) A.1094[4];
  struct array2_complex(kind=4) atmp.1095;
  complex(kind=4) A.1096[4];
  struct array2_complex(kind=4) atmp.1100;
  complex(kind=4) A.1101[4];
  struct array2_complex(kind=4) atmp.1102;
  complex(kind=4) A.1103[4];
  struct array2_complex(kind=4) atmp.1106;
  complex(kind=4) A.1107[4];
  real(kind=4) D.3326;
  ...

instead of re-using a single one. This also causes the internal representation to blow up.

So, to sum up, the situation could be significantly improved by improving the FE.

For array temporaries the pinning of SFTs happens because we have the address of the actual array _data_ in the IL:

  # SFT.10_52 = VDEF <SFT.10_51(D)>
  atmp.0.data = &A.1;

the array descriptor itself is not the problem (the redundant descriptors still consume memory, but should not cause compile-time problems where observed).

As for optimization, the conversion sequence

  atmp.0.data = &A.1;
  D.540_5 = atmp.0.data;
  D.541_6 = (real(kind=4)[0:] *) D.540_5;
  ...
  (*D.541_6)[S.5_1] = D.545_14;

can be optimized to

  A.1[S.5_1] = D.545_14;

with the following patch that makes sure we use (real(kind=4)[4] *) instead of the unknown size array type.
Index: trans-types.c
===================================================================
--- trans-types.c       (revision 131336)
+++ trans-types.c       (working copy)
@@ -1511,10 +1511,12 @@ gfc_get_array_type_bounds (tree etype, i
   /* TODO: known offsets for descriptors.  */
   GFC_TYPE_ARRAY_OFFSET (fat_type) = NULL_TREE;
 
-  /* We define data as an unknown size array. Much better than doing
+  /* We define data as an array with the correct size. Much better than doing
      pointer arithmetic.  */
   arraytype =
-    build_array_type (etype, gfc_array_range_type);
+    build_array_type (etype, build_range_type (gfc_array_index_type,
+                      gfc_index_zero_node, int_const_binop (MINUS_EXPR, stride,
+                      integer_one_node, 0)));
   arraytype = build_pointer_type (arraytype);
   GFC_TYPE_ARRAY_DATAPTR_TYPE (fat_type) = arraytype;
 

(the patch needs to be adjusted for the cases where stride is not the actual array size, but you should get the idea)

-- 
Summary: FE should reuse array temporaries, reduce temporaries and tell ME the array-size type
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: burnus at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34706
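
As a rough illustration of the two pointer types discussed in the report, the difference can be sketched in plain C. This is only an analogy and not FE or middle-end code: struct descriptor and both function names are hypothetical, and the array length 4 is taken from the examples quoted in the report. The first function reads through a pointer to an array of unknown size, similar to the (real(kind=4)[0:] *) cast; the second uses a pointer to an array of known size, similar to (real(kind=4)[4] *).

```c
#include <stddef.h>

/* Hypothetical stand-in for an array descriptor; only the untyped
   data pointer matters for this illustration.  */
struct descriptor
{
  void *data;
};

/* Read element I through a pointer to an array of unknown size.
   The pointer type itself says nothing about the extent of the
   object being accessed.  */
float
load_unknown_size (const struct descriptor *d, size_t i)
{
  float (*p)[] = (float (*)[]) d->data;
  return (*p)[i];
}

/* Read element I through a pointer to an array of four floats.
   The extent of the accessed object is part of the pointer type.
   The caller must still pass I < 4.  */
float
load_known_size (const struct descriptor *d, size_t i)
{
  float (*p)[4] = (float (*)[4]) d->data;
  return (*p)[i];
}
```

Both functions return the same value for a valid index; they differ only in how much the type of p tells the compiler about the storage behind d->data.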