On Sat, Nov 08, 2014 at 12:32:05AM +0100, Mircea Namolaru wrote: > Hello, > > This is the patch for unroll and jam optimizations. It is based on the > code written by Tobias from graphite-optimize-isl.c (the code was > unreachable till now) extended and enabled it via a new option > -floop-unroll-jam. > The patch takes advantage of the new isl based code generator introduced > recently > in GCC (in fact of the possible options for building the AST). > > The code generated for this optimization in the case of non-constant loop > bounds > initially looks as below. This is not very useful because the standard GCC > unrolling don't succeed to unroll the most inner loop. > > ISL AST generated by ISL: > for (int c0 = 0; c0 < HEIGHT; c0 += 4) > for (int c1 = 0; c1 < LENGTH - 3; c1 += 1) > for (int c2 = c0; c2 <= min(HEIGHT - 1, c0 + 3); c2 += 1) > S_4(c2, c1); > > Now, the "separating class" option (set for unroll and jam) produces this > nice loop > structure:
I'm not sure if Tobi or Albert have told you, but the separation_class option is going to be phased out since its design is fundamentally flawed. If you can't wait until isl-0.15, then I guess you have no choice but to use this option, but you should realize that it will remain frozen in its current broken state (until it is removed at some point). > + > +/* Set the unroll AST built option. */ > + > +static __isl_give isl_union_map * > +generate_isl_options_2 (scop_p scop, __isl_take isl_union_set *domain, > + int dim, int cl) Not a very descriptive function name. > +{ > + isl_map *map; > + isl_space *space, *space_sep; > + isl_ctx *ctx; > + isl_union_map *mapu; > + int nsched = get_max_schedule_dimensions (scop); > + > + ctx = scop->ctx; > + space_sep = isl_space_alloc(ctx, 0, 1, 1); > + space_sep = isl_space_wrap(space_sep); > + space_sep = isl_space_set_tuple_name(space_sep, isl_dim_set, > + "separation_class"); > + space = isl_set_get_space (scop->context); > + space_sep = isl_space_align_params(space_sep, isl_space_copy(space)); > + space = isl_space_map_from_domain_and_range(space, space_sep); > + space = isl_space_add_dims (space,isl_dim_in, nsched); Inconsistent spacing, sometimes with a space before "(" and sometimes without. I also noticed some tab/spacing inconsistencies later in the patch, but I already removed that part in my reply. > + /* Extract the original and auxiliar maps from pbb->transformed. > + Set pbb->transformed to the original map. */ > + psmap = &smap; > + psmap->n = 0; > + res = isl_map_foreach_basic_map (pbb->transformed, separate_map, (void > *)psmap); > + gcc_assert (res == 0); > + > + isl_map_free(pbb->transformed); > + pbb->transformed = isl_map_copy(psmap->map_arr[0]); > + I have no idea what this pbb->transformed is supposed to represent, but you appear to be assuming that it has exactly two disjuncts and that they appear in some order. Now, perhaps you have explicitly checked that this map has two disjuncts, but then you should probably bring the check closer since any operation on sets that you perform could change the internal representation (even of other sets). However, in no way can you assume that isl_map_foreach_basic_map will iterate over these disjuncts in any specific order. > + bb_domain_s = isl_set_apply (isl_set_copy (bb_domain), > + psmap->map_arr[1]); > + bb_domain_r = isl_set_apply (bb_domain, psmap->map_arr[0]); > + > + bb_domain_s = isl_set_complement (bb_domain_s); > + bb_domain_r = isl_set_subtract(bb_domain_r,bb_domain_s); Why are you subtracting the complement of a set instead of just intersecting with that set? > + /* Create an identity map for everything except DimToVectorize and the > + point loop. */ > + for (i = 0; i < ScheduleDimensions; i++) > + { > + if (i == DimToVectorize) > + continue; > + > + c = isl_equality_alloc (isl_local_space_copy (LocalSpace)); > + > + isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1); > + isl_constraint_set_coefficient_si (c, isl_dim_out, i, 1); > + > + TilingMap = isl_map_add_constraint (TilingMap, c); You can use isl_map_equate instead. > @@ -350,17 +425,31 @@ > SuffixSchedule); > isl_band_list_free (Children); Gaack! gcc is using band forests too... I guess I'll have to keep them around for a while then.... skimo