Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-22 Thread Richard Guenther
On Wed, 22 Jun 2011, Michael Matz wrote:

> Hi,
> 
> On Tue, 21 Jun 2011, Richard Guenther wrote:
> 
> > I failed to see where the scalarizer inserts the temporary vars it 
> > creates into the scope blocks (thus the gimplify.c hunk ...).  Any help 
> > here is welcome.
> 
> The scoping of the scalarizer is a bit funny.  gfc_start_scalarized_body 
> sets up scopes for all dimensions and leaves with the 'body' scope open.  
> The bound expressions are inserted (for your testcase) into the 'block' 
> scope.  In between there are the loop->code[n] scopes.
> 
> You can't decide to not go into gfc_start_scalarized_body early, because 
> lse.expr will only be set later, so you have to 
> properly finish_block all these in-between blocks and wire them into 
> loop.pre and then block, like gfc_trans_scalarizing_loops would do.  Of 
> course you don't want to actually generate loops, which 
> gfc_trans_scalarizing_loops does.  So you have to manually unwind the 
> blocks.  If you do that, then you also don't need the ??? marked 
> gfc_add_block_to_block (&block, &loop.pre).
> 
> So, the code in the VLA_VIEW_EXPR case should be roughly this:
> 
>   else if (TREE_CODE (lse.expr) == VLA_VIEW_EXPR)
>     {
>       int dim;
>       stmtblock_t *pblock;
> 
>       pblock = &body;
>       for (dim = 0; dim < loop.dimen + loop.codimen; dim++)
>         {
>           n = loop.order[dim];
>           tmp = gfc_finish_block (pblock);
>           gfc_add_expr_to_block (&loop.code[n], tmp);
>           loop.loopvar[n] = NULL_TREE;
>           pblock = &loop.code[n];
>         }
> 
>       tmp = gfc_finish_block (pblock);
>       gfc_add_expr_to_block (&loop.pre, tmp);
>       gfc_add_block_to_block (&block, &loop.pre);
>       gfc_add_block_to_block (&block, &loop.post);
>       gfc_cleanup_loop (&loop);
>     }
> 
> Sorry, no real patch, my quilt queue is busted somehow.

Thanks, that seems to work.

The following is an updated Fortran patch (requiring an updated
middle-end patch as well) which also adds code to handle
intrinsic ALL/ANY (in the hope more runtime tests get coverage
that way).

Richard.

Index: trunk/gcc/fortran/trans-array.c
===
*** trunk.orig/gcc/fortran/trans-array.c 2011-06-22 14:28:27.0 +0200
--- trunk/gcc/fortran/trans-array.c 2011-06-22 14:39:40.0 +0200
*** gfc_conv_scalarized_array_ref (gfc_se *
*** 2570,2575 
--- 2570,2637 
    int n;
  
    info = &se->ss->data.info;
+ 
+     {
+       tree vv = build_vl_exp (VLA_VIEW_EXPR, 2 + 2 * info->dimen);
+       tree vi = build_vl_exp (VLA_IDX_EXPR, 2 + info->dimen);
+       tree offset = NULL_TREE;
+       tree elt_type;
+       tree type;
+       tree tem;
+       elt_type = TREE_TYPE (se->expr);
+       if (POINTER_TYPE_P (elt_type))
+         elt_type = TREE_TYPE (elt_type);
+       while (TREE_CODE (elt_type) == ARRAY_TYPE)
+         elt_type = TREE_TYPE (elt_type);
+       type = elt_type;
+       for (n = 0; n < info->dimen; ++n)
+         {
+           TREE_OPERAND (vv, 2 + 2 * n)  /* extent */
+             = build2 (MINUS_EXPR, TREE_TYPE (info->end[n]),
+                       info->end[n], info->start[n]);
+           TREE_OPERAND (vv, 3 + 2 * n)  /* stride */
+             = info->stride[n];
+           /* Accumulate start offset.  */
+           if (offset)
+             offset = fold_build2_loc (input_location,
+                                       PLUS_EXPR, TREE_TYPE (offset),
+                                       offset,
+                                       fold_build2_loc (input_location,
+                                                        MULT_EXPR, TREE_TYPE (offset),
+                                                        info->start[n],
+                                                        info->stride[n]));
+           else
+             offset = fold_build2_loc (input_location,
+                                       MULT_EXPR, TREE_TYPE (info->start[n]),
+                                       info->start[n],
+                                       info->stride[n]);
+           type = build_array_type (type, build_index_type
+                                    (TREE_OPERAND (vv, 2 + 2 * n)));
+         }
+       /* Start address.  */
+       tem = info->descriptor;
+       if (!POINTER_TYPE_P (TREE_TYPE (tem)))
+         tem = build_fold_addr_expr (tem);
+       TREE_OPERAND (vv, 1)
+         = fold_build2 (MEM_REF, type,
+                        fold_build2_loc (input_location,
+                                         POINTER_PLUS_EXPR,
+                                         TREE_TYPE (tem), tem,
+                                         fold_convert (sizetype, offset)),
+                        build_int_cst (build_pointer_type (elt_type), 0));
+       /* Type.  */
+       TREE_TYPE (vv) = type;
+       TREE_OPERAND (vi, 1) = vv;
+       for (n = 0; n < info->dimen; ++n)
+         {
+           TREE_OPERAND (vi, 2 + n) /* Index placeholder.  */
+             = se-

Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-22 Thread Michael Matz
Hi,

On Tue, 21 Jun 2011, Richard Guenther wrote:

> I failed to see where the scalarizer inserts the temporary vars it 
> creates into the scope blocks (thus the gimplify.c hunk ...).  Any help 
> here is welcome.

The scoping of the scalarizer is a bit funny.  gfc_start_scalarized_body 
sets up scopes for all dimensions and leaves with the 'body' scope open.  
The bound expressions are inserted (for your testcase) into the 'block' 
scope.  In between there are the loop->code[n] scopes.

You can't decide to not go into gfc_start_scalarized_body early, because 
lse.expr will only be set later, so you have to 
properly finish_block all these in-between blocks and wire them into 
loop.pre and then block, like gfc_trans_scalarizing_loops would do.  Of 
course you don't want to actually generate loops, which 
gfc_trans_scalarizing_loops does.  So you have to manually unwind the 
blocks.  If you do that, then you also don't need the ??? marked 
gfc_add_block_to_block (&block, &loop.pre).

So, the code in the VLA_VIEW_EXPR case should be roughly this:

  else if (TREE_CODE (lse.expr) == VLA_VIEW_EXPR)
    {
      int dim;
      stmtblock_t *pblock;

      pblock = &body;
      for (dim = 0; dim < loop.dimen + loop.codimen; dim++)
        {
          n = loop.order[dim];
          tmp = gfc_finish_block (pblock);
          gfc_add_expr_to_block (&loop.code[n], tmp);
          loop.loopvar[n] = NULL_TREE;
          pblock = &loop.code[n];
        }

      tmp = gfc_finish_block (pblock);
      gfc_add_expr_to_block (&loop.pre, tmp);
      gfc_add_block_to_block (&block, &loop.pre);
      gfc_add_block_to_block (&block, &loop.post);
      gfc_cleanup_loop (&loop);
    }

Sorry, no real patch, my quilt queue is busted somehow.


Ciao,
Michael.
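
The unwinding order described above can be tried out on a toy model. The sketch below is not gfortran code: toy_block, toy_add, toy_finish and toy_unwind are made-up stand-ins for stmtblock_t, gfc_add_expr_to_block, gfc_finish_block and the loop in the snippet, and a "block" is just a string. It only illustrates how finishing each block and adding it to the next one out (body, then loop.code[n] in loop.order, then loop.pre) keeps the statements nested without generating any loops.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_TEXT 256

/* Hypothetical stand-in for stmtblock_t: accumulated statement text.  */
typedef struct
{
  char text[TOY_MAX_TEXT];
} toy_block;

/* Append one statement to a block (stand-in for gfc_add_expr_to_block).  */
static void
toy_add (toy_block *b, const char *stmt)
{
  assert (strlen (b->text) + strlen (stmt) < TOY_MAX_TEXT);
  strcat (b->text, stmt);
}

/* Close a block: wrap its contents in one scope, return that as a single
   statement in OUT and leave the block empty (stand-in for
   gfc_finish_block).  */
static void
toy_finish (toy_block *b, char *out)
{
  assert (strlen (b->text) + 3 <= TOY_MAX_TEXT);
  strcpy (out, "{");
  strcat (out, b->text);
  strcat (out, "}");
  b->text[0] = '\0';
}

/* Unwind body -> code[order[0]] -> ... -> code[order[dimen-1]] -> pre
   without emitting any loop around the nested scopes.  */
void
toy_unwind (toy_block *body, toy_block *code, const int *order, int dimen,
            toy_block *pre)
{
  char tmp[TOY_MAX_TEXT];
  toy_block *pblock = body;

  for (int dim = 0; dim < dimen; dim++)
    {
      int n = order[dim];
      toy_finish (pblock, tmp);
      toy_add (&code[n], tmp);
      pblock = &code[n];
    }

  toy_finish (pblock, tmp);
  toy_add (pre, tmp);
}
```

With a body holding "S;", two dimension blocks holding "a;" and "b;", and an order of {0, 1}, the body ends up innermost and everything is appended after whatever loop.pre (here: pre) already contained.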


Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-21 Thread Mikael Morin
On Tuesday 21 June 2011 17:08:17 Richard Guenther wrote:
> The following is a patch^Whack that should handle almost all
> array exprs.  Well, tried on the following testcase:
> 
> subroutine foo (dst, bar, ni, nj)
>   integer, intent(in) :: ni, nj
>   double precision, intent(in) :: bar(ni, nj)
>   double precision, intent(out) :: dst(ni, nj)
> 
>   dst(2:ni-1,2:nj-1) = 5. + bar(2:ni-1,2:nj-1)
> end subroutine foo
> 
> I failed to see where the scalarizer inserts the temporary vars it
> creates into the scope blocks (thus the gimplify.c hunk ...).  Any
> help here is welcome.
If you are talking about the partial offset calculations in nested loops, they 
come from gfc_trans_preloop_setup.
Sorry, no time to help now.

Mikael

PS: nice to see MEAs come back. :-)



Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-21 Thread Paul Richard Thomas
Dear Richi,

The point of entry for assignments is in trans-expr.c:

tree
gfc_trans_assignment (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
                      bool dealloc)
{

a bunch of special cases

  /* Fallback to the scalarizer to generate explicit loops.  */
  return gfc_trans_assignment_1 (expr1, expr2, init_flag, dealloc);
}

One of the special cases is copying one array to another.  You might
do well to use this function and to create special cases as you go
along.  It's well clear of the scalarizer :-)
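
The shape of that entry point (try the special cases first, fall back to the scalarizer otherwise) can be sketched as below. This is only an illustration in plain C: every toy_* name is hypothetical, and the real special cases in gfc_trans_assignment test far more than a single enum value.

```c
#include <assert.h>

/* Hypothetical classification of an assignment.  */
typedef enum
{
  TOY_WHOLE_ARRAY_COPY,   /* e.g. dst = src */
  TOY_SECTION_EXPR        /* e.g. dst(2:n-1) = 5. + bar(2:n-1) */
} toy_assign_kind;

/* Which code path produced the translation.  */
typedef enum
{
  TOY_PATH_NONE,          /* special case did not apply */
  TOY_PATH_ARRAY_COPY,
  TOY_PATH_SCALARIZER
} toy_path;

/* A special case: handles whole-array copies, declines anything else.  */
static toy_path
toy_trans_array_copy (toy_assign_kind kind)
{
  return kind == TOY_WHOLE_ARRAY_COPY ? TOY_PATH_ARRAY_COPY : TOY_PATH_NONE;
}

/* The generic fallback: always applicable.  */
static toy_path
toy_trans_assignment_1 (toy_assign_kind kind)
{
  (void) kind;
  return TOY_PATH_SCALARIZER;
}

/* Entry point: special cases first, scalarizer as the fallback.  */
toy_path
toy_trans_assignment (toy_assign_kind kind)
{
  toy_path path;

  assert (kind == TOY_WHOLE_ARRAY_COPY || kind == TOY_SECTION_EXPR);
  path = toy_trans_array_copy (kind);
  if (path != TOY_PATH_NONE)
    return path;

  return toy_trans_assignment_1 (kind);
}
```

A new lowering would slot in as one more special case ahead of the fallback, which is what keeps it clear of the scalarizer.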

BTW the scalarizer does not modify the lhs/rhs expressions (expr1/expr2).

When doing something similar in the past, I gave the lhs a special
name and selected on that so that I did not have to mess around with
the conditions for the special cases.

I hope that this helps.

Cheers

Paul

On Tue, Jun 21, 2011 at 5:08 PM, Richard Guenther wrote:
> On Tue, 21 Jun 2011, Richard Guenther wrote:
>
>>
>> This forward-ports the middle-end array patch.  Patch status is the
>> same as with the last forward-port from 2009 - the scalarizer cannot
>> deal with control flow and it cannot insert temporaries when
>> required (so it operates in a mode that requires the frontend to
>> insert those that are necessary).
>>
>> Expected FAILs of the C/C++ frontend-hack testcases are
>>
>> FAIL: g++.dg/tree-ssa/mea-1.C (test for excess errors)
>> FAIL: gcc.dg/mea-1.c (test for excess errors)
>>
>> the gimplifier isn't able to properly translate the C/C++ builtin hack
>>
>> FAIL: g++.dg/tree-ssa/mea-10.C (internal compiler error)
>> FAIL: g++.dg/tree-ssa/mea-10.C (test for excess errors)
>> FAIL: gcc.dg/mea-15.c (internal compiler error)
>> FAIL: gcc.dg/mea-15.c (test for excess errors)
>>
>> aforementioned case of not handling control flow (both testcases
>> simulating U(:) = WHERE (U(:) < 0, 0) which would be handled
>> if the scalarizer sees an if-converted COND_EXPR).
>>
>> Not further tested (I'll try bootstrapping now).  I didn't touch
>> the ChangeLog.
>>
>> Disclaimer: During the London Gathering we discussed how to eventually
>> move forward with respect to Fortran using middle-end arrays.  This is
>> step 1: forward-port the existing patch.
>>
>> Hints on where I could lower simple Fortran array expressions from
>> the frontend are appreciated.  Simple first example:
>>
>>   subroutine copy (dst, src, ni, nj)
>>     integer, intent(in) :: ni, nj
>>     double precision, intent(out) :: dst(ni, nj)
>>     double precision, intent(in)  :: src(ni, nj)
>>     dst = src
>>   end subroutine copy
>>
>> If I have a first Fortran bit working I will probably disentangle
>> the C/C++ frontend hacks and put the rest on a branch.
>
> The following is a patch^Whack that should handle almost all
> array exprs.  Well, tried on the following testcase:
>
> subroutine foo (dst, bar, ni, nj)
>  integer, intent(in) :: ni, nj
>  double precision, intent(in) :: bar(ni, nj)
>  double precision, intent(out) :: dst(ni, nj)
>
>  dst(2:ni-1,2:nj-1) = 5. + bar(2:ni-1,2:nj-1)
> end subroutine foo
>
> I failed to see where the scalarizer inserts the temporary vars it
> creates into the scope blocks (thus the gimplify.c hunk ...).  Any
> help here is welcome.
>
> The patch also currently runs into type verification issues that
> are bugs in the scalarizer (I'll fix that), thus the tree-cfg.c hunk.
> And the scalarizer has the issue that it doesn't run at -O0, so we
> can't expand there (probably the Fortran scalarizer should be
> used at -O0?).  I guess I'll fix it by running it unconditionally
> (in the past SSA info wasn't available at -O0 but it relies on that).
>
> For some reason (I guess because of the type mismatches) we
> miscompile the loop at -O2 (disabling VRP helps).
>
> Anyway, just as an update, if you want to desperately experiment ;)
>
> Richard.
>
> Index: trunk/gcc/fortran/trans-array.c
> ===
> *** trunk.orig/gcc/fortran/trans-array.c        2011-06-14 12:41:32.0 +0200
> --- trunk/gcc/fortran/trans-array.c     2011-06-21 16:52:41.0 +0200
> *** gfc_conv_scalarized_array_ref (gfc_se *
> *** 2570,2575 
> --- 2570,2622 
>    int n;
>
>    info = &se->ss->data.info;
> +
> +     {
> +       tree vv = build_vl_exp (VLA_VIEW_EXPR, 2 + 2 * info->dimen);
> +       tree vi = build_vl_exp (VLA_IDX_EXPR, 2 + info->dimen);
> +       tree offset = build_int_cst (gfc_array_index_type, 0);
> +       tree elt_type = double_type_node; /* FIXME */
> +       tree type = elt_type;
> +       for (n = 0; n < info->dimen; ++n)
> +       {
> +         TREE_OPERAND (vv, 2 + 2 * n)  /* extent */
> +           = build2 (MINUS_EXPR, TREE_TYPE (info->end[n]),
> +                     info->end[n], info->start[n]);
> +         TREE_OPERAND (vv, 3 + 2 * n)  /* stride */
> +           = info->stride[n];
> +         /* Accumulate start offset.  */
> +         offset = fold_build2_loc (input_loca

Re: [PATCH] Middle-end arrays, forward-ported to trunk (again)

2011-06-21 Thread Richard Guenther
On Tue, 21 Jun 2011, Richard Guenther wrote:

> 
> This forward-ports the middle-end array patch.  Patch status is the
> same as with the last forward-port from 2009 - the scalarizer cannot
> deal with control flow and it cannot insert temporaries when
> required (so it operates in a mode that requires the frontend to
> insert those that are necessary).
> 
> Expected FAILs of the C/C++ frontend-hack testcases are
> 
> FAIL: g++.dg/tree-ssa/mea-1.C (test for excess errors)
> FAIL: gcc.dg/mea-1.c (test for excess errors)
> 
> the gimplifier isn't able to properly translate the C/C++ builtin hack
> 
> FAIL: g++.dg/tree-ssa/mea-10.C (internal compiler error)
> FAIL: g++.dg/tree-ssa/mea-10.C (test for excess errors)
> FAIL: gcc.dg/mea-15.c (internal compiler error)
> FAIL: gcc.dg/mea-15.c (test for excess errors)
> 
> aforementioned case of not handling control flow (both testcases
> simulating U(:) = WHERE (U(:) < 0, 0) which would be handled
> if the scalarizer sees an if-converted COND_EXPR).
> 
> Not further tested (I'll try bootstrapping now).  I didn't touch
> the ChangeLog.
> 
> Disclaimer: During the London Gathering we discussed how to eventually
> move forward with respect to Fortran using middle-end arrays.  This is
> step 1: forward-port the existing patch.
> 
> Hints on where I could lower simple Fortran array expressions from
> the frontend are appreciated.  Simple first example:
> 
>   subroutine copy (dst, src, ni, nj)
>     integer, intent(in) :: ni, nj
>     double precision, intent(out) :: dst(ni, nj)
>     double precision, intent(in)  :: src(ni, nj)
>     dst = src
>   end subroutine copy
> 
> If I have a first Fortran bit working I will probably disentangle
> the C/C++ frontend hacks and put the rest on a branch.

The following is a patch^Whack that should handle almost all
array exprs.  Well, tried on the following testcase:

subroutine foo (dst, bar, ni, nj)
  integer, intent(in) :: ni, nj
  double precision, intent(in) :: bar(ni, nj)
  double precision, intent(out) :: dst(ni, nj)

  dst(2:ni-1,2:nj-1) = 5. + bar(2:ni-1,2:nj-1)
end subroutine foo
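
For reference, the array statement in this testcase corresponds to roughly the following hand-written loop nest. It is only a sketch in C, assuming column-major storage with 1-based Fortran indices mapped to 0-based offsets; the name foo_scalarized is made up for illustration, and this is not the code either scalarizer actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-scalarized equivalent of
     dst(2:ni-1,2:nj-1) = 5. + bar(2:ni-1,2:nj-1)
   Arrays are column-major as in Fortran: the 1-based element (i, j)
   lives at offset (i - 1) + (j - 1) * ni.  */
void
foo_scalarized (double *dst, const double *bar, int ni, int nj)
{
  assert (dst != NULL && bar != NULL && ni >= 0 && nj >= 0);

  for (int j = 2; j <= nj - 1; j++)       /* outer loop: second dimension */
    for (int i = 2; i <= ni - 1; i++)     /* inner loop: contiguous dimension */
      {
        size_t off = (size_t) (i - 1) + (size_t) (j - 1) * (size_t) ni;
        dst[off] = 5.0 + bar[off];
      }
}
```

Only the interior of dst is written; the boundary rows and columns keep whatever they held before the call.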

I failed to see where the scalarizer inserts the temporary vars it
creates into the scope blocks (thus the gimplify.c hunk ...).  Any
help here is welcome.

The patch also currently runs into type verification issues that
are bugs in the scalarizer (I'll fix that), thus the tree-cfg.c hunk.
And the scalarizer has the issue that it doesn't run at -O0, so we
can't expand there (probably the Fortran scalarizer should be
used at -O0?).  I guess I'll fix it by running it unconditionally
(in the past SSA info wasn't available at -O0 but it relies on that).

For some reason (I guess because of the type mismatches) we
miscompile the loop at -O2 (disabling VRP helps).

Anyway, just as an update, if you want to desperately experiment ;)

Richard.

Index: trunk/gcc/fortran/trans-array.c
===
*** trunk.orig/gcc/fortran/trans-array.c 2011-06-14 12:41:32.0 +0200
--- trunk/gcc/fortran/trans-array.c 2011-06-21 16:52:41.0 +0200
*** gfc_conv_scalarized_array_ref (gfc_se *
*** 2570,2575 
--- 2570,2622 
    int n;
  
    info = &se->ss->data.info;
+ 
+     {
+       tree vv = build_vl_exp (VLA_VIEW_EXPR, 2 + 2 * info->dimen);
+       tree vi = build_vl_exp (VLA_IDX_EXPR, 2 + info->dimen);
+       tree offset = build_int_cst (gfc_array_index_type, 0);
+       tree elt_type = double_type_node; /* FIXME */
+       tree type = elt_type;
+       for (n = 0; n < info->dimen; ++n)
+         {
+           TREE_OPERAND (vv, 2 + 2 * n)  /* extent */
+             = build2 (MINUS_EXPR, TREE_TYPE (info->end[n]),
+                       info->end[n], info->start[n]);
+           TREE_OPERAND (vv, 3 + 2 * n)  /* stride */
+             = info->stride[n];
+           /* Accumulate start offset.  */
+           offset = fold_build2_loc (input_location,
+                                     PLUS_EXPR, TREE_TYPE (offset),
+                                     offset,
+                                     fold_build2_loc (input_location,
+                                                      MULT_EXPR, TREE_TYPE (offset),
+                                                      info->start[n],
+                                                      info->stride[n]));
+           type = build_array_type (type, build_index_type
+                                    (TREE_OPERAND (vv, 2 + 2 * n)));
+         }
+       /* Start address.  */
+       TREE_OPERAND (vv, 1)
+         = build2 (MEM_REF, type,
+                   fold_build2_loc (input_location,
+                                    POINTER_PLUS_EXPR,
+                                    TREE_TYPE (info->descriptor),
+                                    info->descriptor,
+                                    fold_convert (sizetype, offset)),
+                   build_int_cst (build_pointer_type (elt_type), 0));
+       /* Type.  */
+       TREE_TYPE (vv) = type;
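
The arithmetic in the hunk above (per-dimension extent and stride operands, plus a start offset accumulated over all dimensions) can be checked in isolation with a small sketch. It mirrors only the arithmetic, with plain integers instead of trees; toy_view and toy_build_view are hypothetical names, and nothing here says whether these are the right operands for a finished implementation.

```c
#include <assert.h>

#define TOY_MAX_DIMEN 7

/* Toy mirror of what the hunk stores into the view expression.  */
typedef struct
{
  long offset;                  /* sum over n of start[n] * stride[n] */
  long extent[TOY_MAX_DIMEN];   /* end[n] - start[n], as in the hunk */
  long stride[TOY_MAX_DIMEN];   /* stride[n], copied through unchanged */
} toy_view;

/* Compute the per-dimension operands and the accumulated start offset
   the same way the loop over info->dimen does.  */
toy_view
toy_build_view (int dimen, const long *start, const long *end,
                const long *stride)
{
  toy_view v = { 0 };

  assert (dimen > 0 && dimen <= TOY_MAX_DIMEN);
  for (int n = 0; n < dimen; ++n)
    {
      v.extent[n] = end[n] - start[n];
      v.stride[n] = stride[n];
      /* Accumulate start offset.  */
      v.offset += start[n] * stride[n];
    }
  return v;
}
```

For start = {2, 3}, end = {9, 7} and stride = {1, 10} this gives extents {7, 4} and a start offset of 2*1 + 3*10 = 32.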