> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of bin.cheng
> Sent: Monday, November 04, 2013 4:35 PM
> To: 'Richard Biener'
> Cc: GCC Patches
> Subject: RE: [PATCH GCC]Simplify address expression in IVOPT
>
>
>
> > -----Original Message-----
> > From: Richard Biener [mailto:richard.guent...@gmail.com]
> > Sent: Wednesday, October 30, 2013 10:46 PM
> > To: Bin Cheng
> > Cc: GCC Patches
> > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT
> >
> > On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.ch...@arm.com> wrote:
> >
> > Hmm. I think you want what get_inner_reference_aff does, using the
> > return value of get_inner_reference as starting point for
> determine_base_object.
> > And you don't want to restrict yourselves so much on what exprs to
> process,
> > but only exclude DECL_Ps.
> Did you mean I should pass all address expressions to
> get_inner_reference_aff except the one with declaration operand?
> I am a little confused here why DECL_Ps should be handled specially?
Seems
> it can be handled properly by the get_* function. Anything important I
> missed?
>
> > Just amend get_inner_reference_aff to return the tree base object.
> I found it's inappropriate to do this because: functions like
> get_inner_reference* only accept reference expressions, while base_object
> has to be computed for other kinds of expressions too. Take gimple
> statement "a_1 = *ptr_1" as an example, the base passed to alloc_iv is
ptr_1;
> the base_object computed by determine_base_object is also ptr_1, which
> we can't rely on get_inner_reference* .
>
> Also It seems the comment before determine_base_object might be
> inaccurate:
> " Returns a memory object to that EXPR points." which I think is " Returns
a
> pointer pointing to the main memory object to that EXPR points."
> Right?
> >
> > Note that this isn't really "simplifying" but rather lowering all
> addresses.
> >
> > The other changes in this patch are unrelated, right?
> Right, I should have named this message like "refine cost computation of
> expressions in IVOPT" just as the patch file.
Hi,
I updated the patch according to review comments. Now it lowers all address
expressions except ones with DECL_P to get_inner_reference_aff, it also
computes base_object which is used later to ease determine_base_object.
Bootstrap and test on x86/x86_64/arm on going, is it OK if tests pass?
Thanks.
bin
gcc/testsuite/ChangeLog
2013-11-05 Bin Cheng <bin.ch...@arm.com>
* gcc.dg/tree-ssa/loop-2.c: Refine check condition.
* gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
* gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.
2013-11-05 Bin Cheng <bin.ch...@arm.com>
* tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions.
(get_shiftadd_cost): Check equality using operand_equal_p.
(force_expr_to_var_cost): Refactor the code. Handle type
conversion.
(split_address_cost): Call force_expr_to_var_cost.
* tree-affine.c (get_inner_reference_aff): Return base_addr.
* tree-affine.h (get_inner_reference_aff): Change prototype.
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (working copy)
@@ -27,7 +27,7 @@ void xxx(void)
/* { dg-final { scan-tree-dump-times " \\* \[^\\n\\r\]*=" 0 "optimized" } } */
/* { dg-final { scan-tree-dump-times "\[^\\n\\r\]*= \\* " 0 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "MEM" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MEM\\\[base" 1 "optimized" } } */
/* 17 * iter should be strength reduced. */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (working copy)
@@ -7,7 +7,8 @@
extern char a[];
-/* Can not infer loop iteration from array -- exit test can not be replaced.
*/
+/* Can not infer loop iteration from array -- exit test can not be
+ replaced by the array address. */
void foo (unsigned int i_width, TYPE dst)
{
unsigned long long i = 0;
@@ -21,5 +22,5 @@ void foo (unsigned int i_width, TYPE dst)
}
}
-/* { dg-final { scan-tree-dump-times "Replacing" 0 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "\[^:\]*if \\(.*j_\[0-9\]+.*\\)" 1
"ivopts"} } */
/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (revision 204117)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (working copy)
@@ -18,5 +18,5 @@ long foo(long* p, long* p2, int N1, int N2)
return s;
}
-/* { dg-final { scan-tree-dump-times "Replacing" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(.*p2.*\\)" 1
"ivopts"} } */
/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c (revision 204117)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -924,11 +924,29 @@ determine_base_object (tree expr)
static struct iv *
alloc_iv (tree base, tree step)
{
+ tree expr = base;
+ tree base_object = base;
struct iv *iv = XCNEW (struct iv);
gcc_assert (step != NULL_TREE);
+ /* Lower all address expressions except ones with DECL_P as oeprand.
+ By doing this:
+ 1) More accurate cost can be computed for address expressions;
+ 2) Duplicate candidates won't be created for bases in different
+ forms, like &a[0] and &a. */
+ STRIP_NOPS (expr);
+ if (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0)))
+ {
+ aff_tree comb;
+ double_int size;
+ base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0),
+ &comb, &size);
+ gcc_assert (base_object != NULL_TREE);
+ base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb));
+ }
+
iv->base = base;
- iv->base_object = determine_base_object (base);
+ iv->base_object = determine_base_object (base_object);
iv->step = step;
iv->biv_p = false;
iv->have_use_for = false;
@@ -3481,17 +3499,21 @@ get_shiftadd_cost (tree expr, enum machine_mode mo
int m = exact_log2 (int_cst_value (cst));
int maxm = MIN (BITS_PER_WORD, GET_MODE_BITSIZE (mode));
int sa_cost;
+ bool equal_p = false;
if (!(m >= 0 && m < maxm))
return false;
+ if (operand_equal_p (op1, mult, 0))
+ equal_p = true;
+
sa_cost = (TREE_CODE (expr) != MINUS_EXPR
? shiftadd_cost (speed, mode, m)
- : (mult == op1
+ : (equal_p
? shiftsub1_cost (speed, mode, m)
: shiftsub0_cost (speed, mode, m)));
res = new_cost (sa_cost, 0);
- res = add_costs (res, mult == op1 ? cost0 : cost1);
+ res = add_costs (res, equal_p ? cost0 : cost1);
STRIP_NOPS (multop);
if (!is_gimple_val (multop))
@@ -3585,30 +3607,14 @@ force_expr_to_var_cost (tree expr, bool speed)
op1 = TREE_OPERAND (expr, 1);
STRIP_NOPS (op0);
STRIP_NOPS (op1);
-
- if (is_gimple_val (op0))
- cost0 = no_cost;
- else
- cost0 = force_expr_to_var_cost (op0, speed);
-
- if (is_gimple_val (op1))
- cost1 = no_cost;
- else
- cost1 = force_expr_to_var_cost (op1, speed);
-
break;
+ case NOP_EXPR:
+ case CONVERT_EXPR:
case NEGATE_EXPR:
op0 = TREE_OPERAND (expr, 0);
STRIP_NOPS (op0);
op1 = NULL_TREE;
-
- if (is_gimple_val (op0))
- cost0 = no_cost;
- else
- cost0 = force_expr_to_var_cost (op0, speed);
-
- cost1 = no_cost;
break;
default:
@@ -3616,6 +3622,18 @@ force_expr_to_var_cost (tree expr, bool speed)
return new_cost (target_spill_cost[speed], 0);
}
+ if (op0 == NULL_TREE
+ || (is_gimple_val (op0) && (TREE_CODE (op0) != ADDR_EXPR)))
+ cost0 = no_cost;
+ else
+ cost0 = force_expr_to_var_cost (op0, speed);
+
+ if (op1 == NULL_TREE
+ || (is_gimple_val (op1) && (TREE_CODE (op1) != ADDR_EXPR)))
+ cost1 = no_cost;
+ else
+ cost1 = force_expr_to_var_cost (op1, speed);
+
mode = TYPE_MODE (TREE_TYPE (expr));
switch (TREE_CODE (expr))
{
@@ -3639,7 +3657,18 @@ force_expr_to_var_cost (tree expr, bool speed)
speed, &sa_cost))
return sa_cost;
}
+
break;
+ case NOP_EXPR:
+ case CONVERT_EXPR:
+ {
+ tree inner_mode, outer_mode;
+ outer_mode = TREE_TYPE (expr);
+ inner_mode = TREE_TYPE (op0);
+ cost = new_cost (convert_cost (TYPE_MODE (outer_mode),
+ TYPE_MODE (inner_mode), speed), 0);
+ }
+ break;
case MULT_EXPR:
if (cst_and_fits_in_hwi (op0))
@@ -3713,7 +3742,7 @@ split_address_cost (struct ivopts_data *data,
*var_present = true;
fd_ivopts_data = data;
walk_tree (&addr, find_depends, depends_on, NULL);
- return new_cost (target_spill_cost[data->speed], 0);
+ return force_expr_to_var_cost (addr, data->speed);
}
*offset += bitpos / BITS_PER_UNIT;
Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c (revision 204117)
+++ gcc/tree-affine.c (working copy)
@@ -874,10 +874,11 @@ debug_aff (aff_tree *val)
fprintf (stderr, "\n");
}
-/* Returns address of the reference REF in ADDR. The size of the accessed
- location is stored to SIZE. */
+/* Computes address of the reference REF in ADDR. The size of the accessed
+ location is stored to SIZE. Returns pointer to the ultimate containing
+ object to which REF refers. */
-void
+tree
get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size)
{
HOST_WIDE_INT bitsize, bitpos;
@@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr,
aff_combination_add (addr, &tmp);
*size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) /
BITS_PER_UNIT);
+
+ return base_addr;
}
/* Returns true if a region of size SIZE1 at position 0 and a region of
Index: gcc/tree-affine.h
===================================================================
--- gcc/tree-affine.h (revision 204117)
+++ gcc/tree-affine.h (working copy)
@@ -74,7 +74,7 @@ bool aff_combination_constant_multiple_p (aff_tree
void aff_combination_expand (aff_tree *, struct pointer_map_t **);
void tree_to_aff_combination_expand (tree, tree, aff_tree *,
struct pointer_map_t **);
-void get_inner_reference_aff (tree, aff_tree *, double_int *);
+tree get_inner_reference_aff (tree, aff_tree *, double_int *);
void free_affine_expand_cache (struct pointer_map_t **);
bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int);