On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
<[email protected]> wrote:
> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <[email protected]> wrote:
>> Sorry, Should have replied to gcc-patches list.
>>
>> Thanks,
>> bin
>>
>> ---------- Forwarded message ----------
>> From: "Bin.Cheng" <[email protected]>
>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>> DR against its innermost loop bahavior if possible
>> To: Richard Biener <[email protected]>
>>
>> On 3/17/16, Richard Biener <[email protected]> wrote:
>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <[email protected]> wrote:
>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hmm.
>>>> Hi,
>>>> Thanks for reviewing.
>>>>>
>>>>> + equal_p = true;
>>>>> + if (e1->base_address && e2->base_address)
>>>>> + equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>>> + if (e1->offset && e2->offset)
>>>>> + equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>>
>>>>> surely better to return false early.
>>>>>
>>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>>
>>>>> @@ -615,15 +619,29 @@
>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>>> (data_reference_p a)
>>>>> data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>>> creating the DR (or adjust the equality function
>>>> and hashing
>>>>> tree ref = DR_REF (a);
>>>>> tree base_ref = DR_BASE_OBJECT (a);
>>>>> + innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>> tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>> bool exist1, exist2;
>>>>>
>>>>> - while (TREE_CODE (ref) == COMPONENT_REF
>>>>> - || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> - || TREE_CODE (ref) == REALPART_EXPR)
>>>>> - ref = TREE_OPERAND (ref, 0);
>>>>> + /* If reference in DR has innermost loop behavior and it is not
>>>>> + a compound memory reference, we store it to innermost_DR_map,
>>>>> + otherwise to ref_DR_map. */
>>>>> + if (TREE_CODE (ref) == COMPONENT_REF
>>>>> + || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> + || TREE_CODE (ref) == REALPART_EXPR
>>>>> + || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>>> + || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>>> + {
>>>>> + while (TREE_CODE (ref) == COMPONENT_REF
>>>>> + || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> + || TREE_CODE (ref) == REALPART_EXPR)
>>>>> + ref = TREE_OPERAND (ref, 0);
>>>>> +
>>>>> + master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>>> + }
>>>>> + else
>>>>> + master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>>
>>>>> we don't want an extra hashmap but replace ref_DR_map entirely. So we'd
>>>>> need to
>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>>> and REALPART) before creating the DR (or adjust the equality function
>>>>> and hashing
>>>>> to disregard them which means subtracting their offset from DR_INIT.
>>>> I am not sure if I understand correctly. But for component reference,
>>>> it is the base object that we want to record/track. For example,
>>>>
>>>> for (i = 0; i < N; i++) {
>>>> m = *data++;
>>>>
>>>> m1 = p1->x - m;
>>>> m2 = p2->x + m;
>>>>
>>>> p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>>
>>>> p1++;
>>>> p2++;
>>>> p3++;
>>>> }
>>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>>> because there are unconditional reads of the structures, though the
>>>> unconditional reads are actual of other sub-objects. Here it is the
>>>> invariant part of address that we want to track.
>>>
>>> Well, the variant parts - we want to strip invariant parts as far as we can
>>> (offsetof (x) and offsetof (y))
>>>
>>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>>> here. Because in gathering/scattering cases, the address could be not
>>>> affine at all.
>>>
>>> Sure, but that's a different issue.
>>>
>>>>>
>>>>> To adjust the references we collect you'd maybe could use a callback
>>>>> to get_references_in_stmt
>>>>> to adjust them.
>>>>>
>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>>> as
>>>> Is this a part of the method you suggested above, or is it an
>>>> alternative one? If it's the latter, then I have below questions
>>>> embedded.
>>>
>>> It is an alternative to adding a hook to get_references_in_stmt and
>>> probably "easier".
>>>
>>>>>
>>>>> Index: tree-if-conv.c
>>>>> ===================================================================
>>>>> --- tree-if-conv.c (revision 234215)
>>>>> +++ tree-if-conv.c (working copy)
>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>
>>>>> for (i = 0; refs->iterate (i, &dr); i++)
>>>>> {
>>>>> + tree *refp = &DR_REF (dr);
>>>>> + while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>> + || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>> + || TREE_CODE (*refp) == REALPART_EXPR)
>>>>> + refp = &TREE_OPERAND (*refp, 0);
>>>>> + if (refp != &DR_REF (dr))
>>>>> + {
>>>>> + tree saved_base = *refp;
>>>>> + *refp = integer_zero_node;
>>>>> +
>>>>> + if (DR_INIT (dr))
>>>>> + {
>>>>> + tree poffset;
>>>>> + int punsignedp, preversep, pvolatilep;
>>>>> + machine_mode pmode;
>>>>> + HOST_WIDE_INT pbitsize, pbitpos;
>>>>> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>> &poffset,
>>>>> + &pmode, &punsignedp, &preversep,
>>>>> &pvolatilep,
>>>>> + false);
>>>>> + gcc_assert (poffset == NULL_TREE);
>>>>> +
>>>>> + DR_INIT (dr)
>>>>> + = wide_int_to_tree (ssizetype,
>>>>> + wi::sub (DR_INIT (dr),
>>>>> + pbitpos / BITS_PER_UNIT));
>>>>> + }
>>>>> +
>>>>> + *refp = saved_base;
>>>>> + DR_REF (dr) = *refp;
>>>>> + }
>>>> Looks to me the code is trying to resolve difference between two (or
>>>> more) component references, which is DR_INIT in the code. But DR_INIT
>>>> is not the only thing needs to be handled. For a structure containing
>>>> two sub-arrays, DR_OFFSET may be different too.
>>>
>>> Yes, but we can't say that if
>>>
>>> a->a[i]
>>>
>>> doesn't trap that then
>>>
>>> a->b[i]
>>>
>>> doesn't trap either. We can only "strip" outermost
>>> non-variable-offset components.
>>>
>>> But maybe I'm missing what example you are thinking of.
>> Hmm, this was the case I meant. What I don't understand is current
>> code logic does infer trap information for a.b[i] from a.a[i]. Given
>> below example:
>> struct str
>> {
>> int a[10];
>> int b[20];
>> char c;
>> };
>>
>> void bar (struct str *);
>> int foo (int x, int n)
>> {
>> int i;
>> struct str s;
>> bar (&s);
>> for (i = 0; i < n; i++)
>> {
>> s.a[i] = s.b[i];
>> if (x > i)
>> s.b[i] = 0;
>> }
>> bar (&s);
>> return 0;
>> }
>> The loop is convertible because of below code in function
>> ifcvt_memrefs_wont_trap:
>>
>> /* If a is unconditionally accessed then ... */
>> if (DR_RW_UNCONDITIONALLY (*master_dr))
>> {
>> /* an unconditional read won't trap. */
>> if (DR_IS_READ (a))
>> return true;
>>
>> /* an unconditionaly write won't trap if the base is written
>> to unconditionally. */
>> if (base_master_dr
>> && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>> else
>> {
>> /* or the base is know to be not readonly. */
>> tree base_tree = get_base_address (DR_REF (a));
>> if (DECL_P (base_tree)
>> && decl_binds_to_current_def_p (base_tree)
>> && ! TREE_READONLY (base_tree))
>> return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>> }
>> }
>> It is the main object '&s' that is recorded in base_master_dr, so
>> s.b[i] is considered not trap. Even it's not, I suppose
>> get_base_address will give same result?
>
> Well, for this case it sees that s.b[i] is read from so it can't be an
> out-of-bound
> access. And s.a[i] is written to unconditionally so 's' cannot be a
> readonly object.
> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.
Hi Richard,
Attachment is the updated patch. I made below changes wrto your
review comments:
1) Moved hash class for innermost loop behavior from tree-data-ref.h
to tree-if-conv.c.
I also removed check on innermost_loop_behavior.aligned_to which
seems redundant to me because it's computed from offset and we have
already checked equality for offset.
2) Replaced ref_DR_map with new map innermost_DR_map.
3) Post-processed DR in if_convertible_loop_p_1 for compound reference
or references don't have innermost behavior. This cleans up following
code using DR information.
4) Added a vectorization test for PR56625.
I didn't incorporate your proposed code because I think it handles
COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.
Looks to me it is another ifcvt opportunity different from PR69489.
Anyway, fix is easy, I can send another patch in GCC7.
Bootstrap and test on x86_64/AArch64, so any comments on this version?
Thanks,
bin
2016-03-30 Bin Cheng <[email protected]>
PR tree-optimization/56625
PR tree-optimization/69489
* tree-data-ref.h (DR_INNERMOST): New macro.
* tree-if-conv.c (innermost_loop_behavior_hash): New class for
hashing struct innermost_loop_behavior.
(ref_DR_map): Remove.
(innermost_DR_map): New map.
(baseref_DR_map): Revise comment.
(hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
to innermost_DR_map accroding to its innermost loop behavior.
(ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
to its innermost loop behavior.
(if_convertible_loop_p_1): Remove intialization for ref_DR_map.
Add initialization for innermost_DR_map. Record memory reference
in DR_BASE_ADDRESS if the reference is compound one or it doesn't
have innermost loop behavior.
(if_convertible_loop_p): Remove release for ref_DR_map. Release
innermost_DR_map.
gcc/testsuite/ChangeLog
2016-03-30 Bin Cheng <[email protected]>
PR tree-optimization/56625
PR tree-optimization/69489
* gcc.dg/vect/pr56625.c: New test.
* gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index eb24d16..856dd58 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -144,6 +144,7 @@ struct data_reference
#define DR_STEP(DR) (DR)->innermost.step
#define DR_PTR_INFO(DR) (DR)->alias.ptr_info
#define DR_ALIGNED_TO(DR) (DR)->innermost.aligned_to
+#define DR_INNERMOST(DR) (DR)->innermost
typedef struct data_reference *data_reference_p;
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 9e305c7..884006c 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -114,16 +114,68 @@ along with GCC; see the file COPYING3. If not see
#include "builtins.h"
#include "params.h"
+/* Hash for struct innermost_loop_behavior. It depends on the user to
+ free the memory. */
+
+struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior>
+{
+ static inline hashval_t hash (const value_type &);
+ static inline bool equal (const value_type &,
+ const compare_type &);
+};
+
+inline hashval_t
+innermost_loop_behavior_hash::hash (const value_type &e)
+{
+ hashval_t hash;
+
+ hash = iterative_hash_expr (e->base_address, 0);
+ hash = iterative_hash_expr (e->offset, hash);
+ hash = iterative_hash_expr (e->init, hash);
+ return iterative_hash_expr (e->step, hash);
+}
+
+inline bool
+innermost_loop_behavior_hash::equal (const value_type &e1,
+ const compare_type &e2)
+{
+ if ((e1->base_address && !e2->base_address)
+ || (!e1->base_address && e2->base_address)
+ || (!e1->offset && e2->offset)
+ || (e1->offset && !e2->offset)
+ || (!e1->init && e2->init)
+ || (e1->init && !e2->init)
+ || (!e1->step && e2->step)
+ || (e1->step && !e2->step))
+ return false;
+
+ if (e1->base_address && e2->base_address
+ && !operand_equal_p (e1->base_address, e2->base_address, 0))
+ return false;
+ if (e1->offset && e2->offset
+ && !operand_equal_p (e1->offset, e2->offset, 0))
+ return false;
+ if (e1->init && e2->init
+ && !operand_equal_p (e1->init, e2->init, 0))
+ return false;
+ if (e1->step && e2->step
+ && !operand_equal_p (e1->step, e2->step, 0))
+ return false;
+
+ return true;
+}
+
/* List of basic blocks in if-conversion-suitable order. */
static basic_block *ifc_bbs;
/* Apply more aggressive (extended) if-conversion if true. */
static bool aggressive_if_conv;
-/* Hash table to store references, DR pairs. */
-static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map;
+/* Hash table to store <DR's innermost loop behavior, DR> pairs. */
+static hash_map<innermost_loop_behavior_hash,
+ data_reference_p> *innermost_DR_map;
-/* Hash table to store base reference, DR pairs. */
+/* Hash table to store <base reference, DR> pairs. */
static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map;
/* Structure used to predicate basic blocks. This is attached to the
@@ -613,17 +665,12 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info
(data_reference_p a)
{
data_reference_p *master_dr, *base_master_dr;
- tree ref = DR_REF (a);
tree base_ref = DR_BASE_OBJECT (a);
+ innermost_loop_behavior *innermost = &DR_INNERMOST (a);
tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
bool exist1, exist2;
- while (TREE_CODE (ref) == COMPONENT_REF
- || TREE_CODE (ref) == IMAGPART_EXPR
- || TREE_CODE (ref) == REALPART_EXPR)
- ref = TREE_OPERAND (ref, 0);
-
- master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+ master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
if (!exist1)
*master_dr = a;
@@ -685,21 +732,18 @@ ifcvt_memrefs_wont_trap (gimple *stmt,
vec<data_reference_p> drs)
data_reference_p *master_dr, *base_master_dr;
data_reference_p a = drs[gimple_uid (stmt) - 1];
- tree ref_base_a = DR_REF (a);
tree base = DR_BASE_OBJECT (a);
+ innermost_loop_behavior *innermost = &DR_INNERMOST (a);
gcc_assert (DR_STMT (a) == stmt);
+ gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+ || DR_INIT (a) || DR_STEP (a));
- while (TREE_CODE (ref_base_a) == COMPONENT_REF
- || TREE_CODE (ref_base_a) == IMAGPART_EXPR
- || TREE_CODE (ref_base_a) == REALPART_EXPR)
- ref_base_a = TREE_OPERAND (ref_base_a, 0);
+ master_dr = innermost_DR_map->get (innermost);
+ gcc_assert (master_dr != NULL);
- master_dr = ref_DR_map->get (ref_base_a);
base_master_dr = baseref_DR_map->get (base);
- gcc_assert (master_dr != NULL);
-
/* If a is unconditionally written to it doesn't trap. */
if (DR_W_UNCONDITIONALLY (*master_dr))
return true;
@@ -1228,13 +1272,16 @@ if_convertible_loop_p_1 (struct loop *loop,
data_reference_p dr;
- ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
+ innermost_DR_map
+ = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
predicate_bbs (loop);
for (i = 0; refs->iterate (i, &dr); i++)
{
+ tree ref = DR_REF (dr);
+
dr->aux = XNEW (struct ifc_dr);
DR_BASE_W_UNCONDITIONALLY (dr) = false;
DR_RW_UNCONDITIONALLY (dr) = false;
@@ -1244,6 +1291,27 @@ if_convertible_loop_p_1 (struct loop *loop,
IFC_DR (dr)->base_w_predicate = boolean_false_node;
if (gimple_uid (DR_STMT (dr)) == 0)
gimple_set_uid (DR_STMT (dr), i + 1);
+
+ /* If DR doesn't have innermost loop behavior or it's a compound
+ memory reference, we synthesize its innermost loop behavior
+ for hashing. */
+ if (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR
+ || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr)
+ || DR_INIT (dr) || DR_STEP (dr)))
+ {
+ while (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR)
+ ref = TREE_OPERAND (ref, 0);
+
+ DR_BASE_ADDRESS (dr) = ref;
+ DR_OFFSET (dr) = NULL;
+ DR_INIT (dr) = NULL;
+ DR_STEP (dr) = NULL;
+ DR_ALIGNED_TO (dr) = NULL;
+ }
hash_memrefs_baserefs_and_store_DRs_read_written_info (dr);
}
@@ -1338,8 +1406,8 @@ if_convertible_loop_p (struct loop *loop, bool
*any_mask_load_store)
free_data_refs (refs);
- delete ref_DR_map;
- ref_DR_map = NULL;
+ delete innermost_DR_map;
+ innermost_DR_map = NULL;
delete baseref_DR_map;
baseref_DR_map = NULL;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
new file mode 100644
index 0000000..02a6731
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-*
} } */
+
+void foo (int a[], int b[])
+{
+ int i;
+ for (i = 0; i < 100; i++)
+ {
+ if (a[i] == 0)
+ a[i] = b[i]*4;
+ else
+ a[i] = b[i]*3;
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr56625.c
b/gcc/testsuite/gcc.dg/vect/pr56625.c
new file mode 100644
index 0000000..b903be3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr56625.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+void foo (int a[], int b[])
+{
+ int i;
+ for (i = 0; i < 100; i++)
+ {
+ if (a[i] == 0)
+ a[i] = b[i]*4;
+ else
+ a[i] = b[i]*3;
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */