Re: Handle data dependence relations with different bases

2017-08-04 Thread Richard Biener
On Fri, Aug 4, 2017 at 11:28 AM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
>>  wrote:
>>> Richard Sandiford  writes:
 Eric Botcazou  writes:
> [Sorry for missing the previous messages]
>
>> Thanks.  Just been retesting, and I think I must have forgotten
>> to include Ada last time.  It turns out that the patch causes a dg-scan
>> regression in gnat.dg/vect17.adb, because we now think that if the
>> array RECORD_TYPEs *do* alias in:
>>
>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>begin
>>   for I in Sarray'Range loop
>>  R(I) := X(I) + Y(I);
>>   end loop;
>>end;
>>
>> then the dependence distance must be zero.  Eric, does that hold true
>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>
> Yes, I'd think so (even without the artificial RECORD_TYPE around
>> the arrays).

 Good!

>> 2017-06-07  Richard Sandiford  
>>
>> gcc/testsuite/
>> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>> when X = R or Y = R.
>
> I think that you need to modify vect15 and vect16 the same way.

 Ah, yeah.  And doing that shows that I'd not handled safelen for
 DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.

 How does this look?  Tested on x86_64-linux-gnu both without the
 vectoriser changes and with the fixed vectoriser patch.
>>>
>>> Here's a version of the patch that handles safelen.  I split the
>>> handling out into a new function (vect_analyze_possibly_independent_ddr)
>>> since it was getting too big to do inline.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Ok.
>
> Thanks!
>
>> Did you check whether BB vectorization is affected?  See
>> vect_slp_analyze_instance_dependence
>> and friends.  It's quite conservative but given the prefetching change
>> I wonder if we need
>> to rule out DDR_COULD_BE_INDEPENDENT_P?
>
> I think it should be OK.  When DDR_COULD_BE_INDEPENDENT_P is set,
> we've effectively changed from DDR_ARE_DEPENDENT == chrec_dont_know
> to a conservatively-correct distance vector.  It looks like
> vect_slp_analyze_data_ref_dependence handles both cases in the
> same way (by returning true).

Yes.  Could be improved of course.

Thanks for double-checking.
Richard.

> Thanks,
> Richard
>
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Richard
>>>
>>>
>>> 2017-07-27  Richard Sandiford  
>>>
>>> gcc/
>>> * tree-data-ref.h (subscript): Add access_fn field.
>>> (data_dependence_relation): Add could_be_independent_p.
>>> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>>> (same_access_functions): Move to tree-data-ref.c.
>>> * tree-data-ref.c (ref_contains_union_access_p): New function.
>>> (access_fn_component_p): Likewise.
>>> (access_fn_components_comparable_p): Likewise.
>>> (dr_analyze_indices): Add a reference to access_fn_component_p.
>>> (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>>> DR_ACCESS_FN.
>>> (constant_access_functions): Likewise.
>>> (add_other_self_distances): Likewise.
>>> (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>>> (initialize_data_dependence_relation): Use XCNEW and remove
>>> explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>>> of access functions that have the same type.  Allow the
>>> subsequence to end with different bases in some circumstances.
>>> Record the chosen access functions in SUB_ACCESS_FN.
>>> (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>>> a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>>> (subscript_dependence_tester_1): Likewise dra and drb.
>>> (build_classic_dist_vector): Update calls accordingly.
>>> (subscript_dependence_tester): Likewise.
>>> * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>>> DDR_COULD_BE_INDEPENDENT_P.
>>> * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>>> comp_alias_ddrs instead of may_alias_ddrs.
>>> * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>>> New function.
>>> (vect_analyze_data_ref_dependence): Use it if
>>> DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>>> distance vectors if that fails.
>>> (dependence_distance_ge_vf): New function.
>>> 

Re: Handle data dependence relations with different bases

2017-08-04 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
>  wrote:
>> Richard Sandiford  writes:
>>> Eric Botcazou  writes:
 [Sorry for missing the previous messages]

> Thanks.  Just been retesting, and I think I must have forgotten
> to include Ada last time.  It turns out that the patch causes a dg-scan
> regression in gnat.dg/vect17.adb, because we now think that if the
> array RECORD_TYPEs *do* alias in:
>
>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>begin
>   for I in Sarray'Range loop
>  R(I) := X(I) + Y(I);
>   end loop;
>end;
>
> then the dependence distance must be zero.  Eric, does that hold true
> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?

 Yes, I'd think so (even without the artificial RECORD_TYPE around
> the arrays).
>>>
>>> Good!
>>>
> 2017-06-07  Richard Sandiford  
>
> gcc/testsuite/
> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
> when X = R or Y = R.

 I think that you need to modify vect15 and vect16 the same way.
>>>
>>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>>
>>> How does this look?  Tested on x86_64-linux-gnu both without the
>>> vectoriser changes and with the fixed vectoriser patch.
>>
>> Here's a version of the patch that handles safelen.  I split the
>> handling out into a new function (vect_analyze_possibly_independent_ddr)
>> since it was getting too big to do inline.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Ok.

Thanks!

> Did you check whether BB vectorization is affected?  See
> vect_slp_analyze_instance_dependence
> and friends.  It's quite conservative but given the prefetching change
> I wonder if we need
> to rule out DDR_COULD_BE_INDEPENDENT_P?

I think it should be OK.  When DDR_COULD_BE_INDEPENDENT_P is set,
we've effectively changed from DDR_ARE_DEPENDENT == chrec_dont_know
to a conservatively-correct distance vector.  It looks like
vect_slp_analyze_data_ref_dependence handles both cases in the
same way (by returning true).

Thanks,
Richard

>
> Thanks,
> Richard.
>
>> Thanks,
>> Richard
>>
>>
>> 2017-07-27  Richard Sandiford  
>>
>> gcc/
>> * tree-data-ref.h (subscript): Add access_fn field.
>> (data_dependence_relation): Add could_be_independent_p.
>> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>> (same_access_functions): Move to tree-data-ref.c.
>> * tree-data-ref.c (ref_contains_union_access_p): New function.
>> (access_fn_component_p): Likewise.
>> (access_fn_components_comparable_p): Likewise.
>> (dr_analyze_indices): Add a reference to access_fn_component_p.
>> (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>> DR_ACCESS_FN.
>> (constant_access_functions): Likewise.
>> (add_other_self_distances): Likewise.
>> (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>> (initialize_data_dependence_relation): Use XCNEW and remove
>> explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>> of access functions that have the same type.  Allow the
>> subsequence to end with different bases in some circumstances.
>> Record the chosen access functions in SUB_ACCESS_FN.
>> (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>> a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>> (subscript_dependence_tester_1): Likewise dra and drb.
>> (build_classic_dist_vector): Update calls accordingly.
>> (subscript_dependence_tester): Likewise.
>> * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>> DDR_COULD_BE_INDEPENDENT_P.
>> * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>> comp_alias_ddrs instead of may_alias_ddrs.
>> * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>> New function.
>> (vect_analyze_data_ref_dependence): Use it if
>> DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>> distance vectors if that fails.
>> (dependence_distance_ge_vf): New function.
>> (vect_prune_runtime_alias_test_list): Use it.  Don't clear
>> LOOP_VINFO_MAY_ALIAS_DDRS.
>>
>> gcc/testsuite/
>> * gcc.dg/vect/vect-alias-check-3.c: New test.
>> * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>> * gcc.dg/vect/vect-alias-check-5.c: 

Re: Handle data dependence relations with different bases

2017-08-04 Thread Richard Biener
On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
 wrote:
> Richard Sandiford  writes:
>> Eric Botcazou  writes:
>>> [Sorry for missing the previous messages]
>>>
 Thanks.  Just been retesting, and I think I must have forgotten
 to include Ada last time.  It turns out that the patch causes a dg-scan
 regression in gnat.dg/vect17.adb, because we now think that if the
 array RECORD_TYPEs *do* alias in:

procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
begin
   for I in Sarray'Range loop
  R(I) := X(I) + Y(I);
   end loop;
end;

 then the dependence distance must be zero.  Eric, does that hold true
 for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
 X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>
>>> Yes, I'd think so (even without the artificial RECORD_TYPE around the 
>>> arrays).
>>
>> Good!
>>
 2017-06-07  Richard Sandiford  

 gcc/testsuite/
 * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
 * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
 when X = R or Y = R.
>>>
>>> I think that you need to modify vect15 and vect16 the same way.
>>
>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>
>> How does this look?  Tested on x86_64-linux-gnu both without the
>> vectoriser changes and with the fixed vectoriser patch.
>
> Here's a version of the patch that handles safelen.  I split the
> handling out into a new function (vect_analyze_possibly_independent_ddr)
> since it was getting too big to do inline.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok.

Did you check whether BB vectorization is affected?  See
vect_slp_analyze_instance_dependence
and friends.  It's quite conservative but given the prefetching change
I wonder if we need
to rule out DDR_COULD_BE_INDEPENDENT_P?

Thanks,
Richard.

> Thanks,
> Richard
>
>
> 2017-07-27  Richard Sandiford  
>
> gcc/
> * tree-data-ref.h (subscript): Add access_fn field.
> (data_dependence_relation): Add could_be_independent_p.
> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
> (same_access_functions): Move to tree-data-ref.c.
> * tree-data-ref.c (ref_contains_union_access_p): New function.
> (access_fn_component_p): Likewise.
> (access_fn_components_comparable_p): Likewise.
> (dr_analyze_indices): Add a reference to access_fn_component_p.
> (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
> DR_ACCESS_FN.
> (constant_access_functions): Likewise.
> (add_other_self_distances): Likewise.
> (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
> (initialize_data_dependence_relation): Use XCNEW and remove
> explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
> of access functions that have the same type.  Allow the
> subsequence to end with different bases in some circumstances.
> Record the chosen access functions in SUB_ACCESS_FN.
> (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
> a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
> (subscript_dependence_tester_1): Likewise dra and drb.
> (build_classic_dist_vector): Update calls accordingly.
> (subscript_dependence_tester): Likewise.
> * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
> DDR_COULD_BE_INDEPENDENT_P.
> * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
> comp_alias_ddrs instead of may_alias_ddrs.
> * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
> New function.
> (vect_analyze_data_ref_dependence): Use it if
> DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
> distance vectors if that fails.
> (dependence_distance_ge_vf): New function.
> (vect_prune_runtime_alias_test_list): Use it.  Don't clear
> LOOP_VINFO_MAY_ALIAS_DDRS.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-alias-check-3.c: New test.
> * gcc.dg/vect/vect-alias-check-4.c: Likewise.
> * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>
> Index: gcc/tree-data-ref.h
> ===
> --- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
> @@ -260,6 +260,9 @@ struct conflict_function
>
>  struct subscript
>  {
> +  /* The access functions of the two references.  */
> +  tree access_fn[2];
> +
>/* A description of the iterations for which the elements are
>   accessed twice.  

Re: Handle data dependence relations with different bases

2017-08-03 Thread Richard Sandiford
Ping

Richard Sandiford  writes:
> Richard Sandiford  writes:
>> Eric Botcazou  writes:
>>> [Sorry for missing the previous messages]
>>>
 Thanks.  Just been retesting, and I think I must have forgotten
 to include Ada last time.  It turns out that the patch causes a dg-scan
 regression in gnat.dg/vect17.adb, because we now think that if the
 array RECORD_TYPEs *do* alias in:
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
begin
   for I in Sarray'Range loop
  R(I) := X(I) + Y(I);
   end loop;
end;
 
 then the dependence distance must be zero.  Eric, does that hold true
 for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
 X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>
>>> Yes, I'd think so (even without the artificial RECORD_TYPE around the 
>>> arrays).
>>
>> Good!
>>
 2017-06-07  Richard Sandiford  
 
 gcc/testsuite/
* gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
* gnat.dg/vect17.adb (Add): Create a dependence distance of 1
when X = R or Y = R.
>>>
>>> I think that you need to modify vect15 and vect16 the same way.
>>
>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>
>> How does this look?  Tested on x86_64-linux-gnu both without the
>> vectoriser changes and with the fixed vectoriser patch.
>
> Here's a version of the patch that handles safelen.  I split the
> handling out into a new function (vect_analyze_possibly_independent_ddr)
> since it was getting too big to do inline.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> 2017-07-27  Richard Sandiford  
>
> gcc/
>   * tree-data-ref.h (subscript): Add access_fn field.
>   (data_dependence_relation): Add could_be_independent_p.
>   (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>   (same_access_functions): Move to tree-data-ref.c.
>   * tree-data-ref.c (ref_contains_union_access_p): New function.
>   (access_fn_component_p): Likewise.
>   (access_fn_components_comparable_p): Likewise.
>   (dr_analyze_indices): Add a reference to access_fn_component_p.
>   (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>   DR_ACCESS_FN.
>   (constant_access_functions): Likewise.
>   (add_other_self_distances): Likewise.
>   (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>   (initialize_data_dependence_relation): Use XCNEW and remove
>   explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>   of access functions that have the same type.  Allow the
>   subsequence to end with different bases in some circumstances.
>   Record the chosen access functions in SUB_ACCESS_FN.
>   (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>   a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>   (subscript_dependence_tester_1): Likewise dra and drb.
>   (build_classic_dist_vector): Update calls accordingly.
>   (subscript_dependence_tester): Likewise.
>   * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>   DDR_COULD_BE_INDEPENDENT_P.
>   * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>   comp_alias_ddrs instead of may_alias_ddrs.
>   * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>   New function.
>   (vect_analyze_data_ref_dependence): Use it if
>   DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>   distance vectors if that fails.
>   (dependence_distance_ge_vf): New function.
>   (vect_prune_runtime_alias_test_list): Use it.  Don't clear
>   LOOP_VINFO_MAY_ALIAS_DDRS.
>
> gcc/testsuite/
>   * gcc.dg/vect/vect-alias-check-3.c: New test.
>   * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>   * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>
> Index: gcc/tree-data-ref.h
> ===
> --- gcc/tree-data-ref.h   2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.h   2017-07-27 13:10:33.023912613 +0100
> @@ -260,6 +260,9 @@ struct conflict_function
>  
>  struct subscript
>  {
> +  /* The access functions of the two references.  */
> +  tree access_fn[2];
> +
>/* A description of the iterations for which the elements are
>   accessed twice.  */
>conflict_function *conflicting_iterations_in_a;
> @@ -278,6 +281,7 @@ struct subscript
>  
>  typedef struct subscript *subscript_p;
>  
> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>  #define SUB_CONFLICTS_IN_B(SUB) 

Re: Handle data dependence relations with different bases

2017-07-27 Thread Richard Sandiford
Richard Sandiford  writes:
> Eric Botcazou  writes:
>> [Sorry for missing the previous messages]
>>
>>> Thanks.  Just been retesting, and I think I must have forgotten
>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>> array RECORD_TYPEs *do* alias in:
>>> 
>>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>begin
>>>   for I in Sarray'Range loop
>>>  R(I) := X(I) + Y(I);
>>>   end loop;
>>>end;
>>> 
>>> then the dependence distance must be zero.  Eric, does that hold true
>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>
>> Yes, I'd think so (even without the artificial RECORD_TYPE around the 
>> arrays).
>
> Good!
>
>>> 2017-06-07  Richard Sandiford  
>>> 
>>> gcc/testsuite/
>>> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>> when X = R or Y = R.
>>
>> I think that you need to modify vect15 and vect16 the same way.
>
> Ah, yeah.  And doing that shows that I'd not handled safelen for
> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>
> How does this look?  Tested on x86_64-linux-gnu both without the
> vectoriser changes and with the fixed vectoriser patch.

Here's a version of the patch that handles safelen.  I split the
handling out into a new function (vect_analyze_possibly_independent_ddr)
since it was getting too big to do inline.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


2017-07-27  Richard Sandiford  

gcc/
* tree-data-ref.h (subscript): Add access_fn field.
(data_dependence_relation): Add could_be_independent_p.
(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
(same_access_functions): Move to tree-data-ref.c.
* tree-data-ref.c (ref_contains_union_access_p): New function.
(access_fn_component_p): Likewise.
(access_fn_components_comparable_p): Likewise.
(dr_analyze_indices): Add a reference to access_fn_component_p.
(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
DR_ACCESS_FN.
(constant_access_functions): Likewise.
(add_other_self_distances): Likewise.
(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
(initialize_data_dependence_relation): Use XCNEW and remove
explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
of access functions that have the same type.  Allow the
subsequence to end with different bases in some circumstances.
Record the chosen access functions in SUB_ACCESS_FN.
(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
(subscript_dependence_tester_1): Likewise dra and drb.
(build_classic_dist_vector): Update calls accordingly.
(subscript_dependence_tester): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
DDR_COULD_BE_INDEPENDENT_P.
* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
comp_alias_ddrs instead of may_alias_ddrs.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
New function.
(vect_analyze_data_ref_dependence): Use it if
DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
distance vectors if that fails.
(dependence_distance_ge_vf): New function.
(vect_prune_runtime_alias_test_list): Use it.  Don't clear
LOOP_VINFO_MAY_ALIAS_DDRS.

gcc/testsuite/
* gcc.dg/vect/vect-alias-check-3.c: New test.
* gcc.dg/vect/vect-alias-check-4.c: Likewise.
* gcc.dg/vect/vect-alias-check-5.c: Likewise.

Index: gcc/tree-data-ref.h
===
--- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
@@ -260,6 +260,9 @@ struct conflict_function
 
 struct subscript
 {
+  /* The access functions of the two references.  */
+  tree access_fn[2];
+
   /* A description of the iterations for which the elements are
  accessed twice.  */
   conflict_function *conflicting_iterations_in_a;
@@ -278,6 +281,7 @@ struct subscript
 
 typedef struct subscript *subscript_p;
 
+#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
 #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
 #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
 #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
@@ -333,6 +337,33 @@ struct data_dependence_relation
   /* Set to true when the dependence relation is on the same data
  

Re: Handle data dependence relations with different bases

2017-07-07 Thread Eric Botcazou
> Ah, yeah.  And doing that shows that I'd not handled safelen for
> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
> 
> How does this look?  Tested on x86_64-linux-gnu both without the
> vectoriser changes and with the fixed vectoriser patch.
> 
> Thanks,
> Richard
> 
> 
> 2017-07-07  Richard Sandiford  
> 
> gcc/testsuite/
>   * gnat.dg/vect15.ads (Sarray): Increase range to 1 .. 5.
>   * gnat.dg/vect16.ads (Sarray): Likewise.
>   * gnat.dg/vect17.ads (Sarray): Likewise.
>   * gnat.dg/vect15.adb (Add): Create a dependence distance of 1.
>   * gnat.dg/vect16.adb (Add): Likewise.
>   * gnat.dg/vect17.adb (Add): Likewise.

OK, thanks.

-- 
Eric Botcazou


Re: Handle data dependence relations with different bases

2017-07-07 Thread Richard Sandiford
Eric Botcazou  writes:
> [Sorry for missing the previous messages]
>
>> Thanks.  Just been retesting, and I think I must have forgotten
>> to include Ada last time.  It turns out that the patch causes a dg-scan
>> regression in gnat.dg/vect17.adb, because we now think that if the
>> array RECORD_TYPEs *do* alias in:
>> 
>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>begin
>>   for I in Sarray'Range loop
>>  R(I) := X(I) + Y(I);
>>   end loop;
>>end;
>> 
>> then the dependence distance must be zero.  Eric, does that hold true
>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>
> Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).

Good!

>> 2017-06-07  Richard Sandiford  
>> 
>> gcc/testsuite/
>>  * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>  * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>  when X = R or Y = R.
>
> I think that you need to modify vect15 and vect16 the same way.

Ah, yeah.  And doing that shows that I'd not handled safelen for
DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.

How does this look?  Tested on x86_64-linux-gnu both without the
vectoriser changes and with the fixed vectoriser patch.

Thanks,
Richard


2017-07-07  Richard Sandiford  

gcc/testsuite/
* gnat.dg/vect15.ads (Sarray): Increase range to 1 .. 5.
* gnat.dg/vect16.ads (Sarray): Likewise.
* gnat.dg/vect17.ads (Sarray): Likewise.
* gnat.dg/vect15.adb (Add): Create a dependence distance of 1.
* gnat.dg/vect16.adb (Add): Likewise.
* gnat.dg/vect17.adb (Add): Likewise.

Index: gcc/testsuite/gnat.dg/vect15.ads
===
--- gcc/testsuite/gnat.dg/vect15.ads2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect15.ads2017-07-07 13:12:51.509540701 +0100
@@ -1,6 +1,6 @@
 package Vect15 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add (X, Y : Sarray; R : out Sarray);
Index: gcc/testsuite/gnat.dg/vect16.ads
===
--- gcc/testsuite/gnat.dg/vect16.ads2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect16.ads2017-07-07 13:12:51.511540636 +0100
@@ -1,6 +1,6 @@
 package Vect16 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add_Sub (X, Y : Sarray; R,S : out Sarray);
Index: gcc/testsuite/gnat.dg/vect17.ads
===
--- gcc/testsuite/gnat.dg/vect17.ads2017-06-07 22:13:29.692531472 +0100
+++ gcc/testsuite/gnat.dg/vect17.ads2017-07-07 13:12:51.514540538 +0100
@@ -1,6 +1,6 @@
 package Vect17 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
Index: gcc/testsuite/gnat.dg/vect15.adb
===
--- gcc/testsuite/gnat.dg/vect15.adb2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect15.adb2017-07-07 13:12:51.509540701 +0100
@@ -5,8 +5,9 @@ package body Vect15 is
 
procedure Add (X, Y : Sarray; R : out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
+  R(1) := X(5) + Y(5);
+  for I in 1 .. 4 loop
+ R(I + 1) := X(I) + Y(I);
   end loop;
end;
 
Index: gcc/testsuite/gnat.dg/vect16.adb
===
--- gcc/testsuite/gnat.dg/vect16.adb2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect16.adb2017-07-07 13:12:51.510540669 +0100
@@ -5,9 +5,11 @@ package body Vect16 is
 
procedure Add_Sub (X, Y : Sarray; R,S : out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
- S(I) := X(I) - Y(I);
+  R(1) := X(5) + Y(5);
+  S(1) := X(5) - Y(5);
+  for I in 1 .. 4 loop
+ R(I + 1) := X(I) + Y(I);
+ S(I + 1) := X(I) - Y(I);
   end loop;
end;
 
Index: gcc/testsuite/gnat.dg/vect17.adb
===
--- gcc/testsuite/gnat.dg/vect17.adb2017-06-07 22:13:29.692531472 +0100
+++ gcc/testsuite/gnat.dg/vect17.adb2017-07-07 13:12:51.512540603 +0100
@@ -5,8 +5,9 @@ package body Vect17 is
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
+  R(1) := X(5) + Y(5);
+  for I in 1 .. 4 loop
+ R(I + 

Re: Handle data dependence relations with different bases

2017-07-04 Thread Eric Botcazou
[Sorry for missing the previous messages]

> Thanks.  Just been retesting, and I think I must have forgotten
> to include Ada last time.  It turns out that the patch causes a dg-scan
> regression in gnat.dg/vect17.adb, because we now think that if the
> array RECORD_TYPEs *do* alias in:
> 
>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>begin
>   for I in Sarray'Range loop
>  R(I) := X(I) + Y(I);
>   end loop;
>end;
> 
> then the dependence distance must be zero.  Eric, does that hold true
> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?

Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).

> 2017-06-07  Richard Sandiford  
> 
> gcc/testsuite/
>   * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>   * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>   when X = R or Y = R.

I think that you need to modify vect15 and vect16 the same way.

-- 
Eric Botcazou


[PING*3, Ada] Re: Handle data dependence relations with different bases

2017-06-29 Thread Richard Sandiford
Ping*3

Richard Sandiford  writes:
> Ping*2
>
> Richard Sandiford  writes:
>> Ping for this Ada patch/question.
>>
>> Richard Sandiford  writes:
>>> Richard Biener  writes:
>> How does this look?  Changes since v1:
>>
>> - Added access_fn_component_p to check for valid access function
>> components.
>>
>> - Added access_fn_components_comparable_p instead of using
>>   types_compatibloe_p directly.
>>
>> - Added more commentary.
>>
>> - Added local structures to represent the sequence, so that it's
>>   more obvious which variables are temporaries and which aren't.
>>
>> - Added the test above to vect-alias-check-3.c.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.

 This is ok.
>>>
>>> Thanks.  Just been retesting, and I think I must have forgotten
>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>> array RECORD_TYPEs *do* alias in:
>>>
>>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>begin
>>>   for I in Sarray'Range loop
>>>  R(I) := X(I) + Y(I);
>>>   end loop;
>>>end;
>>>
>>> then the dependence distance must be zero.  Eric, does that hold true
>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?  Or are
>>> the arrays allowed to overlap by an arbitrary number of indices?
>>>
>>> If the assumption is correct, is the patch below OK?
>>>
>>> Thanks,
>>> Richard
>>>
>>>
>>> 2017-06-07  Richard Sandiford  
>>>
>>> gcc/testsuite/
>>> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>> when X = R or Y = R.
>>>
>>> Index: gcc/testsuite/gnat.dg/vect17.ads
>>> ===
>>> --- gcc/testsuite/gnat.dg/vect17.ads2015-10-14 14:58:56.0 
>>> +0100
>>> +++ gcc/testsuite/gnat.dg/vect17.ads2017-06-07 22:10:24.796368118 
>>> +0100
>>> @@ -1,6 +1,6 @@
>>>  package Vect17 is
>>>  
>>> -   type Sarray is array (1 .. 4) of Long_Float;
>>> +   type Sarray is array (1 .. 5) of Long_Float;
>>> for Sarray'Alignment use 16;
>>>  
>>> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
>>> Index: gcc/testsuite/gnat.dg/vect17.adb
>>> ===
>>> --- gcc/testsuite/gnat.dg/vect17.adb2015-10-14 14:58:56.0 
>>> +0100
>>> +++ gcc/testsuite/gnat.dg/vect17.adb2017-06-07 22:10:24.796368118 
>>> +0100
>>> @@ -5,8 +5,9 @@ package body Vect17 is
>>>  
>>> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>> begin
>>> -  for I in Sarray'Range loop
>>> - R(I) := X(I) + Y(I);
>>> +  R(1) := X(5) + Y(5);
>>> +  for I in 1 .. 4 loop
>>> + R(I + 1) := X(I) + Y(I);
>>>end loop;
>>> end;
>>>  


[PING*2, Ada] Re: Handle data dependence relations with different bases

2017-06-22 Thread Richard Sandiford
Ping*2

Richard Sandiford  writes:
> Ping for this Ada patch/question.
>
> Richard Sandiford  writes:
>> Richard Biener  writes:
> How does this look?  Changes since v1:
>
> - Added access_fn_component_p to check for valid access function
> components.
>
> - Added access_fn_components_comparable_p instead of using
>   types_compatibloe_p directly.
>
> - Added more commentary.
>
> - Added local structures to represent the sequence, so that it's
>   more obvious which variables are temporaries and which aren't.
>
> - Added the test above to vect-alias-check-3.c.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.
>>>
>>> This is ok.
>>
>> Thanks.  Just been retesting, and I think I must have forgotten
>> to include Ada last time.  It turns out that the patch causes a dg-scan
>> regression in gnat.dg/vect17.adb, because we now think that if the
>> array RECORD_TYPEs *do* alias in:
>>
>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>begin
>>   for I in Sarray'Range loop
>>  R(I) := X(I) + Y(I);
>>   end loop;
>>end;
>>
>> then the dependence distance must be zero.  Eric, does that hold true
>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?  Or are
>> the arrays allowed to overlap by an arbitrary number of indices?
>>
>> If the assumption is correct, is the patch below OK?
>>
>> Thanks,
>> Richard
>>
>>
>> 2017-06-07  Richard Sandiford  
>>
>> gcc/testsuite/
>>  * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>  * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>  when X = R or Y = R.
>>
>> Index: gcc/testsuite/gnat.dg/vect17.ads
>> ===
>> --- gcc/testsuite/gnat.dg/vect17.ads 2015-10-14 14:58:56.0 +0100
>> +++ gcc/testsuite/gnat.dg/vect17.ads 2017-06-07 22:10:24.796368118 +0100
>> @@ -1,6 +1,6 @@
>>  package Vect17 is
>>  
>> -   type Sarray is array (1 .. 4) of Long_Float;
>> +   type Sarray is array (1 .. 5) of Long_Float;
>> for Sarray'Alignment use 16;
>>  
>> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
>> Index: gcc/testsuite/gnat.dg/vect17.adb
>> ===
>> --- gcc/testsuite/gnat.dg/vect17.adb 2015-10-14 14:58:56.0 +0100
>> +++ gcc/testsuite/gnat.dg/vect17.adb 2017-06-07 22:10:24.796368118 +0100
>> @@ -5,8 +5,9 @@ package body Vect17 is
>>  
>> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>> begin
>> -  for I in Sarray'Range loop
>> - R(I) := X(I) + Y(I);
>> +  R(1) := X(5) + Y(5);
>> +  for I in 1 .. 4 loop
>> + R(I + 1) := X(I) + Y(I);
>>end loop;
>> end;
>>  


[PING, Ada] Re: Handle data dependence relations with different bases

2017-06-15 Thread Richard Sandiford
Ping for this Ada patch/question.

Richard Sandiford  writes:
> Richard Biener  writes:
 How does this look?  Changes since v1:

 - Added access_fn_component_p to check for valid access function 
 components.

 - Added access_fn_components_comparable_p instead of using
   types_compatibloe_p directly.

 - Added more commentary.

 - Added local structures to represent the sequence, so that it's
   more obvious which variables are temporaries and which aren't.

 - Added the test above to vect-alias-check-3.c.

 Tested on aarch64-linux-gnu and x86_64-linux-gnu.
>>
>> This is ok.
>
> Thanks.  Just been retesting, and I think I must have forgotten
> to include Ada last time.  It turns out that the patch causes a dg-scan
> regression in gnat.dg/vect17.adb, because we now think that if the
> array RECORD_TYPEs *do* alias in:
>
>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>begin
>   for I in Sarray'Range loop
>  R(I) := X(I) + Y(I);
>   end loop;
>end;
>
> then the dependence distance must be zero.  Eric, does that hold true
> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?  Or are
> the arrays allowed to overlap by an arbitrary number of indices?
>
> If the assumption is correct, is the patch below OK?
>
> Thanks,
> Richard
>
>
> 2017-06-07  Richard Sandiford  
>
> gcc/testsuite/
>   * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>   * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>   when X = R or Y = R.
>
> Index: gcc/testsuite/gnat.dg/vect17.ads
> ===
> --- gcc/testsuite/gnat.dg/vect17.ads  2015-10-14 14:58:56.0 +0100
> +++ gcc/testsuite/gnat.dg/vect17.ads  2017-06-07 22:10:24.796368118 +0100
> @@ -1,6 +1,6 @@
>  package Vect17 is
>  
> -   type Sarray is array (1 .. 4) of Long_Float;
> +   type Sarray is array (1 .. 5) of Long_Float;
> for Sarray'Alignment use 16;
>  
> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
> Index: gcc/testsuite/gnat.dg/vect17.adb
> ===
> --- gcc/testsuite/gnat.dg/vect17.adb  2015-10-14 14:58:56.0 +0100
> +++ gcc/testsuite/gnat.dg/vect17.adb  2017-06-07 22:10:24.796368118 +0100
> @@ -5,8 +5,9 @@ package body Vect17 is
>  
> procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
> begin
> -  for I in Sarray'Range loop
> - R(I) := X(I) + Y(I);
> +  R(1) := X(5) + Y(5);
> +  for I in 1 .. 4 loop
> + R(I + 1) := X(I) + Y(I);
>end loop;
> end;
>  


Re: Handle data dependence relations with different bases

2017-06-07 Thread Richard Sandiford
Richard Biener  writes:
>>> How does this look?  Changes since v1:
>>>
>>> - Added access_fn_component_p to check for valid access function components.
>>>
>>> - Added access_fn_components_comparable_p instead of using
>>>   types_compatibloe_p directly.
>>>
>>> - Added more commentary.
>>>
>>> - Added local structures to represent the sequence, so that it's
>>>   more obvious which variables are temporaries and which aren't.
>>>
>>> - Added the test above to vect-alias-check-3.c.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.
>
> This is ok.

Thanks.  Just been retesting, and I think I must have forgotten
to include Ada last time.  It turns out that the patch causes a dg-scan
regression in gnat.dg/vect17.adb, because we now think that if the
array RECORD_TYPEs *do* alias in:

   procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
   begin
  for I in Sarray'Range loop
 R(I) := X(I) + Y(I);
  end loop;
   end;

then the dependence distance must be zero.  Eric, does that hold true
for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?  Or are
the arrays allowed to overlap by an arbitrary number of indices?

If the assumption is correct, is the patch below OK?

Thanks,
Richard


2017-06-07  Richard Sandiford  

gcc/testsuite/
* gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
* gnat.dg/vect17.adb (Add): Create a dependence distance of 1
when X = R or Y = R.

Index: gcc/testsuite/gnat.dg/vect17.ads
===
--- gcc/testsuite/gnat.dg/vect17.ads2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect17.ads2017-06-07 22:10:24.796368118 +0100
@@ -1,6 +1,6 @@
 package Vect17 is
 
-   type Sarray is array (1 .. 4) of Long_Float;
+   type Sarray is array (1 .. 5) of Long_Float;
for Sarray'Alignment use 16;
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray);
Index: gcc/testsuite/gnat.dg/vect17.adb
===
--- gcc/testsuite/gnat.dg/vect17.adb2015-10-14 14:58:56.0 +0100
+++ gcc/testsuite/gnat.dg/vect17.adb2017-06-07 22:10:24.796368118 +0100
@@ -5,8 +5,9 @@ package body Vect17 is
 
procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
begin
-  for I in Sarray'Range loop
- R(I) := X(I) + Y(I);
+  R(1) := X(5) + Y(5);
+  for I in 1 .. 4 loop
+ R(I + 1) := X(I) + Y(I);
   end loop;
end;
 


Re: Handle data dependence relations with different bases

2017-06-07 Thread Richard Biener
On Wed, May 31, 2017 at 8:56 AM, Richard Sandiford
 wrote:
> Ping
>
> Richard Sandiford  writes:
>> Richard Biener  writes:
>>> On Thu, May 4, 2017 at 7:21 PM, Richard Sandiford
>>>  wrote:
 Richard Biener  writes:
> On Thu, May 4, 2017 at 2:12 PM, Richard Biener
>  wrote:
>> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
>>  wrote:
>>> This patch tries to calculate conservatively-correct distance
>>> vectors for two references whose base addresses are not the same.
>>> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
>>> isn't guaranteed to occur.
>>>
>>> The motivating example is:
>>>
>>>   struct s { int x[8]; };
>>>   void
>>>   f (struct s *a, struct s *b)
>>>   {
>>> for (int i = 0; i < 8; ++i)
>>>   a->x[i] += b->x[i];
>>>   }
>>>
>>> in which the "a" and "b" accesses are either independent or have a
>>> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
>>> prevents vectorisation, so we can vectorise without an alias check.
>>>
>>> I'd originally wanted to do the same thing for arrays as well, e.g.:
>>>
>>>   void
>>>   f (int a[][8], struct b[][8])
>>>   {
>>> for (int i = 0; i < 8; ++i)
>>>   a[0][i] += b[0][i];
>>>   }
>>>
>>> I think this is valid because C11 6.7.6.2/6 says:
>>>
>>>   For two array types to be compatible, both shall have compatible
>>>   element types, and if both size specifiers are present, and are
>>>   integer constant expressions, then both size specifiers shall have
>>>   the same constant value.
>>>
>>> So if we access an array through an int (*)[8], it must have type X[8]
>>> or X[], where X is compatible with int.  It doesn't seem possible in
>>> either case for "a[0]" and "b[0]" to overlap when "a != b".
>>>
>>> However, Richard B said that (at least in gimple) we support arbitrary
>>> overlap of arrays and allow arrays to be accessed with different
>>> dimensionality.  There are examples of this in PR50067.  I've therefore
>>> only handled references that end in a structure field access.
>>>
>>> There are two ways of handling these dependences in the vectoriser:
>>> use them to limit VF, or check at runtime as before.  I've gone for
>>> the approach of checking at runtime if we can, to avoid limiting VF
>>> unnecessarily.  We still fall back to a VF cap when runtime checks
>>> aren't allowed.
>>>
>>> The patch tests whether we queued an alias check with a dependence
>>> distance of X and then picked a VF <= X, in which case it's safe to
>>> drop the alias check.  Since vect_prune_runtime_alias_check_list can
>>> be called twice with different VF for the same loop, it's no longer
>>> safe to clear may_alias_ddrs on exit.  Instead we should use
>>> comp_alias_ddrs to check whether versioning is necessary.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> You seem to do your "fancy" thing but also later compute the old
>> base equality anyway (for same_base_p).  It looks to me for this
>> case the new fancy code can be simply skipped, keeping num_dimensions
>> as before?
>>
>> +  /* Try to approach equal type sizes.  */
>> +  if (!COMPLETE_TYPE_P (type_a)
>> + || !COMPLETE_TYPE_P (type_b)
>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>> +   break;
>>
>> ah, interesting idea to avoid a quadratic search.  Note that you should
>> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
>> as they are used for type-punning.

 All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs,
 ARRAY_REFs or COMPONENT_REFs of structures, since that's all that
 dr_analyze_indices allows, so I think we safe in terms of the tree codes.
>>>
>>> Yeah.  I think we need to document that we should have a 1:1 match here.
>>
>> OK, I added that to the comments and also added an access_fn_component_p
>> that we can assert on.
>>
>> I see nonoverlapping_component_refs_of_decl_p should simply skip
>> ARRAY_REFs - but I also see there:
>>
>>   /* ??? We cannot simply use the type of operand #0 of the refs here
>>  as the Fortran compiler smuggles type punning into 
>> COMPONENT_REFs
>>  for common blocks instead of using unions like everyone else.  
>> */
>>   tree type1 = DECL_CONTEXT (field1);
>>   tree type2 = DECL_CONTEXT (field2);
>>
>> so you probably can't simply use TREE_TYPE (outer_ref) for type 

Re: Handle data dependence relations with different bases

2017-05-31 Thread Richard Sandiford
Ping

Richard Sandiford  writes:
> Richard Biener  writes:
>> On Thu, May 4, 2017 at 7:21 PM, Richard Sandiford
>>  wrote:
>>> Richard Biener  writes:
 On Thu, May 4, 2017 at 2:12 PM, Richard Biener
  wrote:
> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
>  wrote:
>> This patch tries to calculate conservatively-correct distance
>> vectors for two references whose base addresses are not the same.
>> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
>> isn't guaranteed to occur.
>>
>> The motivating example is:
>>
>>   struct s { int x[8]; };
>>   void
>>   f (struct s *a, struct s *b)
>>   {
>> for (int i = 0; i < 8; ++i)
>>   a->x[i] += b->x[i];
>>   }
>>
>> in which the "a" and "b" accesses are either independent or have a
>> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
>> prevents vectorisation, so we can vectorise without an alias check.
>>
>> I'd originally wanted to do the same thing for arrays as well, e.g.:
>>
>>   void
>>   f (int a[][8], struct b[][8])
>>   {
>> for (int i = 0; i < 8; ++i)
>>   a[0][i] += b[0][i];
>>   }
>>
>> I think this is valid because C11 6.7.6.2/6 says:
>>
>>   For two array types to be compatible, both shall have compatible
>>   element types, and if both size specifiers are present, and are
>>   integer constant expressions, then both size specifiers shall have
>>   the same constant value.
>>
>> So if we access an array through an int (*)[8], it must have type X[8]
>> or X[], where X is compatible with int.  It doesn't seem possible in
>> either case for "a[0]" and "b[0]" to overlap when "a != b".
>>
>> However, Richard B said that (at least in gimple) we support arbitrary
>> overlap of arrays and allow arrays to be accessed with different
>> dimensionality.  There are examples of this in PR50067.  I've therefore
>> only handled references that end in a structure field access.
>>
>> There are two ways of handling these dependences in the vectoriser:
>> use them to limit VF, or check at runtime as before.  I've gone for
>> the approach of checking at runtime if we can, to avoid limiting VF
>> unnecessarily.  We still fall back to a VF cap when runtime checks
>> aren't allowed.
>>
>> The patch tests whether we queued an alias check with a dependence
>> distance of X and then picked a VF <= X, in which case it's safe to
>> drop the alias check.  Since vect_prune_runtime_alias_check_list can
>> be called twice with different VF for the same loop, it's no longer
>> safe to clear may_alias_ddrs on exit.  Instead we should use
>> comp_alias_ddrs to check whether versioning is necessary.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> You seem to do your "fancy" thing but also later compute the old
> base equality anyway (for same_base_p).  It looks to me for this
> case the new fancy code can be simply skipped, keeping num_dimensions
> as before?
>
> +  /* Try to approach equal type sizes.  */
> +  if (!COMPLETE_TYPE_P (type_a)
> + || !COMPLETE_TYPE_P (type_b)
> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
> +   break;
>
> ah, interesting idea to avoid a quadratic search.  Note that you should
> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
> as they are used for type-punning.
>>>
>>> All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs,
>>> ARRAY_REFs or COMPONENT_REFs of structures, since that's all that
>>> dr_analyze_indices allows, so I think we safe in terms of the tree codes.
>>
>> Yeah.  I think we need to document that we should have a 1:1 match here.
>
> OK, I added that to the comments and also added an access_fn_component_p
> that we can assert on.
>
> I see nonoverlapping_component_refs_of_decl_p should simply skip
> ARRAY_REFs - but I also see there:
>
>   /* ??? We cannot simply use the type of operand #0 of the refs here
>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>  for common blocks instead of using unions like everyone else.  */
>   tree type1 = DECL_CONTEXT (field1);
>   tree type2 = DECL_CONTEXT (field2);
>
> so you probably can't simply use TREE_TYPE (outer_ref) for type 
> compatibility.
> You also may not use types_compatible_p here as for LTO that is _way_ too
> lax for aggregates.  The above uses
>
>   /* We cannot disambiguate fields in a union or 

Re: Handle data dependence relations with different bases

2017-05-18 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, May 4, 2017 at 7:21 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Thu, May 4, 2017 at 2:12 PM, Richard Biener
>>>  wrote:
 On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
  wrote:
> This patch tries to calculate conservatively-correct distance
> vectors for two references whose base addresses are not the same.
> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
> isn't guaranteed to occur.
>
> The motivating example is:
>
>   struct s { int x[8]; };
>   void
>   f (struct s *a, struct s *b)
>   {
> for (int i = 0; i < 8; ++i)
>   a->x[i] += b->x[i];
>   }
>
> in which the "a" and "b" accesses are either independent or have a
> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
> prevents vectorisation, so we can vectorise without an alias check.
>
> I'd originally wanted to do the same thing for arrays as well, e.g.:
>
>   void
>   f (int a[][8], struct b[][8])
>   {
> for (int i = 0; i < 8; ++i)
>   a[0][i] += b[0][i];
>   }
>
> I think this is valid because C11 6.7.6.2/6 says:
>
>   For two array types to be compatible, both shall have compatible
>   element types, and if both size specifiers are present, and are
>   integer constant expressions, then both size specifiers shall have
>   the same constant value.
>
> So if we access an array through an int (*)[8], it must have type X[8]
> or X[], where X is compatible with int.  It doesn't seem possible in
> either case for "a[0]" and "b[0]" to overlap when "a != b".
>
> However, Richard B said that (at least in gimple) we support arbitrary
> overlap of arrays and allow arrays to be accessed with different
> dimensionality.  There are examples of this in PR50067.  I've therefore
> only handled references that end in a structure field access.
>
> There are two ways of handling these dependences in the vectoriser:
> use them to limit VF, or check at runtime as before.  I've gone for
> the approach of checking at runtime if we can, to avoid limiting VF
> unnecessarily.  We still fall back to a VF cap when runtime checks
> aren't allowed.
>
> The patch tests whether we queued an alias check with a dependence
> distance of X and then picked a VF <= X, in which case it's safe to
> drop the alias check.  Since vect_prune_runtime_alias_check_list can
> be called twice with different VF for the same loop, it's no longer
> safe to clear may_alias_ddrs on exit.  Instead we should use
> comp_alias_ddrs to check whether versioning is necessary.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

 You seem to do your "fancy" thing but also later compute the old
 base equality anyway (for same_base_p).  It looks to me for this
 case the new fancy code can be simply skipped, keeping num_dimensions
 as before?

 +  /* Try to approach equal type sizes.  */
 +  if (!COMPLETE_TYPE_P (type_a)
 + || !COMPLETE_TYPE_P (type_b)
 + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
 + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
 +   break;

 ah, interesting idea to avoid a quadratic search.  Note that you should
 conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
 as they are used for type-punning.
>>
>> All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs,
>> ARRAY_REFs or COMPONENT_REFs of structures, since that's all that
>> dr_analyze_indices allows, so I think we safe in terms of the tree codes.
>
> Yeah.  I think we need to document that we should have a 1:1 match here.

OK, I added that to the comments and also added an access_fn_component_p
that we can assert on.

 I see nonoverlapping_component_refs_of_decl_p should simply skip
 ARRAY_REFs - but I also see there:

   /* ??? We cannot simply use the type of operand #0 of the refs here
  as the Fortran compiler smuggles type punning into COMPONENT_REFs
  for common blocks instead of using unions like everyone else.  */
   tree type1 = DECL_CONTEXT (field1);
   tree type2 = DECL_CONTEXT (field2);

 so you probably can't simply use TREE_TYPE (outer_ref) for type 
 compatibility.
 You also may not use types_compatible_p here as for LTO that is _way_ too
 lax for aggregates.  The above uses

   /* We cannot disambiguate fields in a union or qualified union.  */
   if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
  return false;

 so you should also bail out on unions here, rather 

Re: Handle data dependence relations with different bases

2017-05-09 Thread Richard Biener
On Fri, May 5, 2017 at 11:27 PM, Bernhard Reutner-Fischer
 wrote:
> On 4 May 2017 14:12:04 CEST, Richard Biener  
> wrote:
>
>>nonoverlapping_component_refs_of_decl_p
>>should simply skip ARRAY_REFs - but I also see there:
>>
>>/* ??? We cannot simply use the type of operand #0 of the refs here
>>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>>  for common blocks instead of using unions like everyone else.  */
>>  tree type1 = DECL_CONTEXT (field1);
>>  tree type2 = DECL_CONTEXT (field2);
>>
>>so you probably can't simply use TREE_TYPE (outer_ref) for type
>>compatibility.
>>You also may not use types_compatible_p here as for LTO that is _way_
>>too
>>lax for aggregates.  The above uses
>>
>>/* We cannot disambiguate fields in a union or qualified union.  */
>>  if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>> return false;
>>
>>so you should also bail out on unions here, rather than the check you
>>do later.
>>
>>You seem to rely on getting an access_fn entry for each
>>handled_component_p.
>>It looks like this is the case -- we even seem to stop at unions (with
>>the same
>>fortran "issue").  I'm not sure that's the best thing to do but you
>>rely on that.
>
> Is there a PR for the (IIUC) common as union?
> Maybe around
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41227
> COMMON block, BIND(C) and LTO interoperability issues

I'm not sure, this is Erics code so maybe he remembers.

Richard.

> Thanks


Re: Handle data dependence relations with different bases

2017-05-09 Thread Richard Biener
On Thu, May 4, 2017 at 7:21 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Thu, May 4, 2017 at 2:12 PM, Richard Biener
>>  wrote:
>>> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
>>>  wrote:
 This patch tries to calculate conservatively-correct distance
 vectors for two references whose base addresses are not the same.
 It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
 isn't guaranteed to occur.

 The motivating example is:

   struct s { int x[8]; };
   void
   f (struct s *a, struct s *b)
   {
 for (int i = 0; i < 8; ++i)
   a->x[i] += b->x[i];
   }

 in which the "a" and "b" accesses are either independent or have a
 dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
 prevents vectorisation, so we can vectorise without an alias check.

 I'd originally wanted to do the same thing for arrays as well, e.g.:

   void
   f (int a[][8], struct b[][8])
   {
 for (int i = 0; i < 8; ++i)
   a[0][i] += b[0][i];
   }

 I think this is valid because C11 6.7.6.2/6 says:

   For two array types to be compatible, both shall have compatible
   element types, and if both size specifiers are present, and are
   integer constant expressions, then both size specifiers shall have
   the same constant value.

 So if we access an array through an int (*)[8], it must have type X[8]
 or X[], where X is compatible with int.  It doesn't seem possible in
 either case for "a[0]" and "b[0]" to overlap when "a != b".

 However, Richard B said that (at least in gimple) we support arbitrary
 overlap of arrays and allow arrays to be accessed with different
 dimensionality.  There are examples of this in PR50067.  I've therefore
 only handled references that end in a structure field access.

 There are two ways of handling these dependences in the vectoriser:
 use them to limit VF, or check at runtime as before.  I've gone for
 the approach of checking at runtime if we can, to avoid limiting VF
 unnecessarily.  We still fall back to a VF cap when runtime checks
 aren't allowed.

 The patch tests whether we queued an alias check with a dependence
 distance of X and then picked a VF <= X, in which case it's safe to
 drop the alias check.  Since vect_prune_runtime_alias_check_list can
 be called twice with different VF for the same loop, it's no longer
 safe to clear may_alias_ddrs on exit.  Instead we should use
 comp_alias_ddrs to check whether versioning is necessary.

 Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>>
>>> You seem to do your "fancy" thing but also later compute the old
>>> base equality anyway (for same_base_p).  It looks to me for this
>>> case the new fancy code can be simply skipped, keeping num_dimensions
>>> as before?
>>>
>>> +  /* Try to approach equal type sizes.  */
>>> +  if (!COMPLETE_TYPE_P (type_a)
>>> + || !COMPLETE_TYPE_P (type_b)
>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>>> +   break;
>>>
>>> ah, interesting idea to avoid a quadratic search.  Note that you should
>>> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
>>> as they are used for type-punning.
>
> All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs,
> ARRAY_REFs or COMPONENT_REFs of structures, since that's all that
> dr_analyze_indices allows, so I think we safe in terms of the tree codes.

Yeah.  I think we need to document that we should have a 1:1 match here.

>>> I see nonoverlapping_component_refs_of_decl_p should simply skip
>>> ARRAY_REFs - but I also see there:
>>>
>>>   /* ??? We cannot simply use the type of operand #0 of the refs here
>>>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>>>  for common blocks instead of using unions like everyone else.  */
>>>   tree type1 = DECL_CONTEXT (field1);
>>>   tree type2 = DECL_CONTEXT (field2);
>>>
>>> so you probably can't simply use TREE_TYPE (outer_ref) for type 
>>> compatibility.
>>> You also may not use types_compatible_p here as for LTO that is _way_ too
>>> lax for aggregates.  The above uses
>>>
>>>   /* We cannot disambiguate fields in a union or qualified union.  */
>>>   if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>>>  return false;
>>>
>>> so you should also bail out on unions here, rather than the check you do 
>>> later.
>
> The loop stops before we get to a union, so I think "only" the RECORD_TYPE
> COMPONENT_REF handling is a potential problem.  Does this mean that
> I should use the nonoverlapping_component_refs_of_decl_p code:
>
>   

Re: Handle data dependence relations with different bases

2017-05-05 Thread Bernhard Reutner-Fischer
On 4 May 2017 14:12:04 CEST, Richard Biener  wrote:

>nonoverlapping_component_refs_of_decl_p
>should simply skip ARRAY_REFs - but I also see there:
>
>/* ??? We cannot simply use the type of operand #0 of the refs here
>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>  for common blocks instead of using unions like everyone else.  */
>  tree type1 = DECL_CONTEXT (field1);
>  tree type2 = DECL_CONTEXT (field2);
>
>so you probably can't simply use TREE_TYPE (outer_ref) for type
>compatibility.
>You also may not use types_compatible_p here as for LTO that is _way_
>too
>lax for aggregates.  The above uses
>
>/* We cannot disambiguate fields in a union or qualified union.  */
>  if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
> return false;
>
>so you should also bail out on unions here, rather than the check you
>do later.
>
>You seem to rely on getting an access_fn entry for each
>handled_component_p.
>It looks like this is the case -- we even seem to stop at unions (with
>the same
>fortran "issue").  I'm not sure that's the best thing to do but you
>rely on that.

Is there a PR for the (IIUC) common as union?
Maybe around
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41227
COMMON block, BIND(C) and LTO interoperability issues

Thanks


Re: Handle data dependence relations with different bases

2017-05-04 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, May 4, 2017 at 2:12 PM, Richard Biener
>  wrote:
>> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
>>  wrote:
>>> This patch tries to calculate conservatively-correct distance
>>> vectors for two references whose base addresses are not the same.
>>> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
>>> isn't guaranteed to occur.
>>>
>>> The motivating example is:
>>>
>>>   struct s { int x[8]; };
>>>   void
>>>   f (struct s *a, struct s *b)
>>>   {
>>> for (int i = 0; i < 8; ++i)
>>>   a->x[i] += b->x[i];
>>>   }
>>>
>>> in which the "a" and "b" accesses are either independent or have a
>>> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
>>> prevents vectorisation, so we can vectorise without an alias check.
>>>
>>> I'd originally wanted to do the same thing for arrays as well, e.g.:
>>>
>>>   void
>>>   f (int a[][8], struct b[][8])
>>>   {
>>> for (int i = 0; i < 8; ++i)
>>>   a[0][i] += b[0][i];
>>>   }
>>>
>>> I think this is valid because C11 6.7.6.2/6 says:
>>>
>>>   For two array types to be compatible, both shall have compatible
>>>   element types, and if both size specifiers are present, and are
>>>   integer constant expressions, then both size specifiers shall have
>>>   the same constant value.
>>>
>>> So if we access an array through an int (*)[8], it must have type X[8]
>>> or X[], where X is compatible with int.  It doesn't seem possible in
>>> either case for "a[0]" and "b[0]" to overlap when "a != b".
>>>
>>> However, Richard B said that (at least in gimple) we support arbitrary
>>> overlap of arrays and allow arrays to be accessed with different
>>> dimensionality.  There are examples of this in PR50067.  I've therefore
>>> only handled references that end in a structure field access.
>>>
>>> There are two ways of handling these dependences in the vectoriser:
>>> use them to limit VF, or check at runtime as before.  I've gone for
>>> the approach of checking at runtime if we can, to avoid limiting VF
>>> unnecessarily.  We still fall back to a VF cap when runtime checks
>>> aren't allowed.
>>>
>>> The patch tests whether we queued an alias check with a dependence
>>> distance of X and then picked a VF <= X, in which case it's safe to
>>> drop the alias check.  Since vect_prune_runtime_alias_check_list can
>>> be called twice with different VF for the same loop, it's no longer
>>> safe to clear may_alias_ddrs on exit.  Instead we should use
>>> comp_alias_ddrs to check whether versioning is necessary.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> You seem to do your "fancy" thing but also later compute the old
>> base equality anyway (for same_base_p).  It looks to me for this
>> case the new fancy code can be simply skipped, keeping num_dimensions
>> as before?
>>
>> +  /* Try to approach equal type sizes.  */
>> +  if (!COMPLETE_TYPE_P (type_a)
>> + || !COMPLETE_TYPE_P (type_b)
>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>> +   break;
>>
>> ah, interesting idea to avoid a quadratic search.  Note that you should
>> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
>> as they are used for type-punning.

All the component refs here should be REALPART_EXPRs, IMAGPART_EXPRs,
ARRAY_REFs or COMPONENT_REFs of structures, since that's all that
dr_analyze_indices allows, so I think we safe in terms of the tree codes.

>> I see nonoverlapping_component_refs_of_decl_p should simply skip
>> ARRAY_REFs - but I also see there:
>>
>>   /* ??? We cannot simply use the type of operand #0 of the refs here
>>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>>  for common blocks instead of using unions like everyone else.  */
>>   tree type1 = DECL_CONTEXT (field1);
>>   tree type2 = DECL_CONTEXT (field2);
>>
>> so you probably can't simply use TREE_TYPE (outer_ref) for type 
>> compatibility.
>> You also may not use types_compatible_p here as for LTO that is _way_ too
>> lax for aggregates.  The above uses
>>
>>   /* We cannot disambiguate fields in a union or qualified union.  */
>>   if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>>  return false;
>>
>> so you should also bail out on unions here, rather than the check you do 
>> later.

The loop stops before we get to a union, so I think "only" the RECORD_TYPE
COMPONENT_REF handling is a potential problem.  Does this mean that
I should use the nonoverlapping_component_refs_of_decl_p code:

  tree field1 = TREE_OPERAND (ref1, 1);
  tree field2 = TREE_OPERAND (ref2, 1);

  /* ??? We cannot simply use the type of operand #0 of the refs here
 as the Fortran compiler smuggles type punning into COMPONENT_REFs
 for common blocks instead of using 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Richard Biener
On Thu, May 4, 2017 at 2:12 PM, Richard Biener
 wrote:
> On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
>  wrote:
>> This patch tries to calculate conservatively-correct distance
>> vectors for two references whose base addresses are not the same.
>> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
>> isn't guaranteed to occur.
>>
>> The motivating example is:
>>
>>   struct s { int x[8]; };
>>   void
>>   f (struct s *a, struct s *b)
>>   {
>> for (int i = 0; i < 8; ++i)
>>   a->x[i] += b->x[i];
>>   }
>>
>> in which the "a" and "b" accesses are either independent or have a
>> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
>> prevents vectorisation, so we can vectorise without an alias check.
>>
>> I'd originally wanted to do the same thing for arrays as well, e.g.:
>>
>>   void
>>   f (int a[][8], struct b[][8])
>>   {
>> for (int i = 0; i < 8; ++i)
>>   a[0][i] += b[0][i];
>>   }
>>
>> I think this is valid because C11 6.7.6.2/6 says:
>>
>>   For two array types to be compatible, both shall have compatible
>>   element types, and if both size specifiers are present, and are
>>   integer constant expressions, then both size specifiers shall have
>>   the same constant value.
>>
>> So if we access an array through an int (*)[8], it must have type X[8]
>> or X[], where X is compatible with int.  It doesn't seem possible in
>> either case for "a[0]" and "b[0]" to overlap when "a != b".
>>
>> However, Richard B said that (at least in gimple) we support arbitrary
>> overlap of arrays and allow arrays to be accessed with different
>> dimensionality.  There are examples of this in PR50067.  I've therefore
>> only handled references that end in a structure field access.
>>
>> There are two ways of handling these dependences in the vectoriser:
>> use them to limit VF, or check at runtime as before.  I've gone for
>> the approach of checking at runtime if we can, to avoid limiting VF
>> unnecessarily.  We still fall back to a VF cap when runtime checks
>> aren't allowed.
>>
>> The patch tests whether we queued an alias check with a dependence
>> distance of X and then picked a VF <= X, in which case it's safe to
>> drop the alias check.  Since vect_prune_runtime_alias_check_list can
>> be called twice with different VF for the same loop, it's no longer
>> safe to clear may_alias_ddrs on exit.  Instead we should use
>> comp_alias_ddrs to check whether versioning is necessary.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> You seem to do your "fancy" thing but also later compute the old
> base equality anyway (for same_base_p).  It looks to me for this
> case the new fancy code can be simply skipped, keeping num_dimensions
> as before?
>
> +  /* Try to approach equal type sizes.  */
> +  if (!COMPLETE_TYPE_P (type_a)
> + || !COMPLETE_TYPE_P (type_b)
> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
> + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
> +   break;
>
> ah, interesting idea to avoid a quadratic search.  Note that you should
> conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
> as they are used for type-punning.  I see
> nonoverlapping_component_refs_of_decl_p
> should simply skip ARRAY_REFs - but I also see there:
>
>   /* ??? We cannot simply use the type of operand #0 of the refs here
>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>  for common blocks instead of using unions like everyone else.  */
>   tree type1 = DECL_CONTEXT (field1);
>   tree type2 = DECL_CONTEXT (field2);
>
> so you probably can't simply use TREE_TYPE (outer_ref) for type compatibility.
> You also may not use types_compatible_p here as for LTO that is _way_ too
> lax for aggregates.  The above uses
>
>   /* We cannot disambiguate fields in a union or qualified union.  */
>   if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
>  return false;
>
> so you should also bail out on unions here, rather than the check you do 
> later.
>
> You seem to rely on getting an access_fn entry for each handled_component_p.
> It looks like this is the case -- we even seem to stop at unions (with the 
> same
> fortran "issue").  I'm not sure that's the best thing to do but you
> rely on that.
>
> I don't understand the looping, it needs more comments.  You seem to be
> looking for the innermost compatible RECORD_TYPE but then num_dimensions
> is how many compatible refs you found on the way (with incompatible ones
> not counting?!).  What about an inner varying array of structs?  This seems to
> be disregarded in the analysis now?  Thus, a[i].s.b[i].j vs. __real
> b[i].s.b[i].j?
>
> nonoverlapping_component_refs_of_decl_p/nonoverlapping_component_refs_p
> conveniently start from the other
> end of the ref here.

That said, for the motivational cases we either have one ref having
more 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Richard Biener
On Wed, May 3, 2017 at 10:00 AM, Richard Sandiford
 wrote:
> This patch tries to calculate conservatively-correct distance
> vectors for two references whose base addresses are not the same.
> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
> isn't guaranteed to occur.
>
> The motivating example is:
>
>   struct s { int x[8]; };
>   void
>   f (struct s *a, struct s *b)
>   {
> for (int i = 0; i < 8; ++i)
>   a->x[i] += b->x[i];
>   }
>
> in which the "a" and "b" accesses are either independent or have a
> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
> prevents vectorisation, so we can vectorise without an alias check.
>
> I'd originally wanted to do the same thing for arrays as well, e.g.:
>
>   void
>   f (int a[][8], struct b[][8])
>   {
> for (int i = 0; i < 8; ++i)
>   a[0][i] += b[0][i];
>   }
>
> I think this is valid because C11 6.7.6.2/6 says:
>
>   For two array types to be compatible, both shall have compatible
>   element types, and if both size specifiers are present, and are
>   integer constant expressions, then both size specifiers shall have
>   the same constant value.
>
> So if we access an array through an int (*)[8], it must have type X[8]
> or X[], where X is compatible with int.  It doesn't seem possible in
> either case for "a[0]" and "b[0]" to overlap when "a != b".
>
> However, Richard B said that (at least in gimple) we support arbitrary
> overlap of arrays and allow arrays to be accessed with different
> dimensionality.  There are examples of this in PR50067.  I've therefore
> only handled references that end in a structure field access.
>
> There are two ways of handling these dependences in the vectoriser:
> use them to limit VF, or check at runtime as before.  I've gone for
> the approach of checking at runtime if we can, to avoid limiting VF
> unnecessarily.  We still fall back to a VF cap when runtime checks
> aren't allowed.
>
> The patch tests whether we queued an alias check with a dependence
> distance of X and then picked a VF <= X, in which case it's safe to
> drop the alias check.  Since vect_prune_runtime_alias_check_list can
> be called twice with different VF for the same loop, it's no longer
> safe to clear may_alias_ddrs on exit.  Instead we should use
> comp_alias_ddrs to check whether versioning is necessary.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

You seem to do your "fancy" thing but also later compute the old
base equality anyway (for same_base_p).  It looks to me for this
case the new fancy code can be simply skipped, keeping num_dimensions
as before?

+  /* Try to approach equal type sizes.  */
+  if (!COMPLETE_TYPE_P (type_a)
+ || !COMPLETE_TYPE_P (type_b)
+ || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
+ || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
+   break;

ah, interesting idea to avoid a quadratic search.  Note that you should
conservatively handle both BIT_FIELD_REF and VIEW_CONVERT_EXPR
as they are used for type-punning.  I see
nonoverlapping_component_refs_of_decl_p
should simply skip ARRAY_REFs - but I also see there:

  /* ??? We cannot simply use the type of operand #0 of the refs here
 as the Fortran compiler smuggles type punning into COMPONENT_REFs
 for common blocks instead of using unions like everyone else.  */
  tree type1 = DECL_CONTEXT (field1);
  tree type2 = DECL_CONTEXT (field2);

so you probably can't simply use TREE_TYPE (outer_ref) for type compatibility.
You also may not use types_compatible_p here as for LTO that is _way_ too
lax for aggregates.  The above uses

  /* We cannot disambiguate fields in a union or qualified union.  */
  if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
 return false;

so you should also bail out on unions here, rather than the check you do later.

You seem to rely on getting an access_fn entry for each handled_component_p.
It looks like this is the case -- we even seem to stop at unions (with the same
fortran "issue").  I'm not sure that's the best thing to do but you
rely on that.

I don't understand the looping, it needs more comments.  You seem to be
looking for the innermost compatible RECORD_TYPE but then num_dimensions
is how many compatible refs you found on the way (with incompatible ones
not counting?!).  What about an inner varying array of structs?  This seems to
be disregarded in the analysis now?  Thus, a[i].s.b[i].j vs. __real
b[i].s.b[i].j?

nonoverlapping_component_refs_of_decl_p/nonoverlapping_component_refs_p
conveniently start from the other
end of the ref here.

Richard.

> Thanks,
> Richard
>
>
> gcc/
> 2017-05-03  Richard Sandiford  
>
> * tree-data-ref.h (subscript): Add access_fn field.
> (data_dependence_relation): Add could_be_independent_p.
> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
> 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Richard Sandiford
"Bin.Cheng"  writes:
> On Thu, May 4, 2017 at 11:06 AM, Richard Sandiford
>  wrote:
>> "Bin.Cheng"  writes:
>>> On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford
>>>  wrote:
 Index: gcc/tree-data-ref.c
 ===
 --- gcc/tree-data-ref.c 2017-02-23 19:54:15.0 +
 +++ gcc/tree-data-ref.c 2017-05-03 08:48:48.737038502 +0100
 @@ -123,8 +123,7 @@ Software Foundation; either version 3, o
  } dependence_stats;

 static bool subscript_dependence_tester_1 (struct
> data_dependence_relation *,
 -  struct data_reference *,
 -  struct data_reference *,
 +  unsigned int, unsigned int,
struct loop *);
>>> As mentioned, how about passing access_fn directly, rather than less
>>> meaningful 0/1 values?
>>
>> The problem is that access_fn is a property of the individual
>> subscripts, whereas this is operating on a full data_reference.
>>
>> One alternative would be to use conditions like:
>>
>>   first_is_a ? SUB_ACCESS_FN_A (sub) : SUB_ACCESS_FN_B (sub)
>>
>> but IMO that's less readable than the existing:
>>
>>   SUB_ACCESS_FN (sub, index)
>>
>> Or we could have individual access_fn arrays for A and B, separate
>> from the main subscript array, but that would mean allocating three
>> arrays instead of one.
> Thanks for explanation, I see the problem now.  Even the latter
> sequence could be different for A and B, there should have the same
> number index?  If that's the case, is it possible just recording the
> starting position (or length) in DR_ACCESS_FN and use that information
> to access to access_fn vector.  This can save the copy in subscript.
> Anyway, this is not am important problem.

I think that would mean trading fields in the subscript for fields
in the main ddr structure.  One advantage of doing it in the subscript
is that those are freed after the analysis is complete, whereas the
ddr stays around until the caller has finished with it.

 + latter sequence.  */
 +  unsigned int start_a = 0;
 +  unsigned int start_b = 0;
 +  unsigned int num_dimensions = 0;
 +  unsigned int struct_start_a = 0;
 +  unsigned int struct_start_b = 0;
 +  unsigned int struct_num_dimensions = 0;
 +  unsigned int index_a = 0;
 +  unsigned int index_b = 0;
 +  tree next_ref_a = DR_REF (a);
 +  tree next_ref_b = DR_REF (b);
 +  tree struct_ref_a = NULL_TREE;
 +  tree struct_ref_b = NULL_TREE;
 +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
 +{
 +  gcc_checking_assert (handled_component_p (next_ref_a));
 +  gcc_checking_assert (handled_component_p (next_ref_b));
 +  tree outer_ref_a = TREE_OPERAND (next_ref_a, 0);
 +  tree outer_ref_b = TREE_OPERAND (next_ref_b, 0);
 +  tree type_a = TREE_TYPE (outer_ref_a);
 +  tree type_b = TREE_TYPE (outer_ref_b);
 +  if (types_compatible_p (type_a, type_b))
 +   {
 + /* This pair of accesses belong to a suitable sequence.  */
 + if (start_a + num_dimensions != index_a
 + || start_b + num_dimensions != index_b)
 +   {
 + /* Start a new sequence here.  */
 + start_a = index_a;
 + start_b = index_b;
 + num_dimensions = 0;
 +   }
 + num_dimensions += 1;
 + if (TREE_CODE (type_a) == RECORD_TYPE)
 +   {
 + struct_start_a = start_a;
 + struct_start_b = start_b;
 + struct_num_dimensions = num_dimensions;
 + struct_ref_a = outer_ref_a;
 + struct_ref_b = outer_ref_b;
 +   }
 + next_ref_a = outer_ref_a;
 + next_ref_b = outer_ref_b;
 + index_a += 1;
 + index_b += 1;
 + continue;
 +   }
 +  /* Try to approach equal type sizes.  */
 +  if (!COMPLETE_TYPE_P (type_a)
 + || !COMPLETE_TYPE_P (type_b)
 + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
 + || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
 +   break;
 + unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT
> (type_a));
 + unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT
> (type_b));
 +  if (size_a <= size_b)
 +   {
 + index_a += 1;
 + next_ref_a = outer_ref_a;
 +   }
 +  if (size_b <= size_a)
 +   {
 + index_b += 1;
 + next_ref_b = outer_ref_b;
 +   }
  }

 -  /* If the references do not access the same 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Bin.Cheng
On Thu, May 4, 2017 at 11:06 AM, Richard Sandiford
 wrote:
> "Bin.Cheng"  writes:
>> On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford
>>  wrote:
>>> Index: gcc/tree-data-ref.h
>>> ===
>>> --- gcc/tree-data-ref.h 2017-05-03 08:48:11.977015306 +0100
>>> +++ gcc/tree-data-ref.h 2017-05-03 08:48:48.737038502 +0100
>>> @@ -191,6 +191,9 @@ struct conflict_function
>>>
>>>  struct subscript
>>>  {
>>> +  /* The access functions of the two references.  */
>>> +  tree access_fn[2];
>> Is it better to follow existing code, i.e, name this as
>> access_fn_a/access_fn_b.  Thus we don't need to use const value 0/1 in
>> various places, which is a little bit confusing.
>
> [Answered below]
>
>>> +
>>>/* A description of the iterations for which the elements are
>>>   accessed twice.  */
>>>conflict_function *conflicting_iterations_in_a;
>>> @@ -209,6 +212,7 @@ struct subscript
>>>
>>>  typedef struct subscript *subscript_p;
>>>
>>> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>>>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>>>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>>>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
>>> @@ -264,6 +268,33 @@ struct data_dependence_relation
>>>/* Set to true when the dependence relation is on the same data
>>>   access.  */
>>>bool self_reference_p;
>>> +
>>> +  /* True if the dependence described is conservatively correct rather
>>> + than exact, and if it is still possible for the accesses to be
>>> + conditionally independent.  For example, the a and b references in:
>>> +
>>> +   struct s *a, *b;
>>> +   for (int i = 0; i < n; ++i)
>>> + a->f[i] += b->f[i];
>>> +
>>> + conservatively have a distance vector of (0), for the case in which
>>> + a == b, but the accesses are independent if a != b.  Similarly,
>>> + the a and b references in:
>>> +
>>> +   struct s *a, *b;
>>> +   for (int i = 0; i < n; ++i)
>>> + a[0].f[i] += b[i].f[i];
>>> +
>>> + conservatively have a distance vector of (0), but they are indepenent
>>> + when a != b + i.  In contrast, the references in:
>>> +
>>> +   struct s *a;
>>> +   for (int i = 0; i < n; ++i)
>>> + a->f[i] += a->f[i];
>>> +
>>> + have the same distance vector of (0), but the accesses can never be
>>> + independent.  */
>>> +  bool could_be_independent_p;
>>>  };
>>>
>>>  typedef struct data_dependence_relation *ddr_p;
>>> @@ -294,6 +325,7 @@ #define DDR_DIR_VECT(DDR, I) \
>>>  #define DDR_DIST_VECT(DDR, I) \
>>>DDR_DIST_VECTS (DDR)[I]
>>>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
>>> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>>>
>>>
>>>  bool dr_analyze_innermost (struct data_reference *, struct loop *);
>>> @@ -372,22 +404,6 @@ same_data_refs (data_reference_p a, data
>>>return false;
>>>
>>>return true;
>>> -}
>>> -
>>> -/* Return true when the DDR contains two data references that have the
>>> -   same access functions.  */
>>> -
>>> -static inline bool
>>> -same_access_functions (const struct data_dependence_relation *ddr)
>>> -{
>>> -  unsigned i;
>>> -
>>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> -if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
>>> - DR_ACCESS_FN (DDR_B (ddr), i)))
>>> -  return false;
>>> -
>>> -  return true;
>>>  }
>>>
>>>  /* Returns true when all the dependences are computable.  */
>>> Index: gcc/tree-data-ref.c
>>> ===
>>> --- gcc/tree-data-ref.c 2017-02-23 19:54:15.0 +
>>> +++ gcc/tree-data-ref.c 2017-05-03 08:48:48.737038502 +0100
>>> @@ -123,8 +123,7 @@ Software Foundation; either version 3, o
>>>  } dependence_stats;
>>>
>>>  static bool subscript_dependence_tester_1 (struct data_dependence_relation 
>>> *,
>>> -  struct data_reference *,
>>> -  struct data_reference *,
>>> +  unsigned int, unsigned int,
>>>struct loop *);
>> As mentioned, how about passing access_fn directly, rather than less
>> meaningful 0/1 values?
>
> The problem is that access_fn is a property of the individual
> subscripts, whereas this is operating on a full data_reference.
>
> One alternative would be to use conditions like:
>
>   first_is_a ? SUB_ACCESS_FN_A (sub) : SUB_ACCESS_FN_B (sub)
>
> but IMO that's less readable than the existing:
>
>   SUB_ACCESS_FN (sub, index)
>
> Or we could have individual access_fn arrays for A and B, separate
> from the main subscript array, but that would mean allocating three
> arrays instead of one.
Thanks for explanation, I see the problem now.  Even the 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Richard Sandiford
"Bin.Cheng"  writes:
> On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford
>  wrote:
>> Index: gcc/tree-data-ref.h
>> ===
>> --- gcc/tree-data-ref.h 2017-05-03 08:48:11.977015306 +0100
>> +++ gcc/tree-data-ref.h 2017-05-03 08:48:48.737038502 +0100
>> @@ -191,6 +191,9 @@ struct conflict_function
>>
>>  struct subscript
>>  {
>> +  /* The access functions of the two references.  */
>> +  tree access_fn[2];
> Is it better to follow existing code, i.e, name this as
> access_fn_a/access_fn_b.  Thus we don't need to use const value 0/1 in
> various places, which is a little bit confusing.

[Answered below]

>> +
>>/* A description of the iterations for which the elements are
>>   accessed twice.  */
>>conflict_function *conflicting_iterations_in_a;
>> @@ -209,6 +212,7 @@ struct subscript
>>
>>  typedef struct subscript *subscript_p;
>>
>> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
>> @@ -264,6 +268,33 @@ struct data_dependence_relation
>>/* Set to true when the dependence relation is on the same data
>>   access.  */
>>bool self_reference_p;
>> +
>> +  /* True if the dependence described is conservatively correct rather
>> + than exact, and if it is still possible for the accesses to be
>> + conditionally independent.  For example, the a and b references in:
>> +
>> +   struct s *a, *b;
>> +   for (int i = 0; i < n; ++i)
>> + a->f[i] += b->f[i];
>> +
>> + conservatively have a distance vector of (0), for the case in which
>> + a == b, but the accesses are independent if a != b.  Similarly,
>> + the a and b references in:
>> +
>> +   struct s *a, *b;
>> +   for (int i = 0; i < n; ++i)
>> + a[0].f[i] += b[i].f[i];
>> +
>> + conservatively have a distance vector of (0), but they are indepenent
>> + when a != b + i.  In contrast, the references in:
>> +
>> +   struct s *a;
>> +   for (int i = 0; i < n; ++i)
>> + a->f[i] += a->f[i];
>> +
>> + have the same distance vector of (0), but the accesses can never be
>> + independent.  */
>> +  bool could_be_independent_p;
>>  };
>>
>>  typedef struct data_dependence_relation *ddr_p;
>> @@ -294,6 +325,7 @@ #define DDR_DIR_VECT(DDR, I) \
>>  #define DDR_DIST_VECT(DDR, I) \
>>DDR_DIST_VECTS (DDR)[I]
>>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
>> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>>
>>
>>  bool dr_analyze_innermost (struct data_reference *, struct loop *);
>> @@ -372,22 +404,6 @@ same_data_refs (data_reference_p a, data
>>return false;
>>
>>return true;
>> -}
>> -
>> -/* Return true when the DDR contains two data references that have the
>> -   same access functions.  */
>> -
>> -static inline bool
>> -same_access_functions (const struct data_dependence_relation *ddr)
>> -{
>> -  unsigned i;
>> -
>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>> -if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
>> - DR_ACCESS_FN (DDR_B (ddr), i)))
>> -  return false;
>> -
>> -  return true;
>>  }
>>
>>  /* Returns true when all the dependences are computable.  */
>> Index: gcc/tree-data-ref.c
>> ===
>> --- gcc/tree-data-ref.c 2017-02-23 19:54:15.0 +
>> +++ gcc/tree-data-ref.c 2017-05-03 08:48:48.737038502 +0100
>> @@ -123,8 +123,7 @@ Software Foundation; either version 3, o
>>  } dependence_stats;
>>
>>  static bool subscript_dependence_tester_1 (struct data_dependence_relation 
>> *,
>> -  struct data_reference *,
>> -  struct data_reference *,
>> +  unsigned int, unsigned int,
>>struct loop *);
> As mentioned, how about passing access_fn directly, rather than less
> meaningful 0/1 values?

The problem is that access_fn is a property of the individual
subscripts, whereas this is operating on a full data_reference.

One alternative would be to use conditions like:

  first_is_a ? SUB_ACCESS_FN_A (sub) : SUB_ACCESS_FN_B (sub)

but IMO that's less readable than the existing:

  SUB_ACCESS_FN (sub, index)

Or we could have individual access_fn arrays for A and B, separate
from the main subscript array, but that would mean allocating three
arrays instead of one.

>>  /* Returns true iff A divides B.  */
>>
>> @@ -144,6 +143,21 @@ int_divides_p (int a, int b)
>>return ((b % a) == 0);
>>  }
>>
>> +/* Return true if reference REF contains a union access.  */
>> +
>> +static bool
>> +ref_contains_union_access_p (tree ref)
>> +{
>> +  while 

Re: Handle data dependence relations with different bases

2017-05-04 Thread Bin.Cheng
On Wed, May 3, 2017 at 9:00 AM, Richard Sandiford
 wrote:
> This patch tries to calculate conservatively-correct distance
> vectors for two references whose base addresses are not the same.
> It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
> isn't guaranteed to occur.
>
> The motivating example is:
>
>   struct s { int x[8]; };
>   void
>   f (struct s *a, struct s *b)
>   {
> for (int i = 0; i < 8; ++i)
>   a->x[i] += b->x[i];
>   }
>
> in which the "a" and "b" accesses are either independent or have a
> dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
> prevents vectorisation, so we can vectorise without an alias check.
>
> I'd originally wanted to do the same thing for arrays as well, e.g.:
>
>   void
>   f (int a[][8], struct b[][8])
>   {
> for (int i = 0; i < 8; ++i)
>   a[0][i] += b[0][i];
>   }
>
> I think this is valid because C11 6.7.6.2/6 says:
>
>   For two array types to be compatible, both shall have compatible
>   element types, and if both size specifiers are present, and are
>   integer constant expressions, then both size specifiers shall have
>   the same constant value.
>
> So if we access an array through an int (*)[8], it must have type X[8]
> or X[], where X is compatible with int.  It doesn't seem possible in
> either case for "a[0]" and "b[0]" to overlap when "a != b".
>
> However, Richard B said that (at least in gimple) we support arbitrary
> overlap of arrays and allow arrays to be accessed with different
> dimensionality.  There are examples of this in PR50067.  I've therefore
> only handled references that end in a structure field access.
>
> There are two ways of handling these dependences in the vectoriser:
> use them to limit VF, or check at runtime as before.  I've gone for
> the approach of checking at runtime if we can, to avoid limiting VF
> unnecessarily.  We still fall back to a VF cap when runtime checks
> aren't allowed.
>
> The patch tests whether we queued an alias check with a dependence
> distance of X and then picked a VF <= X, in which case it's safe to
> drop the alias check.  Since vect_prune_runtime_alias_check_list can
> be called twice with different VF for the same loop, it's no longer
> safe to clear may_alias_ddrs on exit.  Instead we should use
> comp_alias_ddrs to check whether versioning is necessary.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
Hi Richard,
It's nice to explore more alias opportunity, below are some simple
comments embedded.
>
> Thanks,
> Richard
>
>
> gcc/
> 2017-05-03  Richard Sandiford  
>
> * tree-data-ref.h (subscript): Add access_fn field.
> (data_dependence_relation): Add could_be_independent_p.
> (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
> (same_access_functions): Move to tree-data-ref.c.
> * tree-data-ref.c (ref_contains_union_access_p): New function.
> (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
> DR_ACCESS_FN.
> (constant_access_functions): Likewise.
> (add_other_self_distances): Likewise.
> (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
> (initialize_data_dependence_relation): Use XCNEW and remove
> explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
> of access functions that have the same type.  Allow the
> subsequence to end with different bases in some circumstances.
> Record the chosen access functions in SUB_ACCESS_FN.
> (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
> a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
> (subscript_dependence_tester_1): Likewise dra and drb.
> (build_classic_dist_vector): Update calls accordingly.
> (subscript_dependence_tester): Likewise.
> * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
> DDR_COULD_BE_INDEPENDENT_P.
> * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
> comp_alias_ddrs instead of may_alias_ddrs.
> * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Try
> to mark for aliasing if DDR_COULD_BE_INDEPENDENT_P, but fall back
> to using the recorded distance vectors if that fails.
> (dependence_distance_ge_vf): New function.
> (vect_prune_runtime_alias_test_list): Use it.  Don't clear
> LOOP_VINFO_MAY_ALIAS_DDRS.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-alias-check-3.c: New test.
> * gcc.dg/vect/vect-alias-check-4.c: Likewise.
> * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>
> Index: gcc/tree-data-ref.h
> ===
> --- gcc/tree-data-ref.h 2017-05-03 08:48:11.977015306 +0100
> +++ gcc/tree-data-ref.h 2017-05-03 08:48:48.737038502 +0100
> @@ -191,6 +191,9 @@ struct conflict_function
>
>  struct 

Handle data dependence relations with different bases

2017-05-03 Thread Richard Sandiford
This patch tries to calculate conservatively-correct distance
vectors for two references whose base addresses are not the same.
It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
isn't guaranteed to occur.

The motivating example is:

  struct s { int x[8]; };
  void
  f (struct s *a, struct s *b)
  {
for (int i = 0; i < 8; ++i)
  a->x[i] += b->x[i];
  }

in which the "a" and "b" accesses are either independent or have a
dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
prevents vectorisation, so we can vectorise without an alias check.

I'd originally wanted to do the same thing for arrays as well, e.g.:

  void
  f (int a[][8], struct b[][8])
  {
for (int i = 0; i < 8; ++i)
  a[0][i] += b[0][i];
  }

I think this is valid because C11 6.7.6.2/6 says:

  For two array types to be compatible, both shall have compatible
  element types, and if both size specifiers are present, and are
  integer constant expressions, then both size specifiers shall have
  the same constant value.

So if we access an array through an int (*)[8], it must have type X[8]
or X[], where X is compatible with int.  It doesn't seem possible in
either case for "a[0]" and "b[0]" to overlap when "a != b".

However, Richard B said that (at least in gimple) we support arbitrary
overlap of arrays and allow arrays to be accessed with different
dimensionality.  There are examples of this in PR50067.  I've therefore
only handled references that end in a structure field access.

There are two ways of handling these dependences in the vectoriser:
use them to limit VF, or check at runtime as before.  I've gone for
the approach of checking at runtime if we can, to avoid limiting VF
unnecessarily.  We still fall back to a VF cap when runtime checks
aren't allowed.

The patch tests whether we queued an alias check with a dependence
distance of X and then picked a VF <= X, in which case it's safe to
drop the alias check.  Since vect_prune_runtime_alias_check_list can
be called twice with different VF for the same loop, it's no longer
safe to clear may_alias_ddrs on exit.  Instead we should use
comp_alias_ddrs to check whether versioning is necessary.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
2017-05-03  Richard Sandiford  

* tree-data-ref.h (subscript): Add access_fn field.
(data_dependence_relation): Add could_be_independent_p.
(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
(same_access_functions): Move to tree-data-ref.c.
* tree-data-ref.c (ref_contains_union_access_p): New function.
(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
DR_ACCESS_FN.
(constant_access_functions): Likewise.
(add_other_self_distances): Likewise.
(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
(initialize_data_dependence_relation): Use XCNEW and remove
explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
of access functions that have the same type.  Allow the
subsequence to end with different bases in some circumstances.
Record the chosen access functions in SUB_ACCESS_FN.
(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
(subscript_dependence_tester_1): Likewise dra and drb.
(build_classic_dist_vector): Update calls accordingly.
(subscript_dependence_tester): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
DDR_COULD_BE_INDEPENDENT_P.
* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
comp_alias_ddrs instead of may_alias_ddrs.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Try
to mark for aliasing if DDR_COULD_BE_INDEPENDENT_P, but fall back
to using the recorded distance vectors if that fails.
(dependence_distance_ge_vf): New function.
(vect_prune_runtime_alias_test_list): Use it.  Don't clear
LOOP_VINFO_MAY_ALIAS_DDRS.

gcc/testsuite/
* gcc.dg/vect/vect-alias-check-3.c: New test.
* gcc.dg/vect/vect-alias-check-4.c: Likewise.
* gcc.dg/vect/vect-alias-check-5.c: Likewise.

Index: gcc/tree-data-ref.h
===
--- gcc/tree-data-ref.h 2017-05-03 08:48:11.977015306 +0100
+++ gcc/tree-data-ref.h 2017-05-03 08:48:48.737038502 +0100
@@ -191,6 +191,9 @@ struct conflict_function
 
 struct subscript
 {
+  /* The access functions of the two references.  */
+  tree access_fn[2];
+
   /* A description of the iterations for which the elements are
  accessed twice.  */
   conflict_function *conflicting_iterations_in_a;
@@ -209,6 +212,7 @@ struct subscript
 
 typedef struct subscript *subscript_p;
 
+#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
 #define