Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-10-13 Thread Bin.Cheng
On Mon, Sep 12, 2016 at 8:58 PM, Jeff Law  wrote:
> On 09/06/2016 12:54 PM, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in the loop niters' type.  The vectorizer needs to generate more
>> code to compute the vectorized niters if overflow does happen.  However,
>> for common loops there is actually no overflow; this patch tries to prove
>> the no-overflow information and uses it to improve code generation.  At
>> the moment, no-overflow information comes either from loop niter analysis
>> or from the fact that we know the loop is peeled for a non-zero number of
>> iterations in prologue peeling.  For the latter case, it doesn't matter
>> whether the original LOOP_VINFO_NITERS overflows or not, because the
>> computation LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels
>> the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01  Bin Cheng  
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow.  Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
> OK when prereqs are all approved.
Hi,
I revised this patch to use a widest_int comparison on trees, rather
than int.  The attached new patch is committed.  I also committed all
patches in the peel refactoring patch set; they are posted at:
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00326.html
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01012.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00328.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00329.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00330.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00331.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00332.html
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00333.html

The patch set was bootstrapped and tested again on x86_64 and AArch64.
No regressions found.
I will keep an eye on possible fallout.
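
To illustrate why the constant-case comparison was changed from int to a
wide comparison, here is a minimal standalone sketch in plain C++, not GCC
code.  It assumes a 64-bit niters type and an int that keeps only the low
32 bits of the value; the names nitersm1, niters, proved_with_int and
proved_with_wide are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical constant trip counts in a 64-bit niters type.
constexpr std::uint64_t nitersm1 = (1ULL << 32) + 4;
constexpr std::uint64_t niters = nitersm1 + 1;  // no overflow in 64 bits

// Comparison after NITERS has been narrowed to int.  The inner cast keeps
// only the low 32 bits, which models the truncation (assumption: 32-bit int).
constexpr bool proved_with_int(std::uint64_t m1, std::uint64_t n) {
    return static_cast<std::int64_t>(m1)
           < static_cast<int>(static_cast<std::uint32_t>(n));
}

// Comparison with both operands kept at full width.
constexpr bool proved_with_wide(std::uint64_t m1, std::uint64_t n) {
    return m1 < n;
}

// The truncated value is 5, so the narrowed check cannot prove no-overflow
// even though NITERSM1 + 1 did not wrap.
static_assert(!proved_with_int(nitersm1, niters), "truncation hides the proof");
static_assert(proved_with_wide(nitersm1, niters), "wide comparison succeeds");
```

In this sketch the narrowed check only fails conservatively, so the cost
would be a missed optimization rather than wrong code.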

Thanks,
bin

>
> jeff
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0470445..9cca9b7 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6620,6 +6620,39 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
 }
 }
 
+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+    {
+      tree cst_niters = LOOP_VINFO_NITERS (loop_vinfo);
+      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+      gcc_assert (TREE_CODE (cst_niters) == INTEGER_CST);
+      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+      if (wi::to_widest (cst_nitersm1) < wi::to_widest (cst_niters))
+        return true;
+    }
+
+  widest_int max;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  /* Check the upper bound of loop niters.  */
+  if (get_max_loop_iterations (loop, &max))
+    {
+      tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
+      signop sgn = TYPE_SIGN (type);
+      widest_int type_max = widest_int::from (wi::max_value (type), sgn);
+      if (max < type_max)
+        return true;
+    }
+  return false;
+}
+
 /* Function vect_transform_loop.
 
The analysis phase has determined that the loop is vectorizable.
@@ -6707,8 +6740,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   tree niters = vect_build_loop_niters (loop_vinfo);
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
   tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
+  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
   vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
-                   check_profitability, false);
+                   check_profitability, niters_no_overflow);
   if (niters_vector == NULL_TREE)
     {
       if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -6717,7 +6751,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
                            LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
       else
         vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
-                                     false);
+                                     niters_no_overflow);
     }
 
   /* 1) Make sure the loop header has exactly two entries


Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-12 Thread Jeff Law

On 09/06/2016 12:54 PM, Bin Cheng wrote:

Hi,
LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could overflow
in the loop niters' type.  The vectorizer needs to generate more code to
compute the vectorized niters if overflow does happen.  However, for common
loops there is actually no overflow; this patch tries to prove the no-overflow
information and uses it to improve code generation.  At the moment,
no-overflow information comes either from loop niter analysis or from the fact
that we know the loop is peeled for a non-zero number of iterations in
prologue peeling.  For the latter case, it doesn't matter whether the original
LOOP_VINFO_NITERS overflows or not, because the computation
LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by
underflow.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (loop_niters_no_overflow): New func.
(vect_transform_loop): Call loop_niters_no_overflow.  Pass the
no-overflow information to vect_do_peeling_for_loop_bound and
vect_gen_vector_loop_niters.


OK when prereqs are all approved.

jeff


Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-07 Thread kugan

Hi Bin,

On 07/09/16 17:52, Bin.Cheng wrote:

On Wed, Sep 7, 2016 at 1:10 AM, kugan  wrote:

Hi Bin,


On 07/09/16 04:54, Bin Cheng wrote:


Hi,
LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
overflow in the loop niters' type.  The vectorizer needs to generate more
code to compute the vectorized niters if overflow does happen.  However,
for common loops there is actually no overflow; this patch tries to prove
the no-overflow information and uses it to improve code generation.  At
the moment, no-overflow information comes either from loop niter analysis
or from the fact that we know the loop is peeled for a non-zero number of
iterations in prologue peeling.  For the latter case, it doesn't matter
whether the original LOOP_VINFO_NITERS overflows or not, because the
computation LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels
the overflow by underflow.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (loop_niters_no_overflow): New func.
(vect_transform_loop): Call loop_niters_no_overflow.  Pass the
no-overflow information to vect_do_peeling_for_loop_bound and
vect_gen_vector_loop_niters.


009-prove-no_overflow-for-vect-niters-20160902.txt


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0d37f55..2ef1f9b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
 }
 }

+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+{
+  int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);



Wouldn't assigning this to int truncate it?

Probably.  I now think it's unnecessary to use the int version of niters
here; LOOP_VINFO_NITERS can be used directly.




+  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+  if (wi::to_widest (cst_nitersm1) < cst_niters)



Shouldn't you do the addition and comparison in the type of the loop
index instead of widest_int to see if that overflows?

You mean the type of loop niters?  NITERS is computed from NITERSM1 +
1; I don't think we need to do it again here.


Imagine that you have LOOP_VINFO_NITERSM1 equal to TYPE_MAX (loop niters
type).  In this case, adding 1 overflows in the loop niters type, but not
when the computation is done in widest_int.
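
A minimal standalone sketch of this case, in plain C++ rather than GCC
internals.  It assumes a 32-bit unsigned niters type; niters_t,
niters_in_type, niters_widened and no_overflow are hypothetical names used
only for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the loop niters type.
using niters_t = std::uint32_t;

// NITERSM1 + 1 evaluated in the niters type itself: unsigned arithmetic
// wraps modulo 2^32.
constexpr niters_t niters_in_type(niters_t nitersm1) {
    return static_cast<niters_t>(nitersm1 + 1u);
}

// The same sum evaluated in a wider type, which cannot wrap here.
constexpr std::uint64_t niters_widened(niters_t nitersm1) {
    return static_cast<std::uint64_t>(nitersm1) + 1u;
}

// Compare the stored NITERSM1 against the NITERS that was already computed
// in the niters type.  If NITERS wrapped to 0, NITERSM1 < NITERS is false.
constexpr bool no_overflow(niters_t nitersm1) {
    return static_cast<std::uint64_t>(nitersm1)
           < static_cast<std::uint64_t>(niters_in_type(nitersm1));
}

constexpr niters_t type_max = std::numeric_limits<niters_t>::max();
static_assert(niters_in_type(type_max) == 0, "wraps in the niters type");
static_assert(niters_widened(type_max) == 4294967296ULL, "no wrap when widened");
static_assert(!no_overflow(type_max), "the wrap is detected");
static_assert(no_overflow(99), "ordinary trip counts pass");
```

The point of the sketch is that widening only the comparison is enough as
long as NITERS itself was produced in the niters type.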


But, as you said, if NITERS is already computed in the loop niters type,
then yes, this comparison should be sufficient.


You could do the comparison as wide_int or tree.  I think this would
make it clearer.


Thanks,
Kugan



Thanks,
bin



Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-07 Thread Bin.Cheng
On Wed, Sep 7, 2016 at 1:10 AM, kugan  wrote:
> Hi Bin,
>
>
> On 07/09/16 04:54, Bin Cheng wrote:
>>
>> Hi,
>> LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could
>> overflow in the loop niters' type.  The vectorizer needs to generate more
>> code to compute the vectorized niters if overflow does happen.  However,
>> for common loops there is actually no overflow; this patch tries to prove
>> the no-overflow information and uses it to improve code generation.  At
>> the moment, no-overflow information comes either from loop niter analysis
>> or from the fact that we know the loop is peeled for a non-zero number of
>> iterations in prologue peeling.  For the latter case, it doesn't matter
>> whether the original LOOP_VINFO_NITERS overflows or not, because the
>> computation LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels
>> the overflow by underflow.
>>
>> Thanks,
>> bin
>>
>> 2016-09-01  Bin Cheng  
>>
>> * tree-vect-loop.c (loop_niters_no_overflow): New func.
>> (vect_transform_loop): Call loop_niters_no_overflow.  Pass the
>> no-overflow information to vect_do_peeling_for_loop_bound and
>> vect_gen_vector_loop_niters.
>>
>>
>> 009-prove-no_overflow-for-vect-niters-20160902.txt
>>
>>
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index 0d37f55..2ef1f9b 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
>>  }
>>  }
>>
>> +/* Given loop represented by LOOP_VINFO, return true if computation of
>> +   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
>> +   otherwise.  */
>> +
>> +static bool
>> +loop_niters_no_overflow (loop_vec_info loop_vinfo)
>> +{
>> +  /* Constant case.  */
>> +  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
>> +{
>> +  int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);
>
>
> Wouldn't assigning this to int truncate it?
Probably.  I now think it's unnecessary to use the int version of niters
here; LOOP_VINFO_NITERS can be used directly.
>
>
>> +  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
>> +
>> +  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
>> +  if (wi::to_widest (cst_nitersm1) < cst_niters)
>
>
> Shouldn't you do the addition and comparison in the type of the loop
> index instead of widest_int to see if that overflows?
You mean the type of loop niters?  NITERS is computed from NITERSM1 +
1; I don't think we need to do it again here.

Thanks,
bin


Re: [PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-06 Thread kugan

Hi Bin,

On 07/09/16 04:54, Bin Cheng wrote:

Hi,
LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could overflow
in the loop niters' type.  The vectorizer needs to generate more code to
compute the vectorized niters if overflow does happen.  However, for common
loops there is actually no overflow; this patch tries to prove the no-overflow
information and uses it to improve code generation.  At the moment,
no-overflow information comes either from loop niter analysis or from the fact
that we know the loop is peeled for a non-zero number of iterations in
prologue peeling.  For the latter case, it doesn't matter whether the original
LOOP_VINFO_NITERS overflows or not, because the computation
LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by
underflow.

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (loop_niters_no_overflow): New func.
(vect_transform_loop): Call loop_niters_no_overflow.  Pass the
no-overflow information to vect_do_peeling_for_loop_bound and
vect_gen_vector_loop_niters.


009-prove-no_overflow-for-vect-niters-20160902.txt


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0d37f55..2ef1f9b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
 }
 }

+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+{
+  int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);


Wouldn't assigning this to int truncate it?



+  tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+  gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+  if (wi::to_widest (cst_nitersm1) < cst_niters)


Shouldn't you do the addition and comparison in the type of the
loop index instead of widest_int to see if that overflows?


Thanks,
Kugan


+   return true;
+}
+
+  widest_int max;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  /* Check the upper bound of loop niters.  */
+  if (get_max_loop_iterations (loop, &max))
+{
+  tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
+  signop sgn = TYPE_SIGN (type);
+  widest_int type_max = widest_int::from (wi::max_value (type), sgn);
+  if (max < type_max)
+   return true;
+}
+  return false;
+}
+
 /* Function vect_transform_loop.

The analysis phase has determined that the loop is vectorizable.
@@ -6697,8 +6729,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   tree niters = vect_build_loop_niters (loop_vinfo);
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
   tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
+  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
   vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
-  check_profitability, false);
+  check_profitability, niters_no_overflow);
   if (niters_vector == NULL_TREE)
 {
   if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -6707,7 +6740,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
   else
 vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
-false);
+niters_no_overflow);
 }

   /* 1) Make sure the loop header has exactly two entries



[PATCH GCC 9/9]Prove no-overflow in computation of LOOP_VINFO_NITERS and improve code generation

2016-09-06 Thread Bin Cheng
Hi,
LOOP_VINFO_NITERS is computed as LOOP_VINFO_NITERSM1 + 1, which could overflow
in the loop niters' type.  The vectorizer needs to generate more code to
compute the vectorized niters if overflow does happen.  However, for common
loops there is actually no overflow; this patch tries to prove the no-overflow
information and uses it to improve code generation.  At the moment,
no-overflow information comes either from loop niter analysis or from the fact
that we know the loop is peeled for a non-zero number of iterations in
prologue peeling.  For the latter case, it doesn't matter whether the original
LOOP_VINFO_NITERS overflows or not, because the computation
LOOP_VINFO_NITERS - LOOP_VINFO_PEELING_FOR_ALIGNMENT cancels the overflow by
underflow.
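
The cancellation in the prologue-peeling case can be illustrated with a
small standalone sketch in plain C++, not GCC code.  It assumes a 32-bit
unsigned niters type and a true trip count of 2^32, so NITERS wraps to 0;
niters_t, remaining_after_peel, wrapped_niters and true_niters are
hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for the loop niters type.
using niters_t = std::uint32_t;

// Iterations left for the main loop after the prologue has run PEEL
// iterations, computed in the niters type (wraps modulo 2^32).
constexpr niters_t remaining_after_peel(niters_t niters, niters_t peel) {
    return static_cast<niters_t>(niters - peel);
}

// NITERSM1 is 0xFFFFFFFF, so NITERSM1 + 1 wraps to 0 in the niters type,
// while the true trip count is 2^32.
constexpr niters_t wrapped_niters = static_cast<niters_t>(0xFFFFFFFFu + 1u);
constexpr std::uint64_t true_niters = 0x100000000ULL;

static_assert(wrapped_niters == 0, "NITERS overflowed to zero");
// Subtracting a non-zero peel count underflows, and the two wrap-arounds
// cancel: the result equals the true remaining count, which fits the type.
static_assert(remaining_after_peel(wrapped_niters, 3) == true_niters - 3,
              "overflow cancelled by underflow");
```

The cancellation relies on the peel count being non-zero; with a peel
count of 0 the wrapped value would be used unchanged.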

Thanks,
bin

2016-09-01  Bin Cheng  

* tree-vect-loop.c (loop_niters_no_overflow): New func.
(vect_transform_loop): Call loop_niters_no_overflow.  Pass the
no-overflow information to vect_do_peeling_for_loop_bound and
vect_gen_vector_loop_niters.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0d37f55..2ef1f9b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6610,6 +6610,38 @@ vect_loop_kill_debug_uses (struct loop *loop, gimple *stmt)
 }
 }
 
+/* Given loop represented by LOOP_VINFO, return true if computation of
+   LOOP_VINFO_NITERS (= LOOP_VINFO_NITERSM1 + 1) doesn't overflow, false
+   otherwise.  */
+
+static bool
+loop_niters_no_overflow (loop_vec_info loop_vinfo)
+{
+  /* Constant case.  */
+  if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
+    {
+      int cst_niters = LOOP_VINFO_INT_NITERS (loop_vinfo);
+      tree cst_nitersm1 = LOOP_VINFO_NITERSM1 (loop_vinfo);
+
+      gcc_assert (TREE_CODE (cst_nitersm1) == INTEGER_CST);
+      if (wi::to_widest (cst_nitersm1) < cst_niters)
+        return true;
+    }
+
+  widest_int max;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  /* Check the upper bound of loop niters.  */
+  if (get_max_loop_iterations (loop, &max))
+    {
+      tree type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
+      signop sgn = TYPE_SIGN (type);
+      widest_int type_max = widest_int::from (wi::max_value (type), sgn);
+      if (max < type_max)
+        return true;
+    }
+  return false;
+}
+
 /* Function vect_transform_loop.
 
The analysis phase has determined that the loop is vectorizable.
@@ -6697,8 +6729,9 @@ vect_transform_loop (loop_vec_info loop_vinfo)
   tree niters = vect_build_loop_niters (loop_vinfo);
   LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo) = niters;
   tree nitersm1 = unshare_expr (LOOP_VINFO_NITERSM1 (loop_vinfo));
+  bool niters_no_overflow = loop_niters_no_overflow (loop_vinfo);
   vect_do_peeling (loop_vinfo, niters, nitersm1, &niters_vector, th,
-                   check_profitability, false);
+                   check_profitability, niters_no_overflow);
   if (niters_vector == NULL_TREE)
     {
       if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -6707,7 +6740,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
                            LOOP_VINFO_INT_NITERS (loop_vinfo) / vf);
       else
         vect_gen_vector_loop_niters (loop_vinfo, niters, &niters_vector,
-                                     false);
+                                     niters_no_overflow);
     }
 
   /* 1) Make sure the loop header has exactly two entries