Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-16 Thread Richard Guenther
On Thu, Mar 15, 2012 at 3:34 PM, Kai Tietz  wrote:
> 2012/3/15 Richard Guenther :
>> On Thu, Mar 15, 2012 at 3:00 PM, Jakub Jelinek  wrote:
>>> On Thu, Mar 15, 2012 at 02:53:10PM +0100, Kai Tietz wrote:
 > This looks like to match unbound pattern sizes and thus does not fit
 > into the forwprop machinery.  Instead it was suggested elsewhere
 > that promoting / demoting registers should be done in a separate pass
 > where you can compute a lattice of used bits and apply a transform
 > based on that lattice and target information (according to PROMOTE_MODE
 > for example).

 Well, the integer truncation part might be something for a separate
 pass.  It could then also take care that within single-use
 gimple-statements the integral-constant is always on right-hand-side
 of first statement of an +, -, |, ^, &, and mul.

 But the cast-hoisting code itself is not unbound AFAICS and has fixed
 pattern size.
>>
>> Can you split that part out then please?
>
> I can do.  In fact just the part of calling
>
> Sure, it would be the removal of the function truncate_integers and its call.
>
>>> The type demotion is PR45397/PR47477 among other PRs.
>>> I'd just walk from the narrowing integer conversion stmts recursively
>>> through the def stmts, see if they can be narrowed, note it, and finally if
>>> everything or significant portion of the stmts can be demoted (if not all,
>>> with some narrowing integer conversion stmt inserted), do it all together.
>
> Jakub, this might be something good to have it in separate pass.
> Right now I need to avoid some type-hoisting in forward-propagation,
> as otherwise it would loop endless caused by type-sinking code in
> forward-propagation.

Well, that sounds like either of it is not applying costs properly.  We should
have a total ordering cost-wise on both forms.  Which forms do we iterate
on?

> Only question would be where such pass would be
> best placed.  After or before forward-propagation pass?

Somewhere before loop optimizations (especially IVOPTs).  Generally
it might help SCEV analysis, so I'd do it before PRE, after the CCP/copy-prop
passes.

>> For PROMOTE_MODE targets you'd promote but properly mask out
>> constants (to make them cheaper to generate, for example).  You'd
>> also take advantate of targets that can do zero/sign-extending loads
>> without extra cost (ISTR that's quite important for some SPEC 2k6
>> benchmark on x86_64).
>
> Hmm, as we are talking about truncation-casts here, what is the reaons
> for PROMOTE_MODE here?
> You mean to generate for a PROMOTE_MODE target explicit something like:
> D1 = 0x80
> D2 = (int) D1
>
> instead of having D1 = 0xff80.  Isn't that a decision done on RTL level?

PROMOTE_MODE is about targets only having basic operations in
the mode PROMOTE_MODE promotes to.  For example (IIRC) powerpc
can only do arithmetic on full register width, not on arbitrarily sub-regs
as i386.  Which means that if you need sub-reg precision values we have
to insert truncataions after every operation on RTL.  Thus, if you artificially
lower precision of computations on the tree level this will pessimize things
on RTL.  So, you never should narrow operations below what PROMOTE_MODE
would do.  In fact, you might as well expose the PROMOTE_MODE fact on
the tree level.

> Another interesting type-hoisting part is to check if a type-sign
> change might be profitable.
>
> Eg:
> signed char D0, D1, D5;
> int D2,D3,D4
> ...
> D2 = (int) D0
> D3 = (int) D1
> D4 = D2 + D3
> D5 = (signed char) D4
> D6 = D5 == 0x8f
>
> to
>
> signed char D0, D1, D5;
> unsigned char D2,D3,D4
> ...
> D2 = (unsigned char) D0
> D3 = (unsigned char) D1
> D4 = D2 + D3
> D5 = D4 == 0x7fu

You need to watch for undefined overflow issues on the narrowed expressions
anyway (thus, have to resort to unsigned arithmetic in nearly all cases).

Richard.

>> Richard.
>>
>>>        Jakub
>
> Regards,
> Kai


Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Michael Matz
Hi,

On Thu, 15 Mar 2012, Richard Guenther wrote:

> > The type demotion is PR45397/PR47477 among other PRs. I'd just walk 
> > from the narrowing integer conversion stmts recursively through the 
> > def stmts, see if they can be narrowed, note it, and finally if 
> > everything or significant portion of the stmts can be demoted (if not 
> > all, with some narrowing integer conversion stmt inserted), do it all 
> > together.
> 
> For PROMOTE_MODE targets you'd promote but properly mask out
> constants (to make them cheaper to generate, for example).  You'd
> also take advantate of targets that can do zero/sign-extending loads
> without extra cost (ISTR that's quite important for some SPEC 2k6
> benchmark on x86_64).

gamess.  I still have an old proof of concept patch doing type promotion.  
Probably doesn't apply anymore, and it's using too broad predicates (it 
simple-mindedly extends to the largest type see in an expression tree).
But I think that basic idea of it is sound.


Ciao,
Michael.

Index: passes.c
===
--- passes.c(revision 159226)
+++ passes.c(working copy)
@@ -831,6 +831,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_all_optimizations);
 {
   struct opt_pass **p = &pass_all_optimizations.pass.sub;
+  extern struct gimple_opt_pass pass_bprop_extends;
   NEXT_PASS (pass_remove_cgraph_callee_edges);
   /* Initial scalar cleanups before alias computation.
 They ensure memory accesses are not indirect wherever possible.  */
@@ -838,6 +839,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_update_address_taken);
   NEXT_PASS (pass_rename_ssa_copies);
   NEXT_PASS (pass_complete_unrolli);
+  NEXT_PASS (pass_bprop_extends);
   NEXT_PASS (pass_ccp);
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_call_cdce);
Index: tree-ssa-ccp.c
===
--- tree-ssa-ccp.c  (revision 159226)
+++ tree-ssa-ccp.c  (working copy)
@@ -1999,3 +1999,263 @@ struct gimple_opt_pass pass_fold_builtin
 | TODO_update_ssa  /* todo_flags_finish */
  }
 };
+
+#if 1
+static bool
+promote_through_insn_p (gimple def, tree *prhs1, tree *prhs2)
+{
+  tree lhs, rhs1, rhs2;
+  if (!is_gimple_assign (def))
+return false;
+  lhs = gimple_assign_lhs (def);
+  rhs1 = rhs2 = NULL;
+  switch (gimple_assign_rhs_class (def))
+{
+  case GIMPLE_SINGLE_RHS:
+   rhs1 = gimple_assign_rhs1 (def);
+   if (TREE_CODE (rhs1) != SSA_NAME)
+ return false;
+   break;
+  case GIMPLE_UNARY_RHS:
+   rhs1 = gimple_assign_rhs1 (def);
+   if (TREE_TYPE (gimple_expr_type (def)) != TREE_TYPE (rhs1))
+ return false;
+   break;
+  case GIMPLE_BINARY_RHS:
+   rhs1 = gimple_assign_rhs1 (def);
+   rhs2 = gimple_assign_rhs2 (def);
+
+   switch (gimple_assign_rhs_code (def))
+ {
+   case LSHIFT_EXPR: case RSHIFT_EXPR:
+   case LROTATE_EXPR: case RROTATE_EXPR:
+ rhs2 = NULL;
+ if (TREE_TYPE (lhs) != TREE_TYPE (rhs1))
+   return false;
+ break;
+   case PLUS_EXPR: case MINUS_EXPR:
+   case MULT_EXPR: case EXACT_DIV_EXPR:
+   case TRUNC_DIV_EXPR: case CEIL_DIV_EXPR:
+   case FLOOR_DIV_EXPR: case ROUND_DIV_EXPR:
+   case TRUNC_MOD_EXPR: case CEIL_MOD_EXPR:
+   case FLOOR_MOD_EXPR: case ROUND_MOD_EXPR:
+   case RDIV_EXPR:
+   case MIN_EXPR: case MAX_EXPR:
+   case BIT_IOR_EXPR: case BIT_XOR_EXPR: case BIT_AND_EXPR:
+ if (TREE_TYPE (lhs) != TREE_TYPE (rhs1)
+ || TREE_TYPE (lhs) != TREE_TYPE (rhs2))
+   return false;
+ break;
+   default:
+ return false;
+ }
+   break;
+  default:
+   return false;
+}
+  if (rhs1 && TREE_CODE (rhs1) != SSA_NAME)
+rhs1 = NULL;
+  if (prhs1)
+*prhs1 = rhs1;
+  if (rhs2 && TREE_CODE (rhs2) != SSA_NAME)
+rhs2 = NULL;
+  if (prhs2)
+*prhs2 = rhs2;
+  return true;
+}
+
+static tree
+get_extended_version (tree newtype, tree name, bool force)
+{
+  tree ret = TREE_CHAIN (name);
+  tree rhs1, rhs2;
+  gimple def;
+  /* If we already have a version of NAME, try to use it.  If it
+ doesn't match in type, fail.  */
+  if (ret)
+{
+  if (TREE_TYPE (ret) == newtype)
+   return ret;
+  else
+   return NULL_TREE;
+}
+  def = SSA_NAME_DEF_STMT (name);
+  /* If we can propagate through our defining insn, try to do that.  */
+  if (promote_through_insn_p (def, &rhs1, &rhs2))
+{
+  gimple stmt;
+  tree extrhs1, extrhs2;
+  gimple_stmt_iterator gsi;
+  enum tree_code code;
+  if (rhs1)
+   {
+ extrhs1 = get_extended_version (newtype, rhs1, true);
+ if (!extrhs1)
+   /* ??? We could force here.  */
+   return NULL_TREE;
+   }
+

Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Kai Tietz
2012/3/15 Richard Guenther :
> On Thu, Mar 15, 2012 at 3:00 PM, Jakub Jelinek  wrote:
>> On Thu, Mar 15, 2012 at 02:53:10PM +0100, Kai Tietz wrote:
>>> > This looks like to match unbound pattern sizes and thus does not fit
>>> > into the forwprop machinery.  Instead it was suggested elsewhere
>>> > that promoting / demoting registers should be done in a separate pass
>>> > where you can compute a lattice of used bits and apply a transform
>>> > based on that lattice and target information (according to PROMOTE_MODE
>>> > for example).
>>>
>>> Well, the integer truncation part might be something for a separate
>>> pass.  It could then also take care that within single-use
>>> gimple-statements the integral-constant is always on right-hand-side
>>> of first statement of an +, -, |, ^, &, and mul.
>>>
>>> But the cast-hoisting code itself is not unbound AFAICS and has fixed
>>> pattern size.
>
> Can you split that part out then please?

I can do.  In fact just the part of calling

Sure, it would be the removal of the function truncate_integers and its call.

>> The type demotion is PR45397/PR47477 among other PRs.
>> I'd just walk from the narrowing integer conversion stmts recursively
>> through the def stmts, see if they can be narrowed, note it, and finally if
>> everything or significant portion of the stmts can be demoted (if not all,
>> with some narrowing integer conversion stmt inserted), do it all together.

Jakub, this might be something good to have it in separate pass.
Right now I need to avoid some type-hoisting in forward-propagation,
as otherwise it would loop endless caused by type-sinking code in
forward-propagation.  Only question would be where such pass would be
best placed.  After or before forward-propagation pass?

> For PROMOTE_MODE targets you'd promote but properly mask out
> constants (to make them cheaper to generate, for example).  You'd
> also take advantate of targets that can do zero/sign-extending loads
> without extra cost (ISTR that's quite important for some SPEC 2k6
> benchmark on x86_64).

Hmm, as we are talking about truncation-casts here, what is the reaons
for PROMOTE_MODE here?
You mean to generate for a PROMOTE_MODE target explicit something like:
D1 = 0x80
D2 = (int) D1

instead of having D1 = 0xff80.  Isn't that a decision done on RTL level?

Another interesting type-hoisting part is to check if a type-sign
change might be profitable.

Eg:
signed char D0, D1, D5;
int D2,D3,D4
...
D2 = (int) D0
D3 = (int) D1
D4 = D2 + D3
D5 = (signed char) D4
D6 = D5 == 0x8f

to

signed char D0, D1, D5;
unsigned char D2,D3,D4
...
D2 = (unsigned char) D0
D3 = (unsigned char) D1
D4 = D2 + D3
D5 = D4 == 0x7fu

> Richard.
>
>>        Jakub

Regards,
Kai


Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Richard Guenther
On Thu, Mar 15, 2012 at 3:00 PM, Jakub Jelinek  wrote:
> On Thu, Mar 15, 2012 at 02:53:10PM +0100, Kai Tietz wrote:
>> > This looks like to match unbound pattern sizes and thus does not fit
>> > into the forwprop machinery.  Instead it was suggested elsewhere
>> > that promoting / demoting registers should be done in a separate pass
>> > where you can compute a lattice of used bits and apply a transform
>> > based on that lattice and target information (according to PROMOTE_MODE
>> > for example).
>>
>> Well, the integer truncation part might be something for a separate
>> pass.  It could then also take care that within single-use
>> gimple-statements the integral-constant is always on right-hand-side
>> of first statement of an +, -, |, ^, &, and mul.
>>
>> But the cast-hoisting code itself is not unbound AFAICS and has fixed
>> pattern size.

Can you split that part out then please?

> The type demotion is PR45397/PR47477 among other PRs.
> I'd just walk from the narrowing integer conversion stmts recursively
> through the def stmts, see if they can be narrowed, note it, and finally if
> everything or significant portion of the stmts can be demoted (if not all,
> with some narrowing integer conversion stmt inserted), do it all together.

For PROMOTE_MODE targets you'd promote but properly mask out
constants (to make them cheaper to generate, for example).  You'd
also take advantate of targets that can do zero/sign-extending loads
without extra cost (ISTR that's quite important for some SPEC 2k6
benchmark on x86_64).

Richard.

>        Jakub


Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Jakub Jelinek
On Thu, Mar 15, 2012 at 02:53:10PM +0100, Kai Tietz wrote:
> > This looks like to match unbound pattern sizes and thus does not fit
> > into the forwprop machinery.  Instead it was suggested elsewhere
> > that promoting / demoting registers should be done in a separate pass
> > where you can compute a lattice of used bits and apply a transform
> > based on that lattice and target information (according to PROMOTE_MODE
> > for example).
> 
> Well, the integer truncation part might be something for a separate
> pass.  It could then also take care that within single-use
> gimple-statements the integral-constant is always on right-hand-side
> of first statement of an +, -, |, ^, &, and mul.
> 
> But the cast-hoisting code itself is not unbound AFAICS and has fixed
> pattern size.

The type demotion is PR45397/PR47477 among other PRs.
I'd just walk from the narrowing integer conversion stmts recursively
through the def stmts, see if they can be narrowed, note it, and finally if
everything or significant portion of the stmts can be demoted (if not all,
with some narrowing integer conversion stmt inserted), do it all together.

Jakub


Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Kai Tietz
2012/3/15 Richard Guenther :
> On Thu, Mar 15, 2012 at 2:08 PM, Kai Tietz  wrote
>> Hi,
>>
>> The solution for this PR is a mix out of different issues.  First is
>> of course the type-hoisting, but also
>> it shows some lacks in simplifications on integer-values, and on equal
>> and none-equal
>> comparisons.
>> The first patch adds to forward-propagation the ability to do type-hoisting
>> for some conversion operations and do simplification for inner binary
>> operations on it.
>> Most important part is here the adjustment of constant integer-values
>> in statement-lists
>> for a truncation.
>> I limited that patch to handle in compare_equal_optimize_1 only
>> bitwise-and operations
>> inner a truncation-cast.  Of course for bitwise-xor/or operations some
>> more simplifications
>> are possible.
>> This patch just does the type-hoisting part.  In a second patch I add
>> to compare_equal_optimize_1
>> the ability for further required simplifications for fixing this problem.
>
> This looks like to match unbound pattern sizes and thus does not fit
> into the forwprop machinery.  Instead it was suggested elsewhere
> that promoting / demoting registers should be done in a separate pass
> where you can compute a lattice of used bits and apply a transform
> based on that lattice and target information (according to PROMOTE_MODE
> for example).
>
> Richard.

Well, the integer truncation part might be something for a separate
pass.  It could then also take care that within single-use
gimple-statements the integral-constant is always on right-hand-side
of first statement of an +, -, |, ^, &, and mul.

But the cast-hoisting code itself is not unbound AFAICS and has fixed
pattern size.

Regards,
Kai


Re: [patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Richard Guenther
On Thu, Mar 15, 2012 at 2:08 PM, Kai Tietz  wrote
> Hi,
>
> The solution for this PR is a mix out of different issues.  First is
> of course the type-hoisting, but also
> it shows some lacks in simplifications on integer-values, and on equal
> and none-equal
> comparisons.
> The first patch adds to forward-propagation the ability to do type-hoisting
> for some conversion operations and do simplification for inner binary
> operations on it.
> Most important part is here the adjustment of constant integer-values
> in statement-lists
> for a truncation.
> I limited that patch to handle in compare_equal_optimize_1 only
> bitwise-and operations
> inner a truncation-cast.  Of course for bitwise-xor/or operations some
> more simplifications
> are possible.
> This patch just does the type-hoisting part.  In a second patch I add
> to compare_equal_optimize_1
> the ability for further required simplifications for fixing this problem.

This looks like to match unbound pattern sizes and thus does not fit
into the forwprop machinery.  Instead it was suggested elsewhere
that promoting / demoting registers should be done in a separate pass
where you can compute a lattice of used bits and apply a transform
based on that lattice and target information (according to PROMOTE_MODE
for example).

Richard.

> ChangeLog
>
> 2012-03-15  Kai Tietz  
>
>        PR tree-optimization/45397
>        * tree-ssa-forwprop.c (compare_equal_optimize_1): New
>        function.
>        (combine_cond_expr_cond): Use compare_equal_optimize_1
>        function.
>        (truncate_integers): New function.
>        (combine_inner_conversion): Likewise.
>        (ssa_forward_propagate_and_combine): Use for conversions
>        combine_inner_conversion function.
>
> 2012-03-15  Kai Tietz  
>
>        * gcc.dg/tree-ssa/pr45397-1.c: New testcase.
>
> Regression tested for all languages (including Ada and Obj-C) on
> x86_64-unknown-linux-gnu.  Ok for apply?
>
> Regards,
> Kai
>
> Index: gcc-trunk/gcc/tree-ssa-forwprop.c
> ===
> --- gcc-trunk.orig/gcc/tree-ssa-forwprop.c
> +++ gcc-trunk/gcc/tree-ssa-forwprop.c
> @@ -362,6 +362,150 @@ rhs_to_tree (tree type, gimple stmt)
>     gcc_unreachable ();
>  }
>
> +/* This function does simplifications of comparison OP0 ==/!= OP1
> while integral
> +   typed operands
> +   On success new statement's TREE is returned, otherwise NULL_TREE.  */
> +
> +static tree
> +compare_equal_optimize_1 (gimple stmt, enum tree_code code, tree type,
> +                         tree op0, tree op1)
> +{
> +  gimple_stmt_iterator gsi;
> +  tree type_outer;
> +  tree type_inner, op_inner;
> +  tree op1_l, op1_r, tem;
> +  gimple newop;
> +
> +  type_outer = TREE_TYPE (op0);
> +  if ((code != EQ_EXPR && code != NE_EXPR)
> +      || !INTEGRAL_TYPE_P (type_outer))
> +    return NULL_TREE;
> +
> +  /* If OP0 isn't a conversion, then check if OP1 might be one.  If so
> +     swap arguments, otherwise return NULL_TREE.  */
> +  if (!CONVERT_EXPR_P (op0))
> +    {
> +      if (!CONVERT_EXPR_P (op1))
> +        return NULL_TREE;
> +      tem = op0;
> +      op0 = op1;
> +      op1 = tem;
> +    }
> +
> +  op_inner = TREE_OPERAND (op0, 0);
> +  type_inner = TREE_TYPE (op_inner);
> +
> +  /* Operations only apply to integral typed operands of cast.  */
> +  if (!INTEGRAL_TYPE_P (type_inner))
> +    return NULL_TREE;
> +
> +  /* If second operand is also a type-conversion, check that underlying 
> operand
> +     is integral typed.  */
> +  if (CONVERT_EXPR_P (op1)
> +      && !INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op1, 0
> +    return NULL_TREE;
> +
> +  /* Simplify ((type) X ==/!= (type) X) to true/false, if X has no 
> side-effects
> +     and is integral typed.  */
> +  if (CONVERT_EXPR_P (op1)
> +      && TREE_OPERAND (op0, 0) == TREE_OPERAND (op1, 0)
> +      && !TREE_SIDE_EFFECTS (TREE_OPERAND (op0, 0)))
> +    return fold_convert (type, (code == EQ_EXPR ? boolean_true_node
> +                                               : boolean_false_node));
> +
> +  /* Simplify (type) X ==/!= (type) Y to X ==/!= Y, if types of X and Y are
> +     compatible and type-precision of X is smaller or equal to TYPE's.  */
> +  if (CONVERT_EXPR_P (op1)
> +      && TYPE_PRECISION (type_inner) <= TYPE_PRECISION (type_outer))
> +    {
> +      tem = TREE_OPERAND (op1, 0);
> +      if (!useless_type_conversion_p (type_inner, TREE_TYPE (tem)))
> +       return NULL_TREE;
> +      return fold_build2_loc (gimple_location (stmt), code, type,
> +                             op_inner, tem);
> +    }
> +
> +  /* Verify that for pattern 'OP0 = (type) X' the type of X is of
> integral kind,
> +     is unsigned, and has smaller or equal precision to type TYPE.  */
> +  if (TYPE_PRECISION (type_inner) > TYPE_PRECISION (type_outer)
> +      || !TYPE_UNSIGNED (type_inner))
> +    return NULL_TREE;
> +
> +  /* Simplify (type) X ==/!= CST to X ==/!= CST' with CST'=(type-of-X) CST.  
> */
> +  if (TREE_

[patch tree-optimization]: Fix for PR 45397 part 1 of 2

2012-03-15 Thread Kai Tietz
Hi,

The solution for this PR is a mix out of different issues.  First is
of course the type-hoisting, but also
it shows some lacks in simplifications on integer-values, and on equal
and none-equal
comparisons.
The first patch adds to forward-propagation the ability to do type-hoisting
for some conversion operations and do simplification for inner binary
operations on it.
Most important part is here the adjustment of constant integer-values
in statement-lists
for a truncation.
I limited that patch to handle in compare_equal_optimize_1 only
bitwise-and operations
inner a truncation-cast.  Of course for bitwise-xor/or operations some
more simplifications
are possible.
This patch just does the type-hoisting part.  In a second patch I add
to compare_equal_optimize_1
the ability for further required simplifications for fixing this problem.

ChangeLog

2012-03-15  Kai Tietz  

PR tree-optimization/45397
* tree-ssa-forwprop.c (compare_equal_optimize_1): New
function.
(combine_cond_expr_cond): Use compare_equal_optimize_1
function.
(truncate_integers): New function.
(combine_inner_conversion): Likewise.
(ssa_forward_propagate_and_combine): Use for conversions
combine_inner_conversion function.

2012-03-15  Kai Tietz  

* gcc.dg/tree-ssa/pr45397-1.c: New testcase.

Regression tested for all languages (including Ada and Obj-C) on
x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: gcc-trunk/gcc/tree-ssa-forwprop.c
===
--- gcc-trunk.orig/gcc/tree-ssa-forwprop.c
+++ gcc-trunk/gcc/tree-ssa-forwprop.c
@@ -362,6 +362,150 @@ rhs_to_tree (tree type, gimple stmt)
 gcc_unreachable ();
 }

+/* This function does simplifications of comparison OP0 ==/!= OP1
while integral
+   typed operands
+   On success new statement's TREE is returned, otherwise NULL_TREE.  */
+
+static tree
+compare_equal_optimize_1 (gimple stmt, enum tree_code code, tree type,
+ tree op0, tree op1)
+{
+  gimple_stmt_iterator gsi;
+  tree type_outer;
+  tree type_inner, op_inner;
+  tree op1_l, op1_r, tem;
+  gimple newop;
+
+  type_outer = TREE_TYPE (op0);
+  if ((code != EQ_EXPR && code != NE_EXPR)
+  || !INTEGRAL_TYPE_P (type_outer))
+return NULL_TREE;
+
+  /* If OP0 isn't a conversion, then check if OP1 might be one.  If so
+ swap arguments, otherwise return NULL_TREE.  */
+  if (!CONVERT_EXPR_P (op0))
+{
+  if (!CONVERT_EXPR_P (op1))
+return NULL_TREE;
+  tem = op0;
+  op0 = op1;
+  op1 = tem;
+}
+
+  op_inner = TREE_OPERAND (op0, 0);
+  type_inner = TREE_TYPE (op_inner);
+
+  /* Operations only apply to integral typed operands of cast.  */
+  if (!INTEGRAL_TYPE_P (type_inner))
+return NULL_TREE;
+
+  /* If second operand is also a type-conversion, check that underlying operand
+ is integral typed.  */
+  if (CONVERT_EXPR_P (op1)
+  && !INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op1, 0
+return NULL_TREE;
+
+  /* Simplify ((type) X ==/!= (type) X) to true/false, if X has no side-effects
+ and is integral typed.  */
+  if (CONVERT_EXPR_P (op1)
+  && TREE_OPERAND (op0, 0) == TREE_OPERAND (op1, 0)
+  && !TREE_SIDE_EFFECTS (TREE_OPERAND (op0, 0)))
+return fold_convert (type, (code == EQ_EXPR ? boolean_true_node
+   : boolean_false_node));
+
+  /* Simplify (type) X ==/!= (type) Y to X ==/!= Y, if types of X and Y are
+ compatible and type-precision of X is smaller or equal to TYPE's.  */
+  if (CONVERT_EXPR_P (op1)
+  && TYPE_PRECISION (type_inner) <= TYPE_PRECISION (type_outer))
+{
+  tem = TREE_OPERAND (op1, 0);
+  if (!useless_type_conversion_p (type_inner, TREE_TYPE (tem)))
+   return NULL_TREE;
+  return fold_build2_loc (gimple_location (stmt), code, type,
+ op_inner, tem);
+}
+
+  /* Verify that for pattern 'OP0 = (type) X' the type of X is of
integral kind,
+ is unsigned, and has smaller or equal precision to type TYPE.  */
+  if (TYPE_PRECISION (type_inner) > TYPE_PRECISION (type_outer)
+  || !TYPE_UNSIGNED (type_inner))
+return NULL_TREE;
+
+  /* Simplify (type) X ==/!= CST to X ==/!= CST' with CST'=(type-of-X) CST.  */
+  if (TREE_CODE (op1) == INTEGER_CST)
+{
+  tree new_cst = fold_convert (type_inner, op1);
+  /* If constant is out of range for (type) X, then return
+ constant result of comparison.  */
+  if (!operand_equal_p (op1, fold_convert (type_outer, new_cst), 0))
+   return fold_convert (type, (code == EQ_EXPR ? boolean_false_node
+   : boolean_true_node));
+  return fold_build2_loc (gimple_location (stmt), code, type,
op_inner, new_cst);
+}
+
+  /* Simplify (type) X ==/!= Y & CST to X ==/!= (type-of-X) (Y & CST).  If CST
+ is equal to ~(type-of-X)0, then simplify to X ==/!= (type-