Hi!

If the shift count has enough known zero low bits (i.e. its non-zero bits are only above ceil_log2 (precision)), then the only shift count that is not out of bounds is 0, so we might as well fold the shift into its first argument.  This resolves a regression introduced by partly optimizing the shift count at the GIMPLE level, which then prevents the RTL optimizers, which previously handled this completely, from finishing the job.
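To illustrate the reasoning for a 32-bit int (a minimal standalone sketch, not part of the patch): the precision is 32 and ceil_log2 (32) == 5, so a shift count whose low 5 bits are known to be zero is either 0 or a non-zero multiple of 32, and every non-zero multiple of 32 is out of bounds.  A count like k << 8 has its low 8 bits zero, which implies the low 5 bits are zero as well:

/* Standalone illustration, not part of the patch: for a 32-bit int,
   a shift count of the form k << 8 has its low 8 bits zero, so it is
   either 0 or >= 256, and only 0 is a valid (in-bounds) count.  */
#include <assert.h>

int
main (void)
{
  for (int k = 0; k < 4; k++)
    {
      int count = k << 8;                 /* low 8 bits known to be zero */
      assert (count == 0 || count >= 32); /* 0 is the only in-bounds value */
    }
  return 0;
}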
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  <ja...@redhat.com>

	PR tree-optimization/71563
	* match.pd: Simplify X << Y into X if Y is known to be 0 or
	an out of range value, i.e. has its low bits known to be zero.

	* gcc.dg/tree-ssa/pr71563.c: New test.

--- gcc/match.pd.jj	2016-12-10 13:05:39.000000000 +0100
+++ gcc/match.pd	2016-12-20 15:44:30.892704283 +0100
@@ -1497,6 +1497,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
     (if (tem)
      (shiftrotate @0 { tem; }))))))
 
+/* Simplify X << Y where Y's low width bits are 0 to X, as the only
+   valid value for Y is then 0.  Similarly for X >> Y.  */
+#if GIMPLE
+(for shift (lshift rshift)
+ (simplify
+  (shift @0 @1)
+  (if (TREE_CODE (@1) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
+   (with {
+     int width = ceil_log2 (element_precision (TREE_TYPE (@0)));
+     int prec = TYPE_PRECISION (TREE_TYPE (@1));
+    }
+    (if ((get_nonzero_bits (@1) & wi::mask (width, false, prec)) == 0)
+     @0)))))
+#endif
+
 /* Rewrite an LROTATE_EXPR by a constant into an RROTATE_EXPR
    by a new constant.  */
 (simplify
--- gcc/testsuite/gcc.dg/tree-ssa/pr71563.c.jj	2016-12-20 15:57:16.624722177 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr71563.c	2016-12-20 15:57:01.000000000 +0100
@@ -0,0 +1,23 @@
+/* PR tree-optimization/71563 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void link_error (void);
+
+void
+foo (int k)
+{
+  int t = 1 << ((1 / k) << 8);
+  if (t != 1)
+    link_error ();
+}
+
+void
+bar (int k, int l)
+{
+  int t = l << (k << 8);
+  if (t != l)
+    link_error ();
+}
+
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */

	Jakub