https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118637
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess the first question is why do we optimize unsigned / 2 into unsigned >>
1 during GIMPLE opts only in VRP (evrp etc.) as can be seen with just compiling
it with -O1, even optimized dump has
x_9 = x_3(D) / 2;
and only expansion turns that into
(insn 13 12 0 (parallel [
(set (reg/v:SI 99 [ x ])
(lshiftrt:SI (reg/v:SI 101 [ x ])
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) "pr118637.c":5:12 -1
(expr_list:REG_EQUAL (udiv:SI (reg/v:SI 101 [ x ])
(const_int 2 [0x2]))
(nil)))
So e.g. with -O1 we don't optimize
void link_error (void);
void
foo (unsigned x)
{
if (x / 2 != x >> 1)
link_error ();
}
until RTL.
The r8-2064-g8d1628eb33d4f5 optimization is sane IMHO; and user could have
written it that way as well.
Consider
unsigned
foo (unsigned x)
{
unsigned result = 0;
while (x /= 2)
++result;
return result;
}
unsigned
bar (unsigned x)
{
unsigned result = 0;
while (x >>= 1)
++result;
return result;
}
unsigned
baz (unsigned x)
{
unsigned result = 0, t;
while ((t = x, x /= 2, t > 1))
++result;
return result;
}
unsigned
qux (unsigned x)
{
unsigned result = 0, t;
while ((t = x, x >>= 1, t > 1))
++result;
return result;
}
unsigned
fred (unsigned x)
{
unsigned result = 0;
while (x > 1)
x /= 2, ++result;
return result;
}
unsigned
corge (unsigned x)
{
unsigned result = 0;
while (x > 1)
++result, x >>= 1;
return result;
}
I think all of these compute the same thing, but we only pattern recognize the
first one and clang only pattern recognizes the first two.