https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68695

--- Comment #22 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Going back to variants of the original testcase:
int
foo (int x, int y, int a)
{
  int i = x;
  int j = y;
#ifdef EX1
  if (__builtin_expect (x > y, 1))
#elif defined EX0
  if (__builtin_expect (x > y, 0))
#else
  if (x > y)
#endif
    {
      i = a;
      j = i;
    }
  return i * j;
}

at least for the -DEX1 case I'd think it is reasonable to assign one of i or j
to the register holding argument a (%r4), because that will need one fewer
register move in the common case.  But there is one further constraint.  The
multiplication wants its result to live in %r2, because then it can avoid a
move, and the multiplication is a commutative two-operand instruction, so one
of the operands needs to match the output.  Thus the reasonable disposition
choices are either i in %r2 and j in %r3 (this is especially desirable if the
if is unlikely, i.e. the -DEX0 case), or perhaps i in %r2 and j in %r4 (while
this will need if/then/else rather than if/then only, it would use one fewer
move in the expected likely block).
The problem is that IRA chooses i in %r4 and j in %r3, so there are 2 moves
even in the fast path: the i = a assignment is a nop, but the j = i move is
needed, and later on an extra reload move is added, because the disposition of
pseudo 67 (the result of mulsi3) is properly %r2, and if one argument is in
%r4 and the other in %r3, one of them has to be copied to %r2.

Vlad, can you please have a look?
