>> The new version does not seem better, as it adds a branch on the path
>> and it is not smaller.
>
> That looks like bb-reorder isn't doing its job?  Maybe it thinks that
> pop is too expensive to copy?

It relies on static branch probabilities, which are set completely wrong in GCC,
so in many functions it ends up optimizing the hot path for size rather than
speed, and vice versa. A simple example I tried on AArch64:

void g(void);
int a;
int f(void)
{
  g();
  if (a == 0)  // or a != 0, a < 0, a < 0x7ffffffe
    return -1;
  a = 1;
  return 1;
}

The funny thing is that a == 0 and a != 0 behave in exactly the same way, but
a < 0 and a >= 0 are different. However, a < C and a > C are always seen as
unlikely no matter the immediate, except for a > 0, which inlines the return...
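To compare the variants side by side, something like the following sketch may be handy. It stamps out one copy of the function above per condition, so a single "gcc -O2 -S" run shows the block layout for each. The DEFINE_F macro, the f_* names, and the local stub for g() are my own additions for illustration (the original test keeps g() external), so the generated code may differ slightly from the standalone example.

```c
/* Stand-in for the external g(): noinline plus a memory clobber so the
   compiler must assume 'a' can change across the call (assumption, the
   original example leaves g() undefined in this translation unit). */
__attribute__((noinline)) void g(void)
{
  __asm__ volatile ("" ::: "memory");
}

int a;

/* Hypothetical helper: defines one copy of f() per condition. */
#define DEFINE_F(name, cond) \
  int name(void)             \
  {                          \
    g();                     \
    if (cond)                \
      return -1;             \
    a = 1;                   \
    return 1;                \
  }

DEFINE_F(f_eq, a == 0)
DEFINE_F(f_ne, a != 0)
DEFINE_F(f_lt, a < 0)
DEFINE_F(f_ge, a >= 0)
DEFINE_F(f_gt, a > 0)
DEFINE_F(f_lt_c, a < 0x7ffffffe)
```

Look at which of the two return sequences ends up on the fall-through path in each function.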

I also noticed that GCC ignores the explicit __builtin_expect used in the
string/str(c)spn.c implementations in GLIBC (which you need to avoid incorrect
block ordering): it not only inlines returns in the cold path but also fails to
inline them in the hot path...
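For reference, the kind of annotation I mean looks roughly like this. It is a minimal sketch, not the GLIBC code: span_of_char and the LIKELY/UNLIKELY macros are made-up names, and the real str(c)spn implementations use a lookup table rather than a single character. Only __builtin_expect itself is the actual GCC builtin.

```c
#include <stddef.h>

/* Hypothetical convenience macros around the GCC builtin. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Length of the initial run of character c in s (illustration only). */
size_t span_of_char(const char *s, char c)
{
  const char *p = s;

  /* Cold path: empty input, hinted as unlikely. */
  if (UNLIKELY(*p == '\0'))
    return 0;

  /* Hot path: the loop is hinted to continue. */
  while (LIKELY(*p == c))
    p++;

  return (size_t)(p - s);
}
```

With hints like these you would expect the early return to be laid out off the fall-through path, so the generated assembly is worth checking against that.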

Wilco

