https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90106

--- Comment #9 from JunMa <JunMa at linux dot alibaba.com> ---
(In reply to JunMa from comment #7)
> yes, the transformation in CDEC prevent the tail call optimization. let's
> check the return stmt in CDEC pass.

Sorry for the confusing comment.

As discussed above, the cdce pass looks for calls to built-in functions
that set errno and whose result is used. It tries to transform these calls into
conditionally executed calls guarded by a simple range check on the arguments.
The check identifies most cases in which errno does not need to be set, so the
library call runs only when it might set errno. The transform looks like:

        y = sqrt (x);
     ==>
        y = IFN_SQRT (x);
        if (__builtin_isless (x, 0))
            sqrt (x);

However, when the call is in tail position, this transformation breaks tail call
optimization, since the conditional call does not produce the return value. This
is what this PR tries to explain and fix.

Alexander gives two suggestions:
first:
        y = IFN_SQRT (x);
        if (__builtin_isless (x, 0))
            y = sqrt (x);

second (LLVM's approach):

        if (__builtin_isless (x, 0))
            y = sqrt (x);
        else
            y = IFN_SQRT (x);


So what I want to do here is look for calls in tail position and transform
them as in the first suggestion.

I did some hacks locally, but then I found that gcc generated even worse code
for the 'y = IFN_SQRT' part:

f:
  pxor  %xmm1, %xmm1
  movaps %xmm0, %xmm2
  ucomiss %xmm0, %xmm1
  sqrtss %xmm2, %xmm2
  ja   .L4
  movaps %xmm2, %xmm0
  ret
.L4:
  jmp  sqrtf

Then I used LLVM's approach regardless of whether the call is in tail position,
and it gives:

f:
  pxor  %xmm1, %xmm1
  ucomiss %xmm0, %xmm1
  ja   .L4
  sqrtss %xmm0, %xmm0
  ret
.L4:
  jmp  sqrtf 

Also, in comment 6, I did some tests of LLVM's approach.

Sorry again for the confusing comment.
