This patch improves the latency of code that converts a floating-point value
to an integer and back by eliminating two FP <-> integer register transfers.

An example:

float f1(float x)
{
  int y = x;
  return (float)y;
}

Trunk generates:

f1:
        fcvtzs  w0, s0
        scvtf   s0, w0
        ret

With the patch we can use the NEON scalar instructions and eliminate the two
FP <-> integer register transfers:

f1:
        fcvtzs  s0, s0
        scvtf   s0, s0
        ret
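
For reference, here is a minimal sketch of what the round trip computes. f1 is
the example above; f2 is a hypothetical double / long long variant that is not
taken from the patch or its testcase. I would expect it to exercise the 64-bit
side of the new patterns, but that is an assumption. In both cases the
conversion to integer truncates toward zero, which is what fcvtzs implements.

```c
/* Sketch only: f1 is the example from this mail; f2 is a hypothetical
   double / long long variant, not taken from the patch.  */

float f1(float x)
{
  int y = x;           /* float -> int, truncates toward zero */
  return (float)y;     /* int -> float */
}

double f2(double x)
{
  long long y = x;     /* double -> 64-bit int, truncates toward zero */
  return (double)y;    /* 64-bit int -> double */
}
```

Compiling this at -O2 with and without the patch and comparing the assembly
should show whether the value stays in the FP/SIMD register file.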

Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

2017-09-02  Michael Collison  <michael.colli...@arm.com>

        * config/aarch64/aarch64.md (<optab>_trunc<vf><GPI:mode>2):
        New pattern.
        (<optab>_trunchf<GPI:mode>2): New pattern.
        (<optab>_trunc<vgp><GPI:mode>2): New pattern.
        * config/aarch64/iterators.md (wv): New mode attribute.
        (vf, VF): New mode attributes.
        (vgp, VGP): New mode attributes.
        (s): Update attribute with SImode and DImode prefixes.
        * testsuite/gcc.target/aarch64/fix_trunc1.c: New testcase.

Attachment: pr6527.patch
