This patch reduces the latency of float -> integer -> float conversion sequences by eliminating two FP <-> integer register transfers.
An example:
float f1(float x)
{
  int y = x;
  return (float)y;
}
Trunk generates:
f1:
	fcvtzs	w0, s0
	scvtf	s0, w0
	ret
With the patch we can use the NEON scalar instructions and eliminate the two FP
<-> integer register transfers:
f1:
	fcvtzs	s0, s0
	scvtf	s0, s0
	ret
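For anyone who wants a quick sanity check of the semantics, here is a minimal sketch (not part of the patch or its testcase). It assumes that only the register file holding the intermediate integer changes, so the values returned for in-range inputs should be the same before and after. The names roundtrip_f32 and self_check are made up for illustration.

```c
#include <assert.h>

/* Same shape as f1 above: float -> int -> float.
   The conversion to int truncates toward zero; the result is
   undefined in C if the value does not fit in an int.  */
float roundtrip_f32(float x)
{
  int y = x;
  return (float)y;
}

/* Hypothetical self-check: a few in-range inputs whose truncated
   values are exactly representable as float.  */
void self_check(void)
{
  assert(roundtrip_f32(2.75f) == 2.0f);
  assert(roundtrip_f32(-2.75f) == -2.0f);
  assert(roundtrip_f32(0.5f) == 0.0f);
  assert(roundtrip_f32(123456.5f) == 123456.0f);
}
```

Building this at -O2 with and without the patch and calling self_check() from a small driver should pass in both cases; only the generated instruction sequence is expected to differ.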
Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
2017-09-02 Michael Collison <[email protected]>
* config/aarch64/aarch64.md (<optab>_trunc<vf><GPI:mode>2):
New pattern.
(<optab>_trunchf<GPI:mode>2): New pattern.
(<optab>_trunc<vgp><GPI:mode>2): New pattern.
* config/aarch64/iterators.md (wv): New mode attribute.
(vf, VF): New mode attributes.
(vgp, VGP): New mode attributes.
(s): Update attribute with SImode and DImode prefixes.
* testsuite/gcc.target/aarch64/fix_trunc1.c: New testcase.
pr6527.patch
