On 29/07/14 15:49, Jiong Wang wrote:
> this patch optimize copysign/copysignf for -mfloat-abi=soft on arm when BFI
> instruction is available.
>
> before this patch, we do copysign (a, b) by three steps:
>   * fetch the sign bit of b to A
>   * fetch all non-sign bits of a to B
>   * or A and B
>
> while these three steps could be finished by one single BFI instruction.
>
> for example, for the following simple testcase:
>
> 1.c:
>
> float
> fcopysign (float a, float b)
> {
>   return __builtin_copysignf (a, b);
> }
>
> before this patch
> =================
> fcopysign:
>         and     r1, r1, #-2147483648
>         bic     r0, r0, #-2147483648
>         orr     r0, r0, r1
>         bx      lr
>
> after this patch
> ===============
> fcopysign:
>         bfi     r1, r0, #0, #31
>         mov     r0, r1
>         bx      lr
>
> at least one instruction could be saved.
>
> test done
> ===
> no regression on the full toolchain test on arm-none-eabi.
>
>
> ok to install?
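
For reference, the SF-mode equivalence described in the quoted message can be sketched in plain C. This is only a bit-level model, not code from the patch: the helper names (`float_bits`, `bits_float`, `copysignf_three_step`, `copysignf_bfi`) are made up for illustration, and the register comments map loosely onto the assembly quoted above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a float as its 32-bit pattern and back. */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

static float
bits_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Model of the three-step sequence: and / bic / orr. */
static float
copysignf_three_step (float a, float b)
{
  uint32_t sign = float_bits (b) & 0x80000000u;   /* and r1, r1, #0x80000000 */
  uint32_t mag = float_bits (a) & ~0x80000000u;   /* bic r0, r0, #0x80000000 */
  return bits_float (mag | sign);                 /* orr r0, r0, r1 */
}

/* Model of "bfi r1, r0, #0, #31": bits 0..30 of r0 (a) are inserted
   into r1 (b), so bit 31 of r1, the sign of b, is left in place. */
static float
copysignf_bfi (float a, float b)
{
  uint32_t r0 = float_bits (a);
  uint32_t r1 = float_bits (b);
  r1 = (r1 & ~0x7fffffffu) | (r0 & 0x7fffffffu);
  return bits_float (r1);
}
```

Both functions produce the same bit pattern for any pair of single-precision inputs, since a 32-bit value has exactly one word to patch.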

Hmm, I think this is wrong for DF mode.  The patch works by tying the
output to the value containing the sign bit and then copying the rest of
the other value into it.  However, for DF mode it only copies 31 of the
63 bits needed; the least significant 32 bits of the mantissa are not
copied over.

R.

>
> thanks.
>
> 2014-07-29  Jiong Wang  <jiong.w...@arm.com>
> 2014-07-29  Renlin Li  <renlin...@arm.com>
>
> gcc/
>   * config/arm/iterators.md (SFDF): New mode iterator.
>   (fp_high): New mode attribute.
>   * config/arm/unspecs.md (UNSPEC_BFM): New UNSPEC.
>   * config/arm/arm.md (copysign<mode>3): New define_insn for copysign.
>
> gcc/testsuite/
>   * gcc.target/arm/copysign.c: New case for copysign/copysignf soft-float
>   opt.
>
>
> opt-copysign.patch
>
>
> commit 3b29b02cb4b525302179dde9e32528354040a2cb
> Author: Jiong Wang <jiong.w...@arm.com>
> Date:   Tue Jul 15 17:09:21 2014 +0100
>
>     [ARM] Optimize copysign/copysignf.
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 6ae240e..4244043 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -11106,6 +11106,19 @@
>     [(set_attr "predicable" "yes")]
>  )
>
> +(define_insn "copysign<mode>3"
> +  [(set (match_operand:SFDF 0 "register_operand" "=r")
> +        (unspec:SFDF [(match_operand:SFDF 1 "register_operand" "r")
> +                      (match_operand:SFDF 2 "register_operand" "0")]
> +         UNSPEC_BFM))]
> +  "TARGET_SOFT_FLOAT && arm_arch_thumb2"
> +  "bfi\t%<fp_high>0, %<fp_high>1, #0, #31"
> +  [(set_attr "length" "4")
> +   (set_attr "predicable" "yes")
> +   (set_attr "predicable_short_it" "no")
> +   (set_attr "type" "bfm")]
> +)
> +
>  ;; Vector bits common to IWMMXT and Neon
>  (include "vec-common.md")
>  ;; Load the Intel Wireless Multimedia Extension patterns
> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> index 6fe6eef..d9de55a 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -42,6 +42,9 @@
>  ;; A list of the 32bit and 64bit integer modes
>  (define_mode_iterator SIDI [SI DI])
>
> +;; A list of the 32bit and 64bit float modes
> +(define_mode_iterator SFDF [SF DF])
> +
>  ;; A list of modes which the VFP unit can handle
>  (define_mode_iterator SDF [(SF "TARGET_VFP") (DF "TARGET_VFP_DOUBLE")])
>
> @@ -497,6 +500,8 @@
>                   (DI "") (V2DI "_q")
>                   (DF "") (V2DF "_q")])
>
> +(define_mode_attr fp_high [(SF "") (DF "R")])
> +
>
>  ;;----------------------------------------------------------------------------
>  ;; Code attributes
>
>  ;;----------------------------------------------------------------------------
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index 147cb80..94dc578 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -62,6 +62,7 @@
>                          ; a given symbolic address.
>    UNSPEC_THUMB1_CASESI  ; A Thumb1 compressed dispatch-table call.
>    UNSPEC_RBIT           ; rbit operation.
> +  UNSPEC_BFM            ; bfm operation.
>    UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
>                          ; another symbolic address.
>    UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
> diff --git a/gcc/testsuite/gcc.target/arm/copysign.c
> b/gcc/testsuite/gcc.target/arm/copysign.c
> new file mode 100644
> index 0000000..7ec2068
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/copysign.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "skip override" { *-*-* } { "-mfloat-abi=softfp"
> "-mfloat-abi=hard" } { "" } } */
> +/* { dg-options "-O2 -march=armv7-a -mfloat-abi=soft" } */
> +
> +float
> +fcopysign (float a, float b)
> +{
> +  return __builtin_copysignf (a, b);
> +}
> +
> +double
> +dcopysign (double a, double b)
> +{
> +  return __builtin_copysign (a, b);
> +}
> +
> +/* { dg-final { scan-assembler-times "bfi" 2 } } */
> +/* { dg-final { scan-assembler-not "orr" } } */
>
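
To make the DF-mode concern raised in the reply concrete, here is a small C model of what a single BFI on the high words would compute for a double. It is a sketch under stated assumptions, not code from the patch: the function names are hypothetical, the double is modelled as a high and a low 32-bit word as it would be in a soft-float register pair, and `__builtin_copysign` (a GCC/Clang builtin) serves as the reference.

```c
#include <stdint.h>
#include <string.h>

/* Model of "bfi dst, src, #0, #31": replace bits 0..30 of dst with
   bits 0..30 of src, leaving bit 31 of dst untouched. */
static uint32_t
bfi_0_31 (uint32_t dst, uint32_t src)
{
  return (dst & 0x80000000u) | (src & 0x7fffffffu);
}

/* Hypothetical model of the DF pattern as posted: the output is tied
   to b, and only the high word receives bits from a. */
static double
copysign_df_as_posted (double a, double b)
{
  uint64_t ua, ub, ur;
  double r;
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  uint32_t hi = bfi_0_31 ((uint32_t) (ub >> 32), (uint32_t) (ua >> 32));
  uint32_t lo = (uint32_t) ub;          /* low word still comes from b */
  ur = ((uint64_t) hi << 32) | lo;
  memcpy (&r, &ur, sizeof r);
  return r;
}

/* Same model, but with the low word of a copied over as well
   (an assumed fix, shown only for comparison). */
static double
copysign_df_full (double a, double b)
{
  uint64_t ua, ub, ur;
  double r;
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  uint32_t hi = bfi_0_31 ((uint32_t) (ub >> 32), (uint32_t) (ua >> 32));
  uint32_t lo = (uint32_t) ua;          /* low 32 mantissa bits from a */
  ur = ((uint64_t) hi << 32) | lo;
  memcpy (&r, &ur, sizeof r);
  return r;
}
```

With inputs whose low words happen to match, such as `copysign (1.5, -2.0)`, the two models agree, so a compile-only test that just counts `bfi` instructions would not expose the difference. An input like `1.1`, whose low mantissa word is non-zero, does.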