Hi Mike,

On Wed, Nov 08, 2017 at 08:14:31AM -0500, Michael Meissner wrote:
> PowerPC ISA 3.0 does not have a byte-reverse instruction that operates on the
> GPRs, but it does have vector byte swap half-word, word, double-word 
> operations
> in the VSX registers.  The enclosed patch enables generation of the byte
> revseral instructions for register-register operations.  It still prefers to
> generate the load with byte reverse (L{H,W,D}BRX) or store with byte reverse
> (ST{H,W,D}BRX) instructions over the register sequence.
> 
> For 16-bit and 32-bit byte swaps, it typically does the tradational operation
> in GPR registers, but it will generate XXBR{H,W} if the values are in vector
> registers.

You can do the byteswaps with just VMX, too, with some rotate instructions
(vrlh 8 for HI, vrlh 8 followed by vrlw 16 for SI, etc.)  For a future
date, I suppose.

> For 64-bit swaps, it no longer generates the 9 instruction sequence in favor 
> of
> XXBRD.  I did some timing runs on a prototype power9 system, and it was
> slightly faster to do direct move to the vecter unit, XXBRD, and direct move
> back to a GPR than the traditional sequence.
> 
> I did bootstraps on little endian Power8 and Power9 systems (with the default
> cpu set to power8 and power9 respectively).  There were no regressions.  Can I
> check this patch into the trunk?
> 
> [gcc]
> 2017-11-08  Michael Meissner  <meiss...@linux.vnet.ibm.com>
> 
>       * config/rs6000/rs6000.md (bswaphi2_reg): On ISA 3.0 systems,
>       enable generating XXBR{H,W} if the value is in a vector
>       register.

Join the last two lines?  And you only mean XXBRH here.

>       (bswapsi2_reg): Likewise.

And XXBRW here.

>       (bswapdi2_reg): On ISA 3.0 systems, use XXBRD to do bswap64
>       instead of doing the GPR sequence used on previoius machines.

Typo ("previous").

>       (bswapdi2_xxbrd): Likewise.

Not "Likewise"; just "New define_insn."

Should this somehow be joined with p9_xxbrd_<mode>?  Or maybe you should
generate that, instead.

>       (bswapdi2_reg splitters): Use int_reg_operand instead of
>       gpc_reg_operand to not match when XXBRD is generated.

Please see if you can merge the define_splits to their corresponding
define_insns (making define_insn_and_split).  Shorter, no mismatch
possible between the two anymore, and you get a name for the patterns
too :-)

> @@ -2507,6 +2513,8 @@ (define_expand "bswapdi2"
>       emit_insn (gen_bswapdi2_load (dest, src));
>        else if (MEM_P (dest))
>       emit_insn (gen_bswapdi2_store (dest, src));
> +      else if (TARGET_P9_VECTOR)
> +     emit_insn (gen_bswapdi2_xxbrd (dest, src));
>        else
>       emit_insn (gen_bswapdi2_reg (dest, src));
>        DONE;

Pity that you cannot easily do similar for HI and SI.

> @@ -2544,7 +2559,8 @@ (define_insn "bswapdi2_reg"
>     (clobber (match_scratch:DI 3 "=&r"))]
>    "TARGET_POWERPC64 && TARGET_LDBRX"
>    "#"
> -  [(set_attr "length" "36")])
> +  [(set_attr "length" "36")
> +   (set_attr "type" "*")])

Explicitly setting "type" (or any other attr) to the default value is
pretty useless; just leave it out.


Segher

Reply via email to