These patches refine the September 13th patches that add XXSPLTIDP
support.  They generate instructions that load a VSX register with
certain constants directly, instead of using PLXV to load the constant.

On the Power10:

   *	XXSPLTIDP is a prefixed instruction that takes a value encoded as
	an SFmode constant, converts it to DFmode, and splats that value
	to the two 64-bit parts of the register.

   *	XXSPLTIW is a prefixed instruction that takes a 32-bit value and
	splats this value into the 4 32-bit parts of the vector register,
	i.e. it can be used to generate V4SImode and V4SFmode vector
	constants where all of the elements are the same.

   *	XXSPLTI32DX is a prefixed instruction that takes a 32-bit value
	and splats this value into either the 2 even 32-bit parts of the
	vector register or the 2 odd 32-bit parts.  Thus 2 XXSPLTI32DX
	instructions can generate a 64-bit constant that cannot be
	generated by XXSPLTIDP.

	Note, in the current set of patches, I do not add support for
	XXSPLTI32DX.  I have done so in previous patches, and I could add
	it if desired.  Because it takes 2 back-to-back prefixed
	instructions that are serially dependent on each other, I don't
	think it is worthwhile to use XXSPLTI32DX.

   *	LXVKQ is a non-prefixed instruction that loads certain 128-bit
	values that match particular IEEE 128-bit constants (-0.0f128,
	1.0f128, 2.0f128, etc.).

There are 5 patches in this set.

One of the takeaways from the last review was that it would be desirable
to generate the instruction if it generates a value that matches the
vector constant, even if the vector type is not the native vector type
for the instruction.  For example, the following code:

	vector unsigned long long
	foo (void)
	{
	#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	  return (vector unsigned long long) { 0, 1ULL << 63 };
	#else
	  return (vector unsigned long long) { 1ULL << 63, 0 };
	#endif
	}

should generate:

	foo:
		lxvkq 34,16
		blr

To that end, I added support to create a data structure that takes a
vector or scalar constant and represents it as a series of bytes,
half-words, words, and double-words.  Then the recognizer functions use
this data structure to decide if a given instruction can be generated.
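
As a rough illustration of the idea (not the code in the patches), the
sketch below stores a 128-bit constant once as bytes, half-words, words,
and double-words, and then answers "could a splat build this?" by
comparing the precomputed fields.  The type const_128bit_sketch, its
field names, the sketch_* functions, and the choice of big-endian byte
order for the input are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the data structure described above.  The
   field names are illustrative and are not taken from the patches.  */
typedef struct {
  uint8_t bytes[16];
  uint16_t half_words[8];
  uint32_t words[4];
  uint64_t double_words[2];
} const_128bit_sketch;

/* Fill every view from 16 bytes given most significant byte first
   (an assumption of this sketch).  */
void
sketch_from_bytes (const uint8_t bytes[16], const_128bit_sketch *out)
{
  assert (bytes != NULL && out != NULL);

  for (size_t i = 0; i < 16; i++)
    out->bytes[i] = bytes[i];

  for (size_t i = 0; i < 8; i++)
    out->half_words[i]
      = (uint16_t) ((bytes[2 * i] << 8) | bytes[2 * i + 1]);

  for (size_t i = 0; i < 4; i++)
    out->words[i] = (((uint32_t) out->half_words[2 * i] << 16)
		     | out->half_words[2 * i + 1]);

  for (size_t i = 0; i < 2; i++)
    out->double_words[i] = (((uint64_t) out->words[2 * i] << 32)
			    | out->words[2 * i + 1]);
}

/* True if all four 32-bit words are identical, which is the shape a
   32-bit word splat produces.  */
bool
sketch_all_words_same (const const_128bit_sketch *c)
{
  return (c->words[0] == c->words[1]
	  && c->words[1] == c->words[2]
	  && c->words[2] == c->words[3]);
}

/* True if both 64-bit double-words are identical, which is the shape a
   64-bit splat produces.  */
bool
sketch_all_double_words_same (const const_128bit_sketch *c)
{
  return c->double_words[0] == c->double_words[1];
}
```

Because the checks look only at the bit pattern, the original element
type does not matter: a V2DImode constant and a V4SFmode constant with
the same 128 bits give the same answers.  The real recognizers have to
check more than this (for example, whether a 64-bit value is exactly
representable as an SFmode constant).
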
This way, functions like easy_vector_constant can avoid repeatedly
taking a vector constant and converting it into internal format before
trying to decide if a given instruction can be generated.  For example,
this is the part of easy_vector_constant that determines if a vector
constant can be generated with LXVKQ, XXSPLTIDP, or XXSPLTIW:

	/* Constants that can be generated with ISA 3.1 instructions are
	   easy.  */
	vec_const_128bit_type vsx_const;
	if (TARGET_POWER10 && vec_const_128bit_to_bytes (op, mode, &vsx_const))
	  {
	    if (constant_generates_lxvkq (&vsx_const))
	      return true;

	    if (constant_generates_xxspltiw (&vsx_const))
	      return true;

	    if (constant_generates_xxspltidp (&vsx_const))
	      return true;
	  }

In theory, a lot of the altivec constant functions could be converted to
use this data structure, but I haven't rewritten those functions.

The 5 patches are:

   1)	Add the data structure and the function that converts
	vector/scalar constants to that data structure.  Note, this
	function is not used in the current patch, but the remaining 4
	patches depend on it.

   2)	Add support to recognize when we could generate the LXVKQ
	instruction.

   3)	Add support to recognize when we could generate the XXSPLTIW
	instruction.

   4)	Add support to recognize when we could generate the XXSPLTIDP
	instruction for vector constants.

   5)	Add support to recognize when we could generate the XXSPLTIDP
	instruction for SFmode and DFmode constants.

I have built these patches on power9 and power10 little endian systems
with no regressions in the current tests.  I am kicking off a build on a
power8 big endian system as I write this post.  I have run previous
versions of the patch on the big endian system without problems.

I would like to check this into the GCC 12 trunk.  At the moment, I am
not asking to back-port the patches to GCC 11, but we can do this if it
is deemed desirable.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com