On 09.11.20 20:54, Ilya Leoshkevich wrote:
> On z14+, there are instructions for working with 128-bit floats (long
> doubles) in vector registers. It's beneficial to use them instead of
> instructions that operate on floating point register pairs, because it
> allows storing 4 times more data in registers at a time, relieving
> register pressure. The raw performance of the new instructions is
> almost the same as that of the old ones.
>
> Implement by storing TFmode values in vector registers on z14+. Since
> not all operations are available with the new instructions, keep the
> old ones available using the new FPRX2 mode, and convert between it and
> TFmode when necessary (these are called "forwarder" expanders below).
> Change the existing TFmode expanders to call either new- or old-style
> ones depending on whether we are on z14+ or older machines
> ("dispatcher" expanders).
>
> gcc/ChangeLog:
>
> 2020-11-03  Ilya Leoshkevich  <i...@linux.ibm.com>
>
> 	* config/s390/s390-modes.def (FPRX2): New mode.
> 	* config/s390/s390-protos.h (s390_fma_allowed_p): New function.
> 	* config/s390/s390.c (s390_fma_allowed_p): Likewise.
> 	(s390_build_signbit_mask): Support 128-bit masks.
> 	(print_operand): Support printing the second word of a TFmode
> 	operand as vector register.
> 	(constant_modes): Add FPRX2mode.
> 	(s390_class_max_nregs): Return 1 for TFmode on z14+.
> 	(s390_is_fpr128): New function.
> 	(s390_is_vr128): Likewise.
> 	(s390_can_change_mode_class): Use s390_is_fpr128 and
> 	s390_is_vr128 in order to determine whether mode refers to a FPR
> 	pair or to a VR.
> 	(s390_emit_compare): Force TFmode operands into registers on
> 	z14+.
> 	* config/s390/s390.h (HAVE_TF): New macro.
> 	(EXPAND_MOVTF): New macro.
> 	(EXPAND_TF): Likewise.
> 	* config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF
> 	alias.
> 	(ALL): Add FPRX2.
> 	(FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-.
> 	(FP): Likewise.
> 	(FP_ANYTF): New mode iterator.
> 	(BFP): Add FPRX2 for z14+, restrict TFmode to z13-.
> 	(TD_TF): Likewise.
> 	(xde): Add FPRX2.
> 	(nBFP): Likewise.
> 	(nDFP): Likewise.
> 	(DSF): Likewise.
> 	(DFDI): Likewise.
> 	(SFSI): Likewise.
> 	(DF): Likewise.
> 	(SF): Likewise.
> 	(fT0): Likewise.
> 	(bt): Likewise.
> 	(_d): Likewise.
> 	(HALF_TMODE): Likewise.
> 	(tf_fpr): New mode_attr.
> 	(type): New mode_attr.
> 	(*cmp<mode>_ccz_0): Use type instead of mode with fsimp.
> 	(*cmp<mode>_ccs_0_fastmath): Likewise.
> 	(*cmptf_ccs): New pattern for wfcxb.
> 	(*cmptf_ccsfps): New pattern for wfkxb.
> 	(mov<mode>): Rename to mov<mode><tf_fpr>.
> 	(signbit<mode>2): Rename to signbit<mode>2<tf_fpr>.
> 	(isinf<mode>2): Rename to isinf<mode>2<tf_fpr>.
> 	(*TDC_insn_<mode>): Use type instead of mode with fsimp.
> 	(fixuns_trunc<FP:mode><GPR:mode>2): Rename to
> 	fixuns_trunc<FP:mode><GPR:mode>2<FP:tf_fpr>.
> 	(fix_trunctf<mode>2): Rename to fix_trunctf<mode>2_fpr.
> 	(floatdi<mode>2): Rename to floatdi<mode>2<tf_fpr>, use type
> 	instead of mode with itof.
> 	(floatsi<mode>2): Rename to floatsi<mode>2<tf_fpr>, use type
> 	instead of mode with itof.
> 	(*floatuns<GPR:mode><FP:mode>2): Use type instead of mode for
> 	itof.
> 	(floatuns<GPR:mode><FP:mode>2): Rename to
> 	floatuns<GPR:mode><FP:mode>2<tf_fpr>.
> 	(trunctf<mode>2): Rename to trunctf<mode>2_fpr, use type instead
> 	of mode with fsimp.
> 	(extend<DSF:mode><BFP:mode>2): Rename to
> 	extend<DSF:mode><BFP:mode>2<BFP:tf_fpr>.
> 	(<FPINT:fpint_name><BFP:mode>2): Rename to
> 	<FPINT:fpint_name><BFP:mode>2<BFP:tf_fpr>, use type instead of
> 	mode with fsimp.
> 	(rint<BFP:mode>2): Rename to rint<BFP:mode>2<BFP:tf_fpr>, use
> 	type instead of mode with fsimp.
> 	(<FPINT:fpint_name><DFP:mode>2): Use type instead of mode for
> 	fsimp.
> 	(rint<DFP:mode>2): Likewise.
> 	(trunc<BFP:mode><DFP_ALL:mode>2): Rename to
> 	trunc<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
> 	(trunc<DFP_ALL:mode><BFP:mode>2): Rename to
> 	trunc<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
> 	(extend<BFP:mode><DFP_ALL:mode>2): Rename to
> 	extend<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
> 	(extend<DFP_ALL:mode><BFP:mode>2): Rename to
> 	extend<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
> 	(add<mode>3): Rename to add<mode>3<tf_fpr>, use type instead of
> 	mode with fsimp.
> 	(*add<mode>3_cc): Use type instead of mode with fsimp.
> 	(*add<mode>3_cconly): Likewise.
> 	(sub<mode>3): Rename to sub<mode>3<tf_fpr>, use type instead of
> 	mode with fsimp.
> 	(*sub<mode>3_cc): Use type instead of mode with fsimp.
> 	(*sub<mode>3_cconly): Likewise.
> 	(mul<mode>3): Rename to mul<mode>3<tf_fpr>, use type instead of
> 	mode with fsimp.
> 	(fma<mode>4): Restrict using s390_fma_allowed_p.
> 	(fms<mode>4): Restrict using s390_fma_allowed_p.
> 	(div<mode>3): Rename to div<mode>3<tf_fpr>, use type instead of
> 	mode with fdiv.
> 	(neg<mode>2): Rename to neg<mode>2<tf_fpr>.
> 	(*neg<mode>2_cc): Use type instead of mode with fsimp.
> 	(*neg<mode>2_cconly): Likewise.
> 	(*neg<mode>2_nocc): Likewise.
> 	(*neg<mode>2): Likewise.
> 	(abs<mode>2): Rename to abs<mode>2<tf_fpr>, use type instead of
> 	mode with fdiv.
> 	(*abs<mode>2_cc): Use type instead of mode with fsimp.
> 	(*abs<mode>2_cconly): Likewise.
> 	(*abs<mode>2_nocc): Likewise.
> 	(*abs<mode>2): Likewise.
> 	(*negabs<mode>2_cc): Likewise.
> 	(*negabs<mode>2_cconly): Likewise.
> 	(*negabs<mode>2_nocc): Likewise.
> 	(*negabs<mode>2): Likewise.
> 	(sqrt<mode>2): Rename to sqrt<mode>2<tf_fpr>, use type instead
> 	of mode with fsqrt.
> 	(cbranch<mode>4): Use FP_ANYTF instead of FP.
> 	(copysign<mode>3): Rename to copysign<mode>3<tf_fpr>, use type
> 	instead of mode with fsimp.
> 	* config/s390/s390.opt (flag_vx_long_double_fma): New
> 	undocumented option.
> 	* config/s390/vector.md (V_HW): Add TF for z14+.
> 	(V_HW2): Likewise.
> 	(VFT): Likewise.
> 	(VF_HW): Likewise.
> 	(V_128): Likewise.
> 	(tf_vr): New mode_attr.
> 	(tointvec): Add TF.
> 	(mov<mode>): Rename to mov<mode><tf_vr>.
> 	(movetf): New dispatcher.
> 	(*vec_tf_to_v1tf): Rename to *vec_tf_to_v1tf_fpr, restrict to
> 	z13-.
> 	(*vec_tf_to_v1tf_vr): New pattern for z14+.
> 	(*fprx2_to_tf): Likewise.
> 	(*mov_tf_to_fprx2_0): Likewise.
> 	(*mov_tf_to_fprx2_1): Likewise.
> 	(add<mode>3): Rename to add<mode>3<tf_vr>.
> 	(addtf3): New dispatcher.
> 	(sub<mode>3): Rename to sub<mode>3<tf_vr>.
> 	(subtf3): New dispatcher.
> 	(mul<mode>3): Rename to mul<mode>3<tf_vr>.
> 	(multf3): New dispatcher.
> 	(div<mode>3): Rename to div<mode>3<tf_vr>.
> 	(divtf3): New dispatcher.
> 	(sqrt<mode>2): Rename to sqrt<mode>2<tf_vr>.
> 	(sqrttf2): New dispatcher.
> 	(fma<mode>4): Restrict using s390_fma_allowed_p.
> 	(fms<mode>4): Likewise.
> 	(neg_fma<mode>4): Likewise.
> 	(neg_fms<mode>4): Likewise.
> 	(neg<mode>2): Rename to neg<mode>2<tf_vr>.
> 	(negtf2): New dispatcher.
> 	(abs<mode>2): Rename to abs<mode>2<tf_vr>.
> 	(abstf2): New dispatcher.
> 	(float<mode>tf2_vr): New forwarder.
> 	(float<mode>tf2): New dispatcher.
> 	(floatuns<mode>tf2_vr): New forwarder.
> 	(floatuns<mode>tf2): New dispatcher.
> 	(fix_trunctf<mode>2_vr): New forwarder.
> 	(fix_trunctf<mode>2): New dispatcher.
> 	(fixuns_trunctf<mode>2_vr): New forwarder.
> 	(fixuns_trunctf<mode>2): New dispatcher.
> 	(<FPINT:fpint_name><VF_HW:mode>2<VF_HW:tf_vr>): New pattern.
> 	(<FPINT:fpint_name>tf2): New forwarder.
> 	(rint<mode>2<tf_vr>): New pattern.
> 	(rinttf2): New forwarder.
> 	(*trunctfdf2_vr): New pattern.
> 	(trunctfdf2_vr): New forwarder.
> 	(trunctfdf2): New dispatcher.
> 	(trunctfsf2_vr): New forwarder.
> 	(trunctfsf2): New dispatcher.
> 	(extenddftf2_vr): New pattern.
> 	(extenddftf2): New dispatcher.
> 	(extendsftf2_vr): New forwarder.
> 	(extendsftf2): New dispatcher.
> 	(signbittf2_vr): New forwarder.
> 	(signbittf2): New dispatchers.
> 	(isinftf2_vr): New forwarder.
> 	(isinftf2): New dispatcher.
> 	* config/s390/vx-builtins.md (*vftci<mode>_cconly): Use VF_HW
> 	instead of VECF_HW, add missing constraint, add vw support.
> 	(vftci<mode>_intcconly): Use VF_HW instead of VECF_HW.
> 	(*vftci<mode>): Rename to vftci<mode>, use VF_HW instead of
> 	VECF_HW, and add vw support.
> 	(vftci<mode>_intcc): Use VF_HW instead of VECF_HW.
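
For anyone wanting to observe the effect described in the quoted text, a small translation unit with long double operations can be compiled for z14 and the assembly inspected. What follows is only a minimal sketch: the function names are made up for illustration, and it assumes an s390x GCC with this patch applied, where long double is the 128-bit TFmode type. Compiling with something like "gcc -O2 -march=z14 -S" should then show TFmode values kept in vector registers rather than in floating point register pairs. The ChangeLog names wfcxb and wfkxb for comparisons; the mnemonics emitted for the arithmetic are not stated in the patch description, so none are claimed here.

```c
/* Illustrative only: these function names are hypothetical and do not
   come from the patch or the GCC testsuite.  */

/* 128-bit addition; handled by the addtf3 dispatcher in the ChangeLog.  */
long double add_ld(long double a, long double b)
{
    return a + b;
}

/* 128-bit multiplication; handled by the multf3 dispatcher.  */
long double mul_ld(long double a, long double b)
{
    return a * b;
}

/* A TFmode comparison; the ChangeLog adds new compare patterns
   (*cmptf_ccs, *cmptf_ccsfps) for this case.  */
int less_ld(long double a, long double b)
{
    return a < b;
}
```

The functions are plain C and behave the same on any target; only the generated code is expected to differ between -march=z13 and -march=z14.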

Ok. Thanks!

Andreas