Kewen:

On 6/4/24 00:19, Kewen.Lin wrote:
> Hi,
>
> on 2024/5/29 23:58, Carl Love wrote:
>> Updated the patch per the feedback comments from the previous version.
>>
>> Carl
>> -------------------------------------------------------
>>
>> rs6000, extend the current vec_{un,}signed{e,o} built-ins
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
>> convert a vector of floats to signed/unsigned long long ints. Extend the
>> existing vec_{un,}signed{e,o} built-ins to handle the argument
>> vector of floats to return the even/odd signed/unsigned integers.
>>
>> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
>> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
>> built-ins.
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
>> now for internal use only. They are not documented and they do not
>> have testcases.
>>
>> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
>> vec_signed{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
>> vec_unsigned{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
>> vec_unsigned, remove.
>>
>> The __builtin_vsx_xvcvspuxws is redundant as it is covered by
>> vec_unsigned, remove.
>
> I prefer to move these removals into sub-patch 2/13 or split them out into
> a new patch, since they don't match the subject of this patch. Moving it
> to sub-patch 2/13 looks good as they are all about vec_{un,}signed{,e,o}.

Yes, all of the vec_unsigned changes need to be in the same patch. Moved
__builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws to patch 2.

>
>>
>> Add testcases and update documentation.
>>
>> gcc/ChangeLog:
>>     * config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
>>     __builtin_vsx_xvcvspuxds_low): New built-in definitions.
>>     (__builtin_vsx_xvcvspuxds): Fix return type.
>>     (XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
>>     VEC_VUNSIGNEDE_V4SF respectively.
>>     (vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
>>     vunsignede_v4sf respectively.
>>     (__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
>>     __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
>>     * config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
>>     vec_unsignede,vec_unsignedo): Add new overloaded specifications.
>>     * config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
>>     vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
>>     * doc/extend.texi (vec_signedo, vec_signede): Add documentation.
>>
>> gcc/testsuite/ChangeLog:
>>     * gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
>>     overloaded built-ins.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def         | 25 ++----
>>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>>  gcc/config/rs6000/vsx.md                      | 88 +++++++++++++++++++
>>  gcc/doc/extend.texi                           | 10 +++
>>  .../gcc.target/powerpc/builtins-3-runnable.c  | 51 +++++++++--
>>  5 files changed, 157 insertions(+), 25 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index bf9a0ae22fc..cea2649b86c 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1688,32 +1688,23 @@
>>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>>
>> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
>> -    XVCVDPSXWS vsx_xvcvdpsxws {}
>> -
>> -  const vsll __builtin_vsx_xvcvdpuxds (vd);
>> -    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
>> -
>>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>>
>> -  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
>> -    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
>> -
>> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
>> -    XVCVDPUXWS vsx_xvcvdpuxws {}
>> -
>>    const vd __builtin_vsx_xvcvspdp (vf);
>>      XVCVSPDP vsx_xvcvspdp {}
>>
>>    const vsll __builtin_vsx_xvcvspsxds (vf);
>> -    XVCVSPSXDS vsx_xvcvspsxds {}
>> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}
>
> We should rename __builtin_vsx_xvcvspsxds to
> __builtin_vsx_vsignede_v4sf, one reason is to align with
> the existing others, one more important thing
> is that it doesn't generate 1-1 mapping xvcvspsxds,
> putting that mnemonic can be misleading.

Yes, that would be more consistent. Changed.

>
>> +
>> +  const vsll __builtin_vsx_xvcvspsxds_low (vf);
>
> Ditto.

Changed.

>
>> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>>
>> -  const vsll __builtin_vsx_xvcvspuxds (vf);
>> -    XVCVSPUXDS vsx_xvcvspuxds {}
>> +  const vull __builtin_vsx_xvcvspuxds (vf);
>
> Ditto.

Changed.

>
>> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>>
>> -  const vsi __builtin_vsx_xvcvspuxws (vf);
>> -    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
>> +  const vull __builtin_vsx_xvcvspuxds_low (vf);
>
> Ditto.

Changed.

>
>> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>>
>>    const vd __builtin_vsx_xvcvsxddp (vsll);
>>      XVCVSXDDP vsx_floatv2div2df2 {}
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index 84bd9ae6554..4d857bb1af3 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3307,10 +3307,14 @@
>>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>>    vsi __builtin_vec_vsignede (vd);
>>      VEC_VSIGNEDE_V2DF
>> +  vsll __builtin_vec_vsignede (vf);
>> +    VEC_VSIGNEDE_V4SF
>>
>>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>>    vsi __builtin_vec_vsignedo (vd);
>>      VEC_VSIGNEDO_V2DF
>> +  vsll __builtin_vec_vsignedo (vf);
>> +    VEC_VSIGNEDO_V4SF
>>
>>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>>    vsi __builtin_vec_signexti (vsc);
>> @@ -4433,10 +4437,14 @@
>>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>>    vui __builtin_vec_vunsignede (vd);
>>      VEC_VUNSIGNEDE_V2DF
>> +  vull __builtin_vec_vunsignede (vf);
>> +    VEC_VUNSIGNEDE_V4SF
>>
>>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>>    vui __builtin_vec_vunsignedo (vd);
>>      VEC_VUNSIGNEDO_V2DF
>> +  vull __builtin_vec_vunsignedo (vf);
>> +    VEC_VUNSIGNEDO_V4SF
>>
>>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>>    vui __builtin_vec_extract_exp (vf);
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index f135fa079bd..a8f3d459232 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
>>    DONE;
>>  })
>>
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
>
> Maybe:
>
> ;; Convert float vector even elements to {un,}signed long long vector

Changed the four comments to the suggested pattern.

>
>> +(define_expand "vsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +                                          rtx_val));
>> +      emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
>> +    }
>
> I think this is wrong, even elements on BE is word 0 and 2, it doesn't
> require vector shifting (similar to doublee<mode>2), while LE needs.

OK, I went through this again and used gdb to look at how things get laid
out in the registers. I agree it looks like I have the shifting backwards
for LE/BE in the even/odd expanders. Fixing this also requires fixing the
expected results in the corresponding test case, as they are backwards too.

>
>> +  else
>> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
>
> Ditto.

Changed.

>
>> +(define_expand "vsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
>
> As above, this is for odd elements, so BE needs vector shifting while LE
> doesn't.

Changed this file and the corresponding expected test case results.

>
> The vunsigned* below need the according fixes.
>
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +                                          rtx_val));
>> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.

Changed comment as suggested above.

>> +(define_expand "vunsignede_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +                                          rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
>> +    }
>> +  else
>> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
>> +
>> +  DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.

Changed comment as suggested above.

>> +(define_expand "vunsignedo_v4sf"
>> +  [(match_operand:V2DI 0 "vsx_register_operand")
>> +   (match_operand:V4SF 1 "vsx_register_operand")]
>> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> +  if (BYTES_BIG_ENDIAN)
>> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
>> +  else
>> +    {
>> +      /* Shift left one word to put even word in correct location */
>> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> +      rtx rtx_val = GEN_INT (4);
>> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> +                                          rtx_val));
>> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
>> +    }
>> +
>> +  DONE;
>> +})
>> +
>>  ;; Generate float2 double
>>  ;; convert two double to float
>>  (define_expand "float2_v2df"
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 267fccd1512..b88e61641a2 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -22577,6 +22577,16 @@ if the VSX instruction set is available. The @samp{vec_vsx_ld} and
>>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>>
>> +@smallexample
>> +vector signed signed long long vec_signedo (vector float);
>> +vector signed signed long long vec_signede (vector float);
>> +vector unsigned signed long long vec_signedo (vector float);
>> +vector unsigned signed long long vec_signede (vector float);
>> +@end smallexample
>
> Nit: s/signed long/long/

Yea, a little verbose there... :-) Fixed.

>
> BR,
> Kewen
>
>> +
>> +The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
>> +additional extensions to the built-ins as documented in the PVIPR.
>> +
>>  @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
>>  @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> index 5dcdfbee791..557befc9a4a 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> @@ -3,7 +3,7 @@
>>  /* { dg-options "-maltivec -mvsx" } */
>>
>>  #include <altivec.h> // vector
>> -
>> +#define DEBUG 1
>>  #ifdef DEBUG
>>  #include <stdio.h>
>>  #endif
>> @@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
>>  }
>>
>>  void test_ll_int_result(vector long long int vec_result,
>> -                        vector long long int vec_expected)
>> +                        vector long long int vec_expected,
>> +                        char *string)
>>  {
>>    int i;
>>
>>    for (i = 0; i < 2; i++)
>>      if (vec_result[i] != vec_expected[i]) {
>>  #ifdef DEBUG
>> -      printf("Test_ll_int_result: ");
>> +      printf("Test_ll_int_result %s: ", string);
>>        printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>>               i, vec_result[i], i, vec_expected[i]);
>>  #else
>> @@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
>>  }
>>
>>  void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
>> -                                 vector long long unsigned int vec_expected)
>> +                                 vector long long unsigned int vec_expected,
>> +                                 char *string)
>>  {
>>    int i;
>>
>>    for (i = 0; i < 2; i++)
>>      if (vec_result[i] != vec_expected[i]) {
>>  #ifdef DEBUG
>> -      printf("Test_ll_unsigned_int_result: ");
>> +      printf("Test_ll_unsigned_int_result %s: ", string);
>>        printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>>               i, vec_result[i], i, vec_expected[i]);
>>  #else
>> @@ -292,7 +294,8 @@ int main()
>>    vec_dble0 = (vector double){-124.930, 81234.49};
>>    vec_ll_int_expected = (vector long long signed int){-124, 81234};
>>    vec_ll_int_result = vec_signed (vec_dble0);
>> -  test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
>> +  test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +                      "vec_signed");
>>
>>    /* Convert double precision vector float to vector int, even words */
>>    vec_dble0 = (vector double){-124.930, 81234.49};
>> @@ -321,12 +324,44 @@ int main()
>>    test_unsigned_int_result (ALL, vec_uns_int_result,
>>                              vec_uns_int_expected);
>>
>> +  /* Convert single precision vector float, even args, to vector
>> +     signed long long int.  */
>> +  vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +  vec_ll_int_expected = (vector signed long long int){834, -5};
>> +  vec_ll_int_result = vec_signede (vec_flt0);
>> +  test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +                      "vec_signede");
>> +
>> +  /* Convert single precision vector float, odd args, to vector
>> +     signed long long int.  */
>> +  vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +  vec_ll_int_expected = (vector signed long long int){14, -3};
>> +  vec_ll_int_result = vec_signedo (vec_flt0);
>> +  test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> +                      "vec_signedo");
>> +
>> +  /* Convert single precision vector float, even args, to vector
>> +     unsigned long long int.  */
>> +  vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +  vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
>> +  vec_ll_uns_int_result = vec_unsignede (vec_flt0);
>> +  test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> +                               vec_ll_uns_int_expected, "vec_unsignede");
>> +
>> +  /* Convert single precision vector float, odd args, to vector
>> +     unsigned long long int.  */
>> +  vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> +  vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
>> +  vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
>> +  test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> +                               vec_ll_uns_int_expected, "vec_unsignedo");
>> +
>>    /* Convert double precision float to long long unsigned int */
>>    vec_dble0 = (vector double){124.930, 8134.49};
>>    vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
>>    vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>>    test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> -                               vec_ll_uns_int_expected);
>> +                               vec_ll_uns_int_expected, "vec_unsigned");
>>
>>    /* Convert double precision float to long long unsigned int.  Negative
>>       arguments.  */
>> @@ -334,7 +369,7 @@ int main()
>>    vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
>>    vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>>    test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> -                               vec_ll_uns_int_expected);
>> +                               vec_ll_uns_int_expected, "vec_unsigned");
>>
>>    /* Convert double precision vector float to vector unsigned int,
>>       even words.  Negative arguments */
>
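
For reference while checking the swapped expected values, here is a rough scalar C sketch of what the even/odd float conversions are meant to compute. It is only an illustration, not code from the patch: the names model_signed_half, model_unsigned_half, HALF_EVEN and HALF_ODD are made up here, and it assumes elements are numbered in array order (even = elements 0 and 2, odd = elements 1 and 3), finite inputs that fit in the result type, and negative inputs converting to 0 in the unsigned case, as the test expectations in the patch suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector values: offset of the first element taken.  */
enum { HALF_EVEN = 0, HALF_ODD = 1 };

/* Scalar model: convert elements (0, 2) or (1, 3) of a 4-float vector
   to signed long long, truncating toward zero.  */
void model_signed_half (const float src[4], int half, long long dst[2])
{
  for (size_t i = 0; i < 2; i++)
    dst[i] = (long long) src[2 * i + half];
}

/* Unsigned variant.  Negative inputs are clamped to 0 explicitly; this
   clamp is an assumption taken from the expected values in the test.  */
void model_unsigned_half (const float src[4], int half,
                          unsigned long long dst[2])
{
  for (size_t i = 0; i < 2; i++)
    {
      float v = src[2 * i + half];
      dst[i] = v < 0.0f ? 0ULL : (unsigned long long) v;
    }
}

/* Self-check using the input vector from the test case.  */
void model_self_check (void)
{
  const float v[4] = { 14.930f, 834.49f, -3.3f, -5.4f };
  long long s[2];
  unsigned long long u[2];

  model_signed_half (v, HALF_EVEN, s);
  assert (s[0] == 14 && s[1] == -3);
  model_signed_half (v, HALF_ODD, s);
  assert (s[0] == 834 && s[1] == -5);

  model_unsigned_half (v, HALF_EVEN, u);
  assert (u[0] == 14 && u[1] == 0);
  model_unsigned_half (v, HALF_ODD, u);
  assert (u[0] == 834 && u[1] == 0);
}
```

Under those assumptions, the even results for the test input would be {14, -3} (signed) and {14, 0} (unsigned), and the odd results {834, -5} and {834, 0}, which is the swap of the expected values in the quoted test.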