Kewen:

On Mon, 2023-06-19 at 11:50 +0800, Kewen.Lin wrote:
> > generated the vinsd instruction for the two calls with the first
> > argument of unsigned long long int. When the first argument of the
> > builtin is changed to the correct type, vector unsigned char the
> > builtin generates the vinsw instruction instead. The change occurs in
> > two places resulting in reducing the counts for vinsd by two and
> > increasing the counts for vinsw by two. The other calls to the builtin
> > are either vector ints or vector floats which generate the vinsw
> > instruction. Changing the first argument in those calls to vector
> > unsigned char still generate the vinsw instruction.
>
> But it did expose something odd and needed to be handled in this
> change.
> I had a further check, for the below test case:
>
> #include "altivec.h"
>
> #ifdef ORIG
> vector unsigned char foo (vector unsigned long long v){
>   unsigned long long val = 678ull;
>   return vec_replace_unaligned (v, val, 7);
> }
> #else
> vector unsigned char foo (vector unsigned long long v){
>   unsigned long long val = 678ull;
>   return vec_replace_unaligned ((vector unsigned char)v, val, 7);
> }
> #endif
>
> Without this patch (-DORIG required to match the previous prototype),
> it would generate vinsd; while with this proposed patch, it would
> generate vinsw. I think it's unexpected since users can still have
> the need to replace a doubleword size of chunk but give a constant
> which can be represented by int. The previous way can support it,
> while the new way can't. So we should have some way to distinguish
> it, we have some special-casing in function
> altivec_resolve_overloaded_builtin, could you have a check and try
> there? Thanks!
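To make the concern concrete, here is a minimal plain-C sketch of why the
choice between a word insert and a doubleword insert matters even when the
value fits in an int. It is only a model: replace_word, replace_doubleword,
bytes_changed and demo_changed are hypothetical helpers, the copy uses the
host's native byte order, and it does not reproduce the endian-specific
element numbering of the real vinsw/vinsd instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not the GCC implementation: overwrite 4 bytes
   (vinsw-like) at byte offset IDX of a 16-byte vector image.  */
static void
replace_word (unsigned char vec[16], uint32_t val, unsigned idx)
{
  assert (idx <= 16 - sizeof val);
  memcpy (vec + idx, &val, sizeof val);
}

/* Hypothetical model: overwrite 8 bytes (vinsd-like) at byte offset IDX.  */
static void
replace_doubleword (unsigned char vec[16], uint64_t val, unsigned idx)
{
  assert (idx <= 16 - sizeof val);
  memcpy (vec + idx, &val, sizeof val);
}

/* Count the bytes that no longer hold the 0xff fill pattern.  */
static int
bytes_changed (const unsigned char vec[16])
{
  int n = 0;
  for (int i = 0; i < 16; i++)
    if (vec[i] != 0xff)
      n++;
  return n;
}

/* Fill a vector image with 0xff, insert 678 at byte 7 with the chosen
   width, and report how many bytes were overwritten.  */
static int
demo_changed (int use_doubleword)
{
  unsigned char vec[16];
  memset (vec, 0xff, sizeof vec);
  if (use_doubleword)
    replace_doubleword (vec, 678ull, 7);
  else
    replace_word (vec, 678u, 7);
  return bytes_changed (vec);
}
```

In this model the same constant overwrites four bytes through the word path
and eight bytes through the doubleword path, so picking the width from the
first argument alone would change the result for the unsigned long long case.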

I added the needed handling in altivec_resolve_overloaded_builtin so that
the built-in generates the correct instruction for the unsigned long long
cases in the test file.

I added an additional test file with the above test case. It was put into
a new test file because it requires the -flax-vector-conversions argument.
I felt it was best to separate the tests that need the
-flax-vector-conversions argument from those that do not.

Note, adding the additional case statement RS6000_OVLD_VEC_REPLACE_UN to
handle the three argument built-in vec_replace_unaligned in
altivec_resolve_overloaded_builtin exposed an issue with function
find_instance. Function find_instance assumes there are only two
arguments in the builtin. There are no checks on the actual number of
arguments used by the built-in. This leads to an error in
tree_operand_check_failed() when using find_instance. The find_instance
function was extended to handle 2 or 3 arguments, with a check to make
sure the number of arguments is either 2 or 3.

FYI, I also noticed that in the current patch the names in
rs6000-builtins.def and rs6000-overload.def for
builtin_altivec_vreplace_un still reflect the type of the first argument.
The current patch changes the first argument to vuc, but the naming didn't
all get updated. I think the names should be changed to reflect the type
of the second argument since the first arguments are all identical. For
example:

--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3388,29 +3388,29 @@
   const vull __builtin_altivec_vpextd (vull, vull);
     VPEXTD vpextd {}

-  const vuc __builtin_altivec_vreplace_un_uv2di (vull, unsigned long long, \
-                                                 const int<4>);
-    VREPLACE_UN_UV2DI vreplace_un_v2di {}
+  const vuc __builtin_altivec_vreplace_un_udi (vuc, unsigned long long, \
+                                               const int<4>);
+    VREPLACE_UN_UDI vreplace_un_di {}

The name changes will ripple through files rs6000-builtins.def,
rs6000-overload.def and vsx.md.
I made all of the name changes as well in the new version 3 of the patch.

                       Carl
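
As an illustration of the argument-count check described above for
find_instance, here is a small self-contained C sketch. It is not the GCC
code and does not use GCC's tree types or the real find_instance signature:
arg_kind, insn, instance, replace_un_instances, find_instance_sketch,
resolve and ull_call are all hypothetical names, and the two table entries
only mirror the word/doubleword behaviour discussed in this thread. The
point is that the count is validated before any argument is read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument kinds; these are NOT GCC's tree types.  */
enum arg_kind { K_VUC, K_UINT, K_ULL, K_INT };

/* Hypothetical result codes standing in for the selected instruction.  */
enum insn { INSN_NONE, INSN_VINSW, INSN_VINSD };

/* Hypothetical overload instance: a parameter list and its instruction.  */
struct instance
{
  int nargs;
  enum arg_kind parms[3];
  enum insn code;
};

static const struct instance replace_un_instances[] = {
  { 3, { K_VUC, K_UINT, K_INT }, INSN_VINSW },
  { 3, { K_VUC, K_ULL, K_INT }, INSN_VINSD },
};

/* Return the code of the first instance whose parameter kinds match ARGS
   exactly, or INSN_NONE.  Only 2- or 3-argument calls are accepted, and
   NARGS is checked before any element of ARGS is read.  */
static enum insn
find_instance_sketch (const struct instance *tab, size_t ntab,
                      const enum arg_kind *args, int nargs)
{
  assert (tab != NULL && args != NULL);
  if (nargs != 2 && nargs != 3)
    return INSN_NONE;

  for (size_t i = 0; i < ntab; i++)
    {
      if (tab[i].nargs != nargs)
        continue;
      int match = 1;
      for (int j = 0; j < nargs; j++)
        if (tab[i].parms[j] != args[j])
          match = 0;
      if (match)
        return tab[i].code;
    }
  return INSN_NONE;
}

/* Usage helper: resolve a call against the sketch table.  */
static enum insn
resolve (const enum arg_kind *args, int nargs)
{
  size_t ntab = sizeof replace_un_instances / sizeof replace_un_instances[0];
  return find_instance_sketch (replace_un_instances, ntab, args, nargs);
}

/* Example call: (vuc, unsigned long long, int), padded so that a bad
   count can be passed safely in the usage example.  */
static const enum arg_kind ull_call[4] = { K_VUC, K_ULL, K_INT, K_INT };
```

With this table, resolve (ull_call, 3) selects the doubleword entry, while
a count outside 2 or 3 is rejected up front instead of indexing past the
operands that are actually present.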