On Tue, 2025-09-23 at 20:10 +0530, Surya Kumari Jangala wrote:
>
> What is the code generated for this testcase w/o your patch for P10
> and P8?
>
After looking into it a bit further, I did find an issue that can be
fixed for the left shift case, but I am not sure how to handle the
multiplication case.
For the left shift case, the original code is:
vector unsigned long long
lshift1_64 (vector unsigned long long a)
{
  return a << (vector unsigned long long) { 1, 1 };
}
Since the constant is 1, the check in vspltisw_vupkhsw_constant_p
returned false, but as per the commit message it should have returned
false for -1, not for 1. Correcting that value fixes this test case
and causes no other regressions on ppc64le. So I can send that as a
second patch in this series, or separately.
But for this test case:
vector unsigned long long
lshift1_64 (vector unsigned long long a, vector unsigned long long b)
{
  return a * (vector unsigned long long) { 2, 2 };
}
the issue still exists, because the pattern recognition normally done
in vect_pattern_recog is not performed here (this is the question I
asked on the gcc mailing list).
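For reference, here is a minimal sketch of the equivalence that such a
pattern would have to exploit. It uses the generic GCC vector_size
extension instead of the AltiVec "vector" keyword so that it builds on
any target. The names v2du, mul2, shl1, add_self and same_v2du are made
up for illustration and are not part of the patch:

```c
#include <string.h>

/* Hypothetical type name: two unsigned 64-bit lanes, declared with the
   generic GCC vector extension rather than the AltiVec keyword.  */
typedef unsigned long long v2du __attribute__ ((vector_size (16)));

/* Multiplication by a splat of 2 (the form from the test case).  */
v2du
mul2 (v2du a)
{
  return a * (v2du) { 2, 2 };
}

/* Left shift by a splat of 1.  */
v2du
shl1 (v2du a)
{
  return a << (v2du) { 1, 1 };
}

/* Lane-wise self addition.  */
v2du
add_self (v2du a)
{
  return a + a;
}

/* Returns 1 when both vectors hold the same bytes.  */
int
same_v2du (v2du x, v2du y)
{
  return memcmp (&x, &y, sizeof x) == 0;
}
```

All three forms give the same result for every input, including lanes
that wrap around, since the elements are unsigned.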
Before the fix, for p8:
xxpermdi 0,34,34,3
mfvsrd 9,34
mfvsrd 10,0
sldi 9,9,1
mtvsrd 0,9
sldi 10,10,1
mtvsrd 34,10
xxpermdi 34,0,34,0
blr
For p10:
vspltisw v0,1
vsld v2,v2,v0
blr
After the fix, the p8 code is the same, and on p10 we get vaddudm.
The veclower pass scalarizes the vector function because it does not
find an optab for mulv2di3 on power8, so by the time we reach the
expand pass there is no vector code left.
Thanks and regards,
Avinash Jayakar