[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

Michael Meissner changed:

    What |Removed |Added
    CC   |        |meissner at gcc dot gnu.org

--- Comment #12 from Michael Meissner ---

The test case actually shows that on power8 GCC was generating incorrect code, and power9 is actually doing the right thing. But the test case was written assuming the previous behavior was correct.

TL;DR answer: power8 generated STVX instead of STXVD2X. Power9 generates STXV.

To explain what the issue is, we need to go back in history.

PowerPC processors (and Power before it) were originally designed for big endian environments. The Altivec instruction set had limited vector save and load instructions (STVX and LVX) which ignored the bottom 4 bits of the address. STVX and LVX did the correct byte swapping if the PowerPC was running in little endian mode.

When power7 came out with the VSX instruction set, the vector save and load instructions (STXVD2X, STXVW4X, LXVD2X, and LXVW4X) were added. These instructions allowed saving and loading all 64 VSX registers (32 registers that overlapped with floating point registers, and 32 registers that overlapped with traditional Altivec registers). However, these instructions only store and load values using big endian ordering.

After the power8 came out, the PowerPC Linux systems were moved from being big endian to little endian. This meant that after a vector load instruction we had to do explicit byte swapping, and before a vector save we had to byte swap the value before doing the save.

We added an optimization to GCC so that, in the special case of storing/loading temporaries on the stack, we would use the Altivec instructions STVX and LVX and eliminate the byte swapping instructions, since we could ensure that all temporaries were correctly aligned. But we couldn't use STVX and LVX in general, because these instructions ignore the bottom 4 bits of the address and they restrict the vector registers to just the VSX registers that overlap with the Altivec registers.

When power9 came out, we added new vector store and load instructions (STXV, STXVX, LXV, and LXVX) that did the correct byte swapping on little endian systems. GCC now generates these instructions and eliminates the special code to use the Altivec STVX and LVX instructions.

In the test case, VerifyVecEqToActual takes two vector arguments, creates two 16-byte arrays, and stores each vector into its array. It uses reinterpret_cast to convert this into a store instruction. However, since the temporary is on the stack, on power8 this uses the Altivec STVX instruction and it gets byte swapped.
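
The effect described above can be illustrated with a small portable model. This is only a sketch, not code from the bug report and not real PowerPC code: it assumes, for illustration, that the "big endian ordering" store differs from an element-order store by swapping the two 8-byte halves of the 16-byte value, which is how the extra swap instruction is usually described for little endian power8. The names Bytes16, StoreElementOrder, StoreDoublewordSwapped, SwapDoublewords, and Iota are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// 16 bytes standing in for one vector register (hypothetical model).
using Bytes16 = std::array<std::uint8_t, 16>;

// Model of a store that writes the register in element order, which is
// what the text says STXV (power9) and STVX (aligned stack temporaries
// on power8) achieve on little endian systems.
Bytes16 StoreElementOrder(const Bytes16& reg) { return reg; }

// ASSUMPTION: model of a big-endian-ordered store on a little endian
// system, represented here as writing the two 8-byte halves swapped.
Bytes16 StoreDoublewordSwapped(const Bytes16& reg) {
  Bytes16 mem{};
  std::memcpy(mem.data(), reg.data() + 8, 8);
  std::memcpy(mem.data() + 8, reg.data(), 8);
  return mem;
}

// Model of the explicit swap the compiler has to emit before such a store.
Bytes16 SwapDoublewords(const Bytes16& reg) {
  Bytes16 out{};
  std::memcpy(out.data(), reg.data() + 8, 8);
  std::memcpy(out.data() + 8, reg.data(), 8);
  return out;
}

// Test pattern 0, 1, ..., 15 so that any reordering is easy to spot.
Bytes16 Iota() {
  Bytes16 v{};
  for (std::size_t i = 0; i < v.size(); ++i) {
    v[i] = static_cast<std::uint8_t>(i);
  }
  return v;
}
```

In this model, StoreDoublewordSwapped(Iota()) differs from StoreElementOrder(Iota()), while StoreDoublewordSwapped(SwapDoublewords(Iota())) matches it. A test that compares stored byte arrays therefore depends on which store sequence the compiler chose, which is the kind of dependence the comment describes.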
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #11 from John Platts ---

Created attachment 55869
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55869&action=edit
Test program to reproduce GCC 12 compilation bug

Here is the expected output of the ppc9_test_sat_add_090923.cpp test program:

    Test completed successfully

The ppc9_test_sat_add_090923.cpp test program results in the following output when compiled with GCC 12 with -O2 -mcpu=power9 -std=c++14:

    Expected vector not equal to actual
    Aborted (core dumped)

The ppc9_test_sat_add_090923.cpp test program does generate the expected results when compiled with GCC 11 with the -O2 -mcpu=power9 -std=c++14 options.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #10 from John Platts ---

Created attachment 55868
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55868&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug

The ppc9_test_sat_widen_pairwise_add_090923_2b.cpp test program compiles successfully with GCC 12, even with the -O2 -mcpu=power9 options.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #9 from John Platts ---

Created attachment 55867
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55867&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug

The attached ppc9_test_sat_widen_pairwise_add_090923_2.cpp test program generates the following error if compiled with the -O2 -mcpu=power9 options:

    Mismatch in lane 0 (ppc9_test_sat_widen_pairwise_add_090923_2.cpp:341):
    Expected: -5376, -5632, -5888, -6144, -6400, -6656, -6912, -7168
    Actual:   -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768
    Aborted (core dumped)

Here is the expected output of the ppc9_test_sat_widen_pairwise_add_090923_2.cpp test program:

    Test completed successfully
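
For readers unfamiliar with the operation under test, the following scalar sketch shows what a single output lane is assumed to compute, and why -32768 (the smallest int16_t value) looks like a saturated result. It is not taken from the attachment: the semantics (unsigned 8-bit times signed 8-bit, adjacent products added with saturation to int16_t) are an assumption based on the operation's name, and SatPairSum is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// ASSUMED per-lane semantics: widen (a0, a1) from uint8_t and (b0, b1)
// from int8_t, multiply pairwise, add the two products, and clamp the
// sum to the int16_t range instead of letting it wrap.
std::int16_t SatPairSum(std::uint8_t a0, std::int8_t b0,
                        std::uint8_t a1, std::int8_t b1) {
  // int32_t is wide enough: each product lies in [-32640, 32385].
  const std::int32_t sum = static_cast<std::int32_t>(a0) * b0 +
                           static_cast<std::int32_t>(a1) * b1;
  const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
  if (sum < lo) return static_cast<std::int16_t>(lo);
  if (sum > hi) return static_cast<std::int16_t>(hi);
  return static_cast<std::int16_t>(sum);
}
```

Under these assumptions, SatPairSum(255, -128, 255, -128) clamps to -32768, while small inputs such as SatPairSum(1, 3, 2, -4) give the exact sum -5. An output in which every lane is -32768 while the expected values are well inside the int16_t range suggests that the saturating path was taken for inputs that should not have saturated.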
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #8 from Andrew Pinski ---

(In reply to Mathieu Malaterre from comment #7)
> @rguenth You added `needs-bisection` keyword, but the example is quite
> small: 154 lines of code.

needs-bisection means to figure out which commit caused the regression. That is different from the needs-reduction keyword, which says we still need a reduction ...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #7 from Mathieu Malaterre --- @rguenth You added `needs-bisection` keyword, but the example is quite small: 154 lines of code.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #6 from John Platts ---

Revision ff1ad85a96c0bc8483b582d6dbceb8bc07edd226 of Google Highway is needed to reproduce the PPC9 codegen bug with GCC 12, because TestSatWidenMulPairwiseAdd now passes on PPC9 following a recent update to that test in the master branch of the Google Highway Git repository.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #5 from John Platts --- The version of Google Highway with the TestSatWidenMulPairwiseAdd changes to get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 with the "-mcpu=power9 -DHWY_DISABLED_TARGETS=6918232715082858496 -DHWY_BROKEN_TARGETS=0" options can be found in the jep_hwy_ppc9_multest_diff_080923 branch in the https://github.com/johnplatts/jep_google_highway.git repository.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #4 from John Platts ---

I made some changes to TestSatWidenMulPairwiseAdd in hwy/tests/mul_test.cc that allow it to pass on POWER9 when compiled with GCC 12 with the "-mcpu=power9 -DHWY_DISABLED_TARGETS=6918232715082858496 -DHWY_BROKEN_TARGETS=0" options.

The version of Google Highway with these changes can be found in the jep_hwy_ppc9_multest_diff_080923 branch of the https://github.com/johnplatts/jep_google_highway.git repository.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #3 from John Platts ---

Here is the output of running the "./tests/mul_test" program in the Google Highway test suite when compiled with GCC 12 with the "-mcpu=power8 -DHWY_DISABLED_TARGETS=6917951240106147840" options:

    [==] Running 9 tests from 1 test suite.
    [--] Global test environment set-up.
    [--] 9 tests from HwyMulTestGroup/HwyMulTest
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllMul/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllMul/PPC8 (11 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllMulHigh/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllMulHigh/PPC8 (4 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllMulFixedPoint15/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllMulFixedPoint15/PPC8 (67 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllMulEven/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllMulEven/PPC8 (2 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllMulAdd/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllMulAdd/PPC8 (20 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllWidenMulPairwiseAdd/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllWidenMulPairwiseAdd/PPC8 (2 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllSatWidenMulPairwiseAdd/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllSatWidenMulPairwiseAdd/PPC8 (6 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllReorderWidenMulAccumulate/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllReorderWidenMulAccumulate/PPC8 (4 ms)
    [ RUN ] HwyMulTestGroup/HwyMulTest.TestAllRearrangeToOddPlusEven/PPC8
    [ OK ] HwyMulTestGroup/HwyMulTest.TestAllRearrangeToOddPlusEven/PPC8 (0 ms)
    [--] 9 tests from HwyMulTestGroup/HwyMulTest (122 ms total)
    [--] Global test environment tear-down
    [==] 9 tests from 1 test suite ran. (127 ms total)
    [ PASSED ] 9 tests.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #2 from John Platts --- Created attachment 55711 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55711&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug (requires CMake and Google Highway)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960

--- Comment #1 from John Platts ---

Created attachment 55710
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55710&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug

The attached ppc9_sat_widen_mul_pairwise_add_test_080823_4.cpp program needs to be compiled with the "-mcpu=power9 -std=c++17" options, but has no dependencies on Google Highway or CMake.

The attached ppc9_sat_widen_mul_pairwise_add_test_080823_4 program runs successfully if compiled with the "-mcpu=power9 -std=c++17" options with GCC 9.