https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
Jeffrey A. Law changed:
What|Removed |Added
Priority|P3 |P2
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
Filip Kastl changed:
What|Removed |Added
CC||pheeck at gcc dot gnu.org
--- Comment #7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #6 from Hongtao Liu ---
I guess the explicit .REDUC_PLUS, used instead of the original VEC_PERM_EXPR, somehow impacts
the store split decision.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #5 from Hongtao Liu ---
It looks like x264_pixel_satd_16x16 consumes more time after my commit. An
extracted case is below. Note that there's no __attribute__((always_inline)) in the
original x264_pixel_satd_8x4; it's added to force
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #4 from Martin Jambor ---
(In reply to Hongtao Liu from comment #2)
> A patch is posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html
>
> Would you give a try to see if it fixes the regression, I don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #3 from Richard Biener ---
I'll note that two-lane reductions especially (or, in general, two-lane BB
vectorization) are hardly profitable on modern x86 uarchs unless the vectorized
code is interleaved with other non-vectorized code that can
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #2 from Hongtao Liu ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html
Could you give it a try to see if it fixes the regression? I don't currently have
a znver4 machine for testing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #1 from Hongtao Liu ---
I guess it's the same issue as PR112879?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
Andrew Pinski changed:
What|Removed |Added
Target Milestone|--- |14.0
Keywords|