[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2022-11-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #21 from CVS Commits --- The master branch has been updated by Hongyu Wang : https://gcc.gnu.org/g:dc95e1e9702f2f6367bbc108c8d01169be1b66d2 commit r13-4044-gdc95e1e9702f2f6367bbc108c8d01169be1b66d2 Author: Hongyu Wang Date: Mon

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2022-01-11 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #20 from Richard Biener --- -fno-trapping-math tells us we are not concerned about FP exception flags (so, say, a spurious FE_INEXACT is OK); -fno-signaling-nans is needed as well, I guess. Oh, and in practice performing the

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #19 from Andrew Pinski --- (In reply to Hongtao.liu from comment #18) > For vector integers it should be ok? > For vector floating point we can add condition > flag_unsafe_math_optimizations || !flag_trapping_math for the

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #18 from Hongtao.liu --- (In reply to Andrew Pinski from comment #17) > (In reply to Hongtao.liu from comment #16) > > typedef int v4si __attribute__ ((vector_size(16))); > > > > v4si f(v4si a, v4si b) { > > v4si a1 =

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #17 from Andrew Pinski --- (In reply to Hongtao.liu from comment #16) > typedef int v4si __attribute__ ((vector_size(16))); > > v4si f(v4si a, v4si b) { > v4si a1 = __builtin_shufflevector (a, a, 2, 3 ,1 ,0); > v4si b1 =

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #16 from Hongtao.liu --- typedef int v4si __attribute__ ((vector_size(16))); v4si f(v4si a, v4si b) { v4si a1 = __builtin_shufflevector (a, a, 2, 3 ,1 ,0); v4si b1 = __builtin_shufflevector (b, a, 2, 3 ,1 ,0); return a1
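The rewrite this bug asks for can be sketched with the GCC vector extension for the integer case. This is an illustrative sketch, not code from the thread: shuffle_2310, add_shuffled, shuffle_added, forms_agree and check_example are hypothetical names, and the permutation {2, 3, 1, 0} is written with per-lane indexing instead of __builtin_shufflevector so that it does not depend on that builtin being available.

```c
#include <assert.h>

/* Four-lane integer vector, same typedef as in the comment above. */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: lane permutation {2, 3, 1, 0}, written by hand. */
v4si
shuffle_2310 (v4si v)
{
  v4si r = { v[2], v[3], v[1], v[0] };
  return r;
}

/* Form 1: shuffle both operands identically, then add (two shuffles). */
v4si
add_shuffled (v4si a, v4si b)
{
  return shuffle_2310 (a) + shuffle_2310 (b);
}

/* Form 2: add first, then shuffle the sum (one shuffle). */
v4si
shuffle_added (v4si a, v4si b)
{
  return shuffle_2310 (a + b);
}

/* Returns 1 when both forms produce the same value in every lane. */
int
forms_agree (v4si a, v4si b)
{
  v4si x = add_shuffled (a, b);
  v4si y = shuffle_added (a, b);
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Small self-check with arbitrary sample values. */
void
check_example (void)
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 10, 20, 30, 40 };
  assert (forms_agree (a, b));
}
```

Because a lane-wise operation treats each lane independently, applying the same permutation to both inputs or once to the output selects the same lane pairs, which is why the second form needs only one shuffle.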

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #15 from Hongtao.liu --- (In reply to Andrew Pinski from comment #14) > (In reply to Hongtao.liu from comment #13) > > fold shufps to VEC_PERM_EXPR, but still 2 shufps are generated. > > > > __m128 f (__m128 a, __m128 b) > > { > >

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #14 from Andrew Pinski --- (In reply to Hongtao.liu from comment #13) > fold shufps to VEC_PERM_EXPR, but still 2 shufps are generated. > > __m128 f (__m128 a, __m128 b) > { > vector(4) float _3; > vector(4) float _5; >

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #13 from Hongtao.liu --- fold shufps to VEC_PERM_EXPR, but still 2 shufps are generated. __m128 f (__m128 a, __m128 b) { vector(4) float _3; vector(4) float _5; vector(4) float _6; ;; basic block 2, loop depth 0 ;;

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #12 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:0fa4787bf34b173ce6f198e99b6f6dd8a3f98014 commit r12-3177-g0fa4787bf34b173ce6f198e99b6f6dd8a3f98014 Author: liuhongt Date: Fri Dec

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-01-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 Jakub Jelinek changed:
  What    |Removed     |Added
  CC      |            |jakub at gcc dot gnu.org
--- Comment

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-01-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #10 from Hongtao.liu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561909.html. Recording Jakub's comments from another thread here as well: On Tue, Jan 12, 2021 at 11:47:48AM +0100, Jakub Jelinek via Gcc-patches

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #9 from Hongtao.liu --- (In reply to Marc Glisse from comment #8) > (In reply to Richard Biener from comment #4) > > We already handle IX86_BUILTIN_SHUFPD there but not IX86_BUILTIN_SHUFPS for > > some reason. > >

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-08 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #8 from Marc Glisse --- (In reply to Richard Biener from comment #4) > We already handle IX86_BUILTIN_SHUFPD there but not IX86_BUILTIN_SHUFPS for > some reason. https://gcc.gnu.org/pipermail/gcc-patches/2019-May/521983.html I was

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #7 from Richard Biener --- The transform with doubles on the [1] element would produce
        unpckhpd %xmm1, %xmm1
        unpckhpd %xmm0, %xmm0
        mulsd    %xmm1, %xmm0
        unpcklpd %xmm0, %xmm0
so

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #6 from Richard Biener --- Created attachment 49695 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49695&action=edit vector lowering ssa_uniform_vector_p hack

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #5 from Richard Biener --- So __m128d f(__m128d a, __m128d b) { return _mm_mul_pd(_mm_shuffle_pd(a, a, 0), _mm_shuffle_pd(b, b, 0)); } is expanded as _3 = VEC_PERM_EXPR ; _5 = VEC_PERM_EXPR ; _6 = _3 * _5; return _6;
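The two-lane double case quoted above can be sketched the same way in portable C. This is only an illustration under stated assumptions: broadcast_lane0 is a hypothetical stand-in for _mm_shuffle_pd (v, v, 0), which copies lane 0 into both lanes, and the other function names are likewise made up for this sketch.

```c
#include <assert.h>

/* Two-lane double vector via the GCC vector extension. */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical stand-in for _mm_shuffle_pd (v, v, 0): lane 0 in both lanes. */
v2df
broadcast_lane0 (v2df v)
{
  v2df r = { v[0], v[0] };
  return r;
}

/* Original form: shuffle both operands, then multiply (two shuffles). */
v2df
mul_of_broadcasts (v2df a, v2df b)
{
  return broadcast_lane0 (a) * broadcast_lane0 (b);
}

/* Rewritten form: multiply, then shuffle the product (one shuffle).
   Note that this also multiplies lane 1, whose result is discarded. */
v2df
broadcast_of_mul (v2df a, v2df b)
{
  return broadcast_lane0 (a * b);
}

/* Returns 1 when both forms give the same value in both lanes. */
int
mul_forms_agree (v2df a, v2df b)
{
  v2df x = mul_of_broadcasts (a, b);
  v2df y = broadcast_of_mul (a, b);
  return x[0] == y[0] && x[1] == y[1];
}

/* Small self-check with arbitrary sample values. */
void
check_mul_example (void)
{
  v2df a = { 3.0, 4.0 };
  v2df b = { 5.0, 6.0 };
  assert (mul_forms_agree (a, b));
}
```

The rewritten form performs a multiplication on a lane the original never touched, so the results match but the floating-point exception flags raised along the way may not. That appears to be why the later comments tie the floating-point case to options such as -fno-trapping-math.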

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #4 from Richard Biener --- That works only for a single operation and doesn't really scale. I think we want to expose the permutes at the GIMPLE level via ix86_gimple_fold_builtin. We already handle IX86_BUILTIN_SHUFPD there but not

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #3 from Hongtao.liu --- ;; _3 = __builtin_ia32_shufps (b_2(D), b_2(D), 0); (insn 7 6 8 (set (reg:V4SF 88) (reg/v:V4SF 86 [ b ])) "./gcc/include/xmmintrin.h":746:19 -1 (nil)) (insn 8 7 9 (set (reg:V4SF 89)

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 Richard Biener changed:
  What    |Removed     |Added
  Target  |x86_64 i?86 |x86_64-*-* i?86-*-*

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-06 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 Gabriel Ravier changed: What|Removed |Added Summary|[x86] Failure to optimize |[x86] Failure to optimize