[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-08-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #13 from Richard Biener ---
(In reply to Matthias Kretz (Vir) from comment #12)
> (In reply to rguent...@suse.de from comment #11)
> > On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> > > Unless the target has a masked load in

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-08-18 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #12 from Matthias Kretz (Vir) ---
(In reply to rguent...@suse.de from comment #11)
> On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> > Unless the target has a masked load instruction (e.g. AVX512) or ptr is known
> > to

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #11 from rguenther at suse dot de ---
On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
>
> --- Comment #10 from Matthias Kretz (Vir) ---
> (In reply to Richard Biener from

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #10 from Matthias Kretz (Vir) ---
(In reply to Richard Biener from comment #9)
> One issue with
>
> V load3(const unsigned long* ptr)
> {
>   V ret = {};
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
>
> is that we ca

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Richard Biener changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org
--- Comment

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread lee.imple at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #8 from Imple Lee ---
I tried another way to permute the register. Although GCC does generate SIMD instructions, the generated code is sub-optimal. I opened PR114966 for that.

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #7 from Matthias Kretz (Vir) ---
I suspect resolving this is only one part of it. But I'm happy to be proven wrong. :) I opened PR114958 to track the simd implementation change.

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #6 from Richard Biener ---
Thanks, and it might be enough to handle
typedef unsigned long V [[gnu::vector_size(32)]];
V load3(const unsigned long* ptr)
{
  V ret = {};
  __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
  ret
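
For readers following along, the partial-load pattern quoted in comment #6 can be sketched as a complete function. The archived snippet is cut off after `ret`, so the `return ret;` below and the helper `load3_demo` are assumptions added for illustration, not text from the bug report. The sketch relies on GCC's (or Clang's) vector extensions and only shows the source pattern, not the code the compiler generates for it.

```cpp
#include <cassert>

// 32-byte GNU vector of unsigned long, as declared in the quoted comment.
typedef unsigned long V [[gnu::vector_size(32)]];

// Load three elements from ptr into a zero-initialized vector.
// The remaining lane(s) keep the value 0 from `V ret = {}`.
// Assumption: the truncated original simply returns ret.
inline V load3(const unsigned long* ptr)
{
  V ret = {};
  __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
  return ret;
}

// Hypothetical helper: checks the loaded lanes and the zero padding.
inline void load3_demo()
{
  const unsigned long in[3] = {7, 8, 9};
  V v = load3(in);
  assert(v[0] == 7 && v[1] == 8 && v[2] == 9);
  assert(v[3] == 0);
}
```

The memcpy copies exactly three elements, so the function never reads past the end of a three-element array.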

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #5 from Matthias Kretz (Vir) ---
https://godbolt.org/z/P6cfbjT9f
#include
typedef uint64_t T;
typedef T V [[gnu::vector_size(32)]];
typedef struct simd4 { V data; } simd4;
typedef struct simd1 { T data; } simd1;
typede

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #4 from rguenther at suse dot de ---
On Mon, 6 May 2024, mkretz at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
>
> --- Comment #3 from Matthias Kretz (Vir) ---
> The stdx::simd implementation in th

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-06 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #3 from Matthias Kretz (Vir) ---
The stdx::simd implementation in this area is old and mainly tuned to be correct. I can rewrite the split and concat implementation to use __builtin_shufflevector (which wasn't available in GCC at the

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-02 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Richard Biener changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkretz at gcc dot gnu.org,

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-05-01 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Andrew Pinski changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
     Ever confirmed|0