https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #13 from Richard Biener ---
(In reply to Matthias Kretz (Vir) from comment #12)
> (In reply to rguent...@suse.de from comment #11)
> > On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> > > Unless the target has a masked load in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #12 from Matthias Kretz (Vir) ---
(In reply to rguent...@suse.de from comment #11)
> On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> > Unless the target has a masked load instruction (e.g. AVX512) or ptr is
> > known to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #11 from rguenther at suse dot de ---
On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
>
> --- Comment #10 from Matthias Kretz (Vir) ---
> (In reply to Richard Biener from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #10 from Matthias Kretz (Vir) ---
(In reply to Richard Biener from comment #9)
> One issue with
>
> V load3(const unsigned long* ptr)
> {
>   V ret = {};
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
>
> is that we ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Richard Biener changed:
What|Removed |Added
  CC|        |jamborm at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #8 from Imple Lee ---
I tried another way to permute the register.
Although GCC does generate simd instructions, the generated code is
sub-optimal.
I opened PR114966 for that.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #7 from Matthias Kretz (Vir) ---
I suspect resolving this is only one part of it. But I'm happy to be proven
wrong. :)
I opened PR114958 to track the simd implementation change.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #6 from Richard Biener ---
Thanks, and it might be enough to handle
typedef unsigned long V [[gnu::vector_size(32)]];
V load3(const unsigned long* ptr)
{
  V ret = {};
  __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
ret
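
For reference, a self-contained sketch of the kind of test case discussed in this comment. The snippet above is cut off, so the `return ret;` at the end is an assumption, as is the use of `__attribute__` syntax and `{0}` so that the file also compiles as plain C before C23. It copies three elements into a zero-initialized four-element vector and leaves the last lane at zero.

```c
#include <string.h>

/* GNU vector extension: four unsigned long lanes on LP64 targets. */
typedef unsigned long V __attribute__((vector_size(32)));

/* Load three elements from ptr; the remaining lane stays zero.
   Assumption: the truncated original returns ret like this. */
V load3(const unsigned long *ptr)
{
  V ret = {0};
  memcpy(&ret, ptr, 3 * sizeof(unsigned long));
  return ret;
}
```

Whether this compiles to a partial vector load or to scalar loads plus inserts depends on the target and on what the optimizer can prove about ptr.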
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #5 from Matthias Kretz (Vir) ---
https://godbolt.org/z/P6cfbjT9f
#include <stdint.h>
typedef uint64_t T;
typedef T V [[gnu::vector_size(32)]];
typedef struct simd4 {
  V data;
} simd4;
typedef struct simd1 {
  T data;
} simd1;
typede
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #4 from rguenther at suse dot de ---
On Mon, 6 May 2024, mkretz at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
>
> --- Comment #3 from Matthias Kretz (Vir) ---
> The stdx::simd implementation in th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
--- Comment #3 from Matthias Kretz (Vir) ---
The stdx::simd implementation in this area is old and mainly tuned to be
correct. I can rewrite the split and concat implementation to use
__builtin_shufflevector (which wasn't available in GCC at the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Richard Biener changed:
What|Removed |Added
  CC|        |mkretz at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
Andrew Pinski changed:
          What|Removed |Added
     Component|target  |tree-optimization
Ever confirmed|0