Hi Chuck,
This commit breaks 32-bit builds, at least of radeonsi and probably of
other drivers, because malloc()ed structures are only aligned to 8 bytes
there; see
https://bugs.freedesktop.org/show_bug.cgi?id=96835
I presume there are two possible fixes:
1. Drop the alignment on 32-bit.
2. Align affected
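
To illustrate the second option, here is a minimal sketch, assuming a C11
toolchain. The struct and function names are hypothetical and are not taken
from the Mesa tree; aligned_alloc() stands in for whatever aligned allocator
the affected drivers would really use.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a state struct that embeds an aligned color. */
struct example_state {
    int flags;
    alignas(16) float color[4];   /* this member demands 16-byte alignment */
};

/* The member alignment propagates to the struct as a whole. */
static_assert(alignof(struct example_state) == 16,
              "struct inherits the 16-byte requirement");

/* Returns nonzero if p sits on a 16-byte boundary. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Allocate with an explicit alignment instead of plain malloc(), which on
 * 32-bit targets may only hand back 8-byte aligned memory. */
static struct example_state *example_state_create(void)
{
    /* aligned_alloc() requires the size to be a multiple of the alignment;
     * that holds here because sizeof() is padded up to alignof(). */
    return aligned_alloc(alignof(struct example_state),
                         sizeof(struct example_state));
}
```

Memory obtained this way is still released with free(). The first option
would instead drop the alignas() on 32-bit targets so that plain malloc()
remains sufficient.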
Really it's a workaround to fix bad vectorization in the Intel compiler,
but it doesn't hurt other compilers and could only help, even if the
performance difference is marginal, if any. If it were problematic
otherwise, I'd guard it with an #ifdef __INTEL_COMPILER. I can
update
On 28.06.2016 at 22:45, Chuck Atkins wrote:
> This aligns the 4-element color float array to 16 byte boundaries. This
> should allow compiler vectorizers to generate better optimizations.
> Also fixes broken vectorization generated by Intel compiler.
>
> Reported-by: Tim Rowley
On Tue, Jun 28, 2016 at 1:45 PM, Chuck Atkins wrote:
> This aligns the 4-element color float array to 16 byte boundaries. This
> should allow compiler vectorizers to generate better optimizations.
> Also fixes broken vectorization generated by Intel compiler.
>
This aligns the 4-element color float array to 16 byte boundaries. This
should allow compiler vectorizers to generate better optimizations.
Also fixes broken vectorization generated by Intel compiler.
Reported-by: Tim Rowley
Signed-off-by: Chuck Atkins
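
For readers without the diff at hand, the kind of change described above can
be sketched roughly as follows. This is an assumption-laden illustration in
C11, not the actual patch: the type and macro names are hypothetical. The
second variant shows the compiler guard mentioned in the thread, using
__INTEL_COMPILER, the macro the Intel compiler predefines.

```c
#include <assert.h>
#include <stdalign.h>

/* Unconditional variant: a 4-element float color on a 16-byte boundary. */
struct example_color {
    alignas(16) float f[4];
};

/* 4 floats fill exactly one 16-byte slot, so no padding is added. */
static_assert(sizeof(struct example_color) == 16, "no padding expected");
static_assert(alignof(struct example_color) == 16, "16-byte aligned");

/* Guarded variant: request the alignment only under the Intel compiler. */
#if defined(__INTEL_COMPILER)
#  define EXAMPLE_ALIGN16 alignas(16)
#else
#  define EXAMPLE_ALIGN16   /* expands to nothing for other compilers */
#endif

struct example_color_guarded {
    EXAMPLE_ALIGN16 float f[4];
};
```

With the guard in place, other compilers keep the natural alignment of
float, so heap allocations that are only 8-byte aligned stay valid for
structures embedding the color.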