On Sunday, 7 March 2021 at 22:54:32 UTC, tsbockman wrote:
...
result = diffSq[0];
static foreach(i; 0 .. 3)
result += diffSq[i];
...
Oops, that's supposed to say `i; 1 .. 3`. Fixed:
import std.meta : Repeat;
void euclideanDistanceFixedSizeArray(V)(ref Repeat!(3, const(V))
a, r
On Sunday, 7 March 2021 at 22:54:32 UTC, tsbockman wrote:
import std.meta : Repeat;
void euclideanDistanceFixedSizeArray(V)(ref Repeat!(3,
const(V)) a, ref Repeat!(3, const(V)) b, out V result)
if(is(V : __vector(float[length]), size_t length))
...
Resulting asm with is(V == __vector(float
On Sunday, 7 March 2021 at 18:00:57 UTC, z wrote:
On Friday, 26 February 2021 at 03:57:12 UTC, tsbockman wrote:
static foreach(size_t i; 0 .. 3/+typeof(a).length+/){
distance += a[i].abs;//abs required by the caller
(a * a) above is always positive for real numbers. You don't
need the
On Sunday, 7 March 2021 at 13:26:37 UTC, z wrote:
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
However, AVX512 support seems limited to being able to use the
16 other YMM registers, rather than using the same code
template but changed to use ZMM registers and double the
offsets to t
On Sunday, 7 March 2021 at 14:15:58 UTC, z wrote:
On Thursday, 25 February 2021 at 14:28:40 UTC, Guillaume Piolat
wrote:
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed
On Friday, 26 February 2021 at 03:57:12 UTC, tsbockman wrote:
static foreach(size_t i; 0 .. 3/+typeof(a).length+/){
distance += a[i].abs;//abs required by the caller
(a * a) above is always positive for real numbers. You don't
need the call to abs unless you're trying to guarantee that
On Thursday, 25 February 2021 at 14:28:40 UTC, Guillaume Piolat
wrote:
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g.
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
...
It seems that using static foreach with pointer parameters
exclusively is the best way to "guide" LDC into optimizing
code. (Using arr1[] += arr2[] syntax resulted in worse performance
for me.)
However, AVX512 support seems limited to
On Thursday, 25 February 2021 at 14:28:40 UTC, Guillaume Piolat
wrote:
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g.
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g. vmulps, vsqrtps, etc.)
To give some context, this is a sample of one
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
float euclideanDistanceFixedSizeArray(float[3] a, float[3] b) {
Use __vector(float[4]), not float[3].
float distance;
The default value for float is float.nan. You need to explicitly
initialize it to 0.0f or something if you want th
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g. vmulps, vsqrtps, etc.)
To give some context, this is a sample of one of the functions
that could benefit from better SIMD usage:
float euclidea
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g. vmulps, vsqrtps, etc.)
https://code.dlang.org/packages/intel-intrin
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g. vmulps, vsqrtps, etc.)
To give some context, this is a sample of one of the functions
that could benefit from better SIMD usa
14 matches