On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
float euclideanDistanceFixedSizeArray(float[3] a, float[3] b) {

Use __vector(float[4]) (aliased as float4 in core.simd), not float[3].

  float distance;

The default value for float is float.nan, so every += below just leaves distance as NaN. You need to explicitly initialize it (to 0.0f, for example) if you want this function to return anything useful.
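To make the NaN point concrete, here is a minimal sketch. The two helper functions are hypothetical and exist only for illustration; they differ in nothing but the initializer.

```d
import std.math : isNaN;

// Hypothetical helper: accumulator left at its default value, float.nan.
float sumUninitialized(const float[] xs)
{
    float total;            // default-initialized to float.nan
    foreach (x; xs)
        total += x;         // NaN + anything is still NaN
    return total;
}

// Hypothetical helper: same loop, accumulator explicitly zeroed.
float sumInitialized(const float[] xs)
{
    float total = 0.0f;
    foreach (x; xs)
        total += x;
    return total;
}

void main()
{
    assert(isNaN(sumUninitialized([1.0f, 2.0f])));
    assert(sumInitialized([1.0f, 2.0f]) == 3.0f);
}
```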

  a[] -= b[];
  a[] *= a[];

With __vector types, this can be simplified (not optimized) to just:
    a -= b;
    a *= a;

  static foreach(size_t i; 0 .. 3/+typeof(a).length+/){
      distance += a[i].abs;//abs required by the caller

(a * a) above is never negative for real numbers. You don't need the call to abs unless you're trying to guarantee that even nan values will have a clear sign bit.

Also, there is no point in adding the first component to zero, and copying element [0] from a SIMD register into a scalar is free, so this can become:

    float distance = a[0];
    static foreach(size_t i; 1 .. 3)
        distance += a[i];

  }
  return sqrt(distance);
}

Final assembly output (ldc 1.24.0 with -release -O3 -preview=intpromote -preview=dip1000 -m64 -mcpu=haswell -fp-contract=fast -enable-cross-module-inlining):

    vsubps  xmm0, xmm1, xmm0
    vmulps  xmm0, xmm0, xmm0
    vmovshdup       xmm1, xmm0
    vaddss  xmm1, xmm0, xmm1
    vpermilpd       xmm0, xmm0, 1
    vaddss  xmm0, xmm0, xmm1
    vsqrtss xmm0, xmm0, xmm0
    ret
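For reference, the suggestions above could be assembled into something like the following sketch. The function name euclideanDistance and the test values are placeholders, and the use of .array for element access is an assumption made for portability across compilers; the snippets above index the vector directly, which works with ldc.

```d
import core.simd : float4;   // float4 is an alias for __vector(float[4])
import std.math : sqrt;

// Sketch only: distance between two 3D points stored in the first
// three lanes of a float4. The fourth lane is never read.
float euclideanDistance(float4 a, float4 b)
{
    a -= b;                  // per-lane difference
    a *= a;                  // per-lane square, never negative
    float distance = a.array[0];
    static foreach (size_t i; 1 .. 3)
        distance += a.array[i];
    return sqrt(distance);
}

void main()
{
    float4 p = [1.0f, 2.0f, 3.0f, 0.0f];
    float4 q = [4.0f, 6.0f, 3.0f, 0.0f];
    // Differences are (-3, -4, 0), so the squared sum is 25.
    assert(euclideanDistance(p, q) == 5.0f);
}
```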
