On Wednesday, 15 July 2020 at 05:57:56 UTC, tastyminerals wrote:
[snip]

Here is a (WIP) project as of now.
Line 160 in https://github.com/tastyminerals/mir_benchmarks_2/blob/master/source/basic_ops.d

std of [60, 60] matrix 0.0389492 (> 0.001727)
std of [300, 300] matrix 1.03592 (> 0.043452)
std of [600, 600] matrix 4.2875 (> 0.182177)
std of [800, 800] matrix 7.9415 (> 0.345367)

I changed the dflags-ldc to "-mcpu-native -O" and compiled with `dub run --compiler=ldc2`. I got similar results as yours for both in the initial run.

I changed sd to

@fmamath private double sd(T)(Slice!(T*, 1) flatMatrix)
{
    pragma(inline, false);
    if (flatMatrix.empty)
        return 0.0;
    double n = cast(double) flatMatrix.length;
    double mu = flatMatrix.mean;
    return (flatMatrix.map!(a => (a - mu) ^^ 2)
            .sum!"precise" / n).sqrt;
}

and got

std of [10, 10] matrix 0.0016321
std of [20, 20] matrix 0.0069788
std of [300, 300] matrix 2.42063
std of [60, 60] matrix 0.0828711
std of [600, 600] matrix 9.72251
std of [800, 800] matrix 18.1356

And the biggest change by far was the sum!"precise" instead of sum!"fast".

When I ran your benchStd function with
ans = matrix.flattened.standardDeviation!(double, "online", "fast");
I got
std of [10, 10] matrix 1e-07
std of [20, 20] matrix 0
std of [300, 300] matrix 0
std of [60, 60] matrix 1e-07
std of [600, 600] matrix 0
std of [800, 800] matrix 0

I got the same result with Summator.naive. That almost seems too low.

The default is Summator.appropriate, which is resolved to Summator.pairwise in this case. It is faster than Summator.precise, but still slower than Summator.naive or Summator.fast. Your welfordSD should line up with Summator.naive.

When I change that to
ans = matrix.flattened.standardDeviation!(double, "online", "precise");
I get
Running .\mir_benchmarks_2.exe
std of [10, 10] matrix 0.0031737
std of [20, 20] matrix 0.0153603
std of [300, 300] matrix 4.15738
std of [60, 60] matrix 0.171211
std of [600, 600] matrix 17.7443
std of [800, 800] matrix 34.2592

I also tried changing your welfordSD function based on the stuff I mentioned above, but it did not make a large difference.

Reply via email to