On Mon, 9 Nov 2020 at 17:21, Étienne Mollier <etienne.moll...@mailoo.org>
wrote:

> Howdy,
>
> I'm filing a wishlist item in the bug tracker, so that the
> discussion does not disappear inside mail archives.  I ran the
> shapeit4 autopkgtest suite with and without FMA & AVX2
> support, but the run time was 1m25s in both cases on my
> machine (Ryzen 5 3600 w/ 6 cores).  It is quite possible that I
> overlooked some other bottleneck, but the assembler did
> embed AVX2 instructions when I checked the build result.  Out of
> curiosity, does anyone have figures on the performance gain for
> that software when the extensions are available?
>
> Michael R. Crusoe, on 2020-11-05 21:26:30 +0100:
> > As documented at
> > https://wiki.debian.org/SIMDEverywhere
>
> shapeit4 provides a dedicated code path for "-mfma -mavx2" build
> options, and another one for generic builds.  Is it still worth
> using SIMDe in this particular situation?  The "use case"
> paragraph of the wiki page seems to suggest it is not strictly
> needed here.
>

Given the fallback route that doesn't use SIMD, implementing our own is not
necessary. However, compiling the FMA+AVX2 path using SIMDe on non-x86 archs
may result in a speedup for them.
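
To make the "two code paths" idea concrete, here is a minimal sketch in C. It
is not taken from shapeit4: the dot() kernel and the DOT_IMPL macro are
hypothetical names made up for illustration. The file builds the AVX2+FMA path
only when the compiler defines __AVX2__ and __FMA__ (i.e. with "-mavx2 -mfma"),
and the scalar loop otherwise. With SIMDe, as far as I understand it, the idea
would be to include simde/x86/avx2.h and simde/x86/fma.h in place of
immintrin.h and use the simde_-prefixed intrinsics, so that the vector path
also compiles on non-x86 archs; that part is not shown here.

```c
#include <stddef.h>

/* The SIMD path is selected at compile time, as with "-mavx2 -mfma". */
#if defined(__AVX2__) && defined(__FMA__)
#include <immintrin.h>
#define DOT_IMPL "avx2+fma"
#else
#define DOT_IMPL "generic"
#endif

/* Hypothetical kernel for illustration only (not from shapeit4):
 * dot product of two arrays of n doubles. */
double dot(const double *a, const double *b, size_t n)
{
    size_t i = 0;
    double sum = 0.0;

#if defined(__AVX2__) && defined(__FMA__)
    /* Dedicated path: 4 doubles per iteration, fused multiply-add. */
    __m256d acc = _mm256_setzero_pd();
    for (; i + 4 <= n; i += 4) {
        __m256d va = _mm256_loadu_pd(a + i);
        __m256d vb = _mm256_loadu_pd(b + i);
        acc = _mm256_fmadd_pd(va, vb, acc);   /* acc += va * vb */
    }
    double lanes[4];
    _mm256_storeu_pd(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Generic path; also handles the tail left over by the SIMD loop. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}

/* Reports which path was compiled in, handy when comparing timings. */
const char *dot_impl(void)
{
    return DOT_IMPL;
}
```

Building the same file once with "-O2" and once with "-O2 -mavx2 -mfma", then
timing both on a large input, is the kind of comparison that would show
whether the dedicated path pays off.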

It would be best to get a bigger training dataset to confirm the benefit, or
at least the lack of regression :-)

If a performance benefit is observed, it might be interesting to see if the
AVX-only and "lower" SIMD levels on x86 also experience a speedup.
