Your benchmark does not mention the SSE 4.1 version.

Have you tried this use case?

Thanks!

On 04/07/2022 at 19:23, Martijn van Beurden wrote:
On Mon, 4 Jul 2022 at 15:06, olivier tristan <o.tris...@uvi.net> wrote:

    While I can understand the rationale for dropping the hand-written
    assembly, since 32-bit x86 is dead, it seems a much bigger deal to
    remove all optimizations, including the intrinsics-based ones.


Yes, it does seem like a big deal to remove all optimizations, but it really isn't. See the pull request associated with that change for more information: https://github.com/xiph/flac/pull/347

I did quite a bit of testing before merging this change: two different CPUs, each with 3 different compilers, each with 4 variants of the non-intrinsics-accelerated functions. It turns out that there is no performance loss at all, and in many cases this change actually makes flac faster, not slower as one would expect.

    Maybe there should be an opt-in if you don't want them included by
    default, but some people, including me, don't want to see those
    optimizations removed.


That would have no advantage over keeping the original code: it would still need to be maintained and tested, even if it were hidden behind some configuration option. The only case where this patch could be problematic in terms of speed is when flac is compiled for CPUs that do not support SSE2.
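
To make the SSE2 caveat concrete, here is a minimal sketch and not code from libFLAC. Both function names are hypothetical. It assumes the usual predefined macros: `__SSE2__` on GCC and Clang, `_M_X64` or `_M_IX86_FP >= 2` on MSVC. On x86-64, SSE2 is part of the baseline, so a compiler is free to use it for a plain C loop like the one below; on 32-bit x86 it typically has to be enabled explicitly (for example with `-msse2` on GCC).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libFLAC function): returns 1 if this
 * translation unit was compiled with SSE2 enabled, 0 otherwise. */
int built_with_sse2(void)
{
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical plain-C kernel, loosely shaped like one autocorrelation
 * lag: the sum of data[i] * data[i + lag] over all valid i.
 * It uses no intrinsics; whether the compiler vectorizes the loop
 * depends on the target flags and optimization level. */
int64_t lag_product_sum(const int32_t *data, size_t len, size_t lag)
{
    int64_t sum = 0;

    assert(data != NULL);
    assert(lag <= len);

    for (size_t i = 0; i + lag < len; i++)
        sum += (int64_t)data[i] * data[i + lag];

    return sum;
}
```

If `built_with_sse2()` returns 0 for a given build, that build is in the one case described above where the generic C path may be slower than the removed routines.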

--
Olivier Tristan
Research & Development
www.uvi.net