Rearranged patches per Mans' request.
Included Loren's x86 asm improvements for extract_exponents().

Updated benchmarks:

Athlon64:
  current extract exponents:
     63192
  patched extract exponents:
     11649 - clip    - no SIMD
    +16257 - extract - SSE2
    ------
     27906 (126% faster)

Sandy Bridge:
  current extract exponents:
     26924
  patched extract exponents:
      1812 - clip    - SSE4.1
    + 4713 - extract - SSSE3
    ------
      6525 (313% faster)
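
For reference, the "N% faster" figures are consistent with
(old / new - 1) * 100 applied to the totals above. A small C sketch of
that arithmetic follows; the function name is made up for illustration,
and the unit of the counts is simply whatever the benchmark timer
reported, which is not stated here.

```c
#include <assert.h>

/* Hypothetical helper: how much faster the patched code is, in percent,
 * given the old and new timer counts. A result of 100.0 means the new
 * code takes half as long as the old code. */
static double percent_faster(long old_count, long new_count)
{
    assert(new_count > 0); /* avoid division by zero */
    return ((double)old_count / (double)new_count - 1.0) * 100.0;
}
```

Calling percent_faster(63192, 27906) gives roughly 126.4, and
percent_faster(26924, 6525) gives roughly 312.6, which round to the
126% and 313% quoted above.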

Justin Ruggles (3):
  ac3enc: add int32_t array clipping function to DSPUtil, including x86
    versions.
  ac3enc: simplify ac3dsp.extract_exponents() by clipping coefficients
    prior to exponent extraction.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents()

 libavcodec/ac3dsp.c             |   16 +----
 libavcodec/ac3enc.c             |    4 +
 libavcodec/dsputil.c            |   17 ++++++
 libavcodec/dsputil.h            |   14 +++++
 libavcodec/x86/ac3dsp.asm       |   62 ++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c     |    7 ++
 libavcodec/x86/dsputil_mmx.c    |   24 ++++++++
 libavcodec/x86/dsputil_yasm.asm |  118 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 249 insertions(+), 13 deletions(-)
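
To illustrate the idea behind the second patch (clip first, then
extract), here is a rough plain-C sketch. It is not the code from the
patches: the function names, the COEF_MAX constant, and the assumption
that coefficients are clipped to a symmetric 24-bit range are all
chosen for illustration. Once the input is known to be in range, the
extraction loop no longer needs its own range checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COEF_MAX 16777215 /* assumed limit: 2^24 - 1 */

/* Hypothetical clip helper: clamp every element of src into [min, max]. */
static void clip_int32_array(int32_t *dst, const int32_t *src,
                             int32_t min, int32_t max, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int32_t v = src[i];
        dst[i] = v < min ? min : (v > max ? max : v);
    }
}

/* floor(log2(v)) for v > 0 */
static int ilog2(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Sketch of exponent extraction on already-clipped coefficients.
 * Zero maps to exponent 24; otherwise the exponent is 23 - floor(log2(|c|)),
 * which stays in 0..23 because |c| <= COEF_MAX. */
static void extract_exponents_sketch(uint8_t *exp, const int32_t *coef,
                                     size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int32_t c = coef[i];
        assert(c >= -COEF_MAX && c <= COEF_MAX); /* caller must clip first */
        uint32_t v = (uint32_t)(c < 0 ? -c : c);
        exp[i] = v ? (uint8_t)(23 - ilog2(v)) : 24;
    }
}
```

In this sketch the caller would run clip_int32_array() with -COEF_MAX
and COEF_MAX over the coefficient buffer and then pass the result to
extract_exponents_sketch(). Splitting the work this way is also what
lets each step be timed and vectorized separately, as in the benchmark
breakdown above.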
_______________________________________________
libav-devel mailing list
https://lists.libav.org/mailman/listinfo/libav-devel