Rearranged patches per Mans' request.
Included Loren's x86 asm improvements for extract_exponents().

Updated benchmarks:

Athlon64:
current extract exponents:
 63192
patched extract exponents:
 11649 - clip    - no SIMD
+16257 - extract - SSE2
------
 27906 (126% faster)

Sandy Bridge:
current extract exponents:
 26924
patched extract exponents:
  1812 - clip    - SSE4.1
+ 4713 - extract - SSSE3
------
  6525 (313% faster)
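
For reference, the "% faster" figures above follow from the ratio of the
old total to the new total. A minimal sketch of that arithmetic, where
percent_faster() is a hypothetical helper name used only for this
illustration and is not part of the patches:

```c
/* Hypothetical helper, for illustration only.
 * Returns how much faster the new timing is than the old one, in percent:
 * (old / new - 1) * 100. */
static double percent_faster(double old_time, double new_time)
{
    return (old_time / new_time - 1.0) * 100.0;
}
```

Rounded to the nearest integer, percent_faster(63192, 27906) gives 126 and
percent_faster(26924, 6525) gives 313, matching the totals listed above.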


Justin Ruggles (3):
  ac3enc: add int32_t array clipping function to DSPUtil, including x86
    versions.
  ac3enc: simplify ac3dsp.extract_exponents() by clipping coefficients
    prior to exponent extraction.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents()
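
The idea behind the second patch can be sketched roughly as follows. This
is a plain C illustration, not the code from the patches: the function
names, the signatures and the 24-bit coefficient range are assumptions
made for the example. Once the coefficients are known to be clipped, the
extraction loop no longer needs any range checks of its own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: clamp each element of src into [min, max]. */
static void clip_int32_array(int32_t *dst, const int32_t *src,
                             int32_t min, int32_t max, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        int32_t v = src[i];
        dst[i] = v < min ? min : (v > max ? max : v);
    }
}

/* floor(log2(v)) for v > 0. */
static int ilog2_u32(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical sketch: assumes coef[] was already clipped to
 * +/-(2^24 - 1), so the result always lies in 0..24 and no
 * per-element range check is needed here. */
static void extract_exponents_sketch(uint8_t *exp, const int32_t *coef,
                                     size_t nb_coefs)
{
    for (size_t i = 0; i < nb_coefs; i++) {
        uint32_t v = (uint32_t)abs(coef[i]);
        exp[i] = v ? (uint8_t)(23 - ilog2_u32(v)) : 24;
    }
}
```

Splitting the work this way leaves two simple loops with no data-dependent
branches between elements, which is the kind of loop that maps well onto
SIMD.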

 libavcodec/ac3dsp.c             |   16 +----
 libavcodec/ac3enc.c             |    4 +
 libavcodec/dsputil.c            |   17 ++++++
 libavcodec/dsputil.h            |   14 +++++
 libavcodec/x86/ac3dsp.asm       |   62 ++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c     |    7 ++
 libavcodec/x86/dsputil_mmx.c    |   24 ++++++++
 libavcodec/x86/dsputil_yasm.asm |  118 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 249 insertions(+), 13 deletions(-)