Currently coefficients and exponents are clipped at the same time, making
the extract_exponents() function unnecessarily complex. This patch set
makes the encoder clip the MDCT coefficients prior to exponent extraction,
which allows for easier (and more specific selection of) optimizations for
each function.
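
To illustrate the split, here is a minimal C sketch of the two steps as
separate functions. It is not the code from the patches: the names
clip_int32_array(), ilog2() and extract_exponents_sketch() are hypothetical,
and the 24-bit coefficient limit (COEF_MAX) and the exponent value of 24 for a
zero coefficient are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: coefficients are limited to 24 bits of magnitude. */
#define COEF_MAX 16777215

/* Hypothetical helper: clip each element of an int32_t array to [min, max]. */
static void clip_int32_array(int32_t *v, int len, int32_t min, int32_t max)
{
    assert(min <= max);
    for (int i = 0; i < len; i++) {
        if (v[i] < min)
            v[i] = min;
        else if (v[i] > max)
            v[i] = max;
    }
}

/* floor(log2(v)) for v > 0. */
static int ilog2(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical extraction step. The input must already be clipped to
 * +/-COEF_MAX, so no range check is needed inside the loop.
 * Assumption: the exponent is the number of leading zero bits of the
 * 24-bit magnitude, and 24 for a zero coefficient. */
static void extract_exponents_sketch(uint8_t *exp, const int32_t *coef, int len)
{
    for (int i = 0; i < len; i++) {
        uint32_t mag = coef[i] < 0 ? -(uint32_t)coef[i] : (uint32_t)coef[i];
        exp[i] = mag ? 23 - ilog2(mag) : 24;
    }
}
```

With the range check moved out of the extraction loop, each loop does one
simple thing, so each can be given its own SIMD version.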

The coefficient clipping could be combined with some other optimized function,
such as coefficient scaling, to possibly improve overall speed. Unfortunately,
clipping on x86 without SSE4.1 is quite slow. On Athlon64 the non-SIMD version
is actually faster than the SSE2 version, because cmov instructions are fast
and SSE2 is slow on that CPU.

Here are some benchmarks:

Athlon64:
current extract exponents:
 63192
patched extract exponents:
 11649 - clip    - no SIMD
+19737 - extract - SSE2
------
 31386 (101% faster)

Sandy Bridge:
current extract exponents:
 26924
patched extract exponents:
  1812 - clip    - SSE4.1
+ 5943 - extract - SSSE3
------
  7755 (247% faster)

Atom 330:
current extract exponents:
133187
patched extract exponents:
 14705 - clip    - SSE2
+27533 - extract - SSE2
------
 42238 (215% faster)
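
For reference, the percentages above are consistent with computing
(current / patched - 1) * 100 and dropping the fractional part. A small
sketch, where percent_faster() is a hypothetical helper and not part of the
patches:

```c
/* Percentage by which the `after` measurement is faster than `before`. */
static double percent_faster(double before, double after)
{
    return (before / after - 1.0) * 100.0;
}
```

For the Athlon64 numbers, percent_faster(63192, 31386) gives about 101.3.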

Justin Ruggles (4):
  ac3enc: extract all exponents for the frame at once
  ac3enc: simplify ac3dsp.extract_exponents() by clipping coefficients
    prior to exponent extraction.
  ac3enc: move int32_t array clipping function to DSPUtil and add x86
    versions.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents()

 libavcodec/ac3dsp.c             |   16 +----
 libavcodec/ac3enc.c             |   17 +++--
 libavcodec/dsputil.c            |   17 +++++
 libavcodec/dsputil.h            |   14 ++++
 libavcodec/x86/ac3dsp.asm       |   72 +++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c     |    5 ++
 libavcodec/x86/dsputil_mmx.c    |   27 +++++++
 libavcodec/x86/dsputil_yasm.asm |  146 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 293 insertions(+), 21 deletions(-)
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
