On Mon, Jan 04, 2016 at 06:33:59PM -0800, Ganesh Ajjanagadde wrote:
> This exploits an approach based on the sieve of Eratosthenes, a popular
> method for generating prime numbers.
>
> Tables are identical to previous ones.
>
> Tested with FATE with/without --enable-hardcoded-tables.
>
> Sample benchmark (Haswell, GNU/Linux+gcc):
> prev:
> 7860100 decicycles in cbrt_tableinit, 1 runs, 0 skips
> 7777490 decicycles in cbrt_tableinit, 2 runs, 0 skips
> [...]
> 7582339 decicycles in cbrt_tableinit, 256 runs, 0 skips
> 7563556 decicycles in cbrt_tableinit, 512 runs, 0 skips
>
> new:
> 2099480 decicycles in cbrt_tableinit, 1 runs, 0 skips
> 2044470 decicycles in cbrt_tableinit, 2 runs, 0 skips
> [...]
> 1796544 decicycles in cbrt_tableinit, 256 runs, 0 skips
> 1791631 decicycles in cbrt_tableinit, 512 runs, 0 skips
>
> Both small and large run count given as this is called once so small run
> count may give a better picture, small numbers are fairly consistent,
> and there is a consistent downward trend from small to large runs,
> at which point it stabilizes to a new value.
>
> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
> ---
>  libavcodec/aacdec_fixed.c           |  4 +--
>  libavcodec/aacdec_template.c        |  2 +-
>  libavcodec/cbrt_tablegen.h          | 53 ++++++++++++++++++++++++++-----------
>  libavcodec/cbrt_tablegen_template.c | 12 ++++++++-
>  4 files changed, 51 insertions(+), 20 deletions(-)
>
> diff --git a/libavcodec/aacdec_fixed.c b/libavcodec/aacdec_fixed.c
> index 396a874..f7b882b 100644
> --- a/libavcodec/aacdec_fixed.c
> +++ b/libavcodec/aacdec_fixed.c
> @@ -155,9 +155,9 @@ static void vector_pow43(int *coefs, int len)
>      for (i=0; i<len; i++) {
>          coef = coefs[i];
>          if (coef < 0)
> -            coef = -(int)cbrt_tab[-coef];
> +            coef = -(int)cbrt_tab[-coef].i;
>          else
> -            coef = (int)cbrt_tab[coef];
> +            coef = (int)cbrt_tab[coef].i;
>          coefs[i] = coef;
>      }
>  }
> diff --git a/libavcodec/aacdec_template.c b/libavcodec/aacdec_template.c
> index d819958..1380510 100644
> --- a/libavcodec/aacdec_template.c
> +++ b/libavcodec/aacdec_template.c
> @@ -1791,7 +1791,7 @@ static int decode_spectrum_and_dequant(AACContext *ac, INTFLOAT coef[1024],
>                              v = -v;
>                          *icf++ = v;
>  #else
> -                        *icf++ = cbrt_tab[n] | (bits & 1U<<31);
> +                        *icf++ = cbrt_tab[n].i | (bits & 1U<<31);
>  #endif /* USE_FIXED */
>                          bits <<= 1;
>                      } else {
> diff --git a/libavcodec/cbrt_tablegen.h b/libavcodec/cbrt_tablegen.h
> index 59b5a1d..e3d6634 100644
> --- a/libavcodec/cbrt_tablegen.h
> +++ b/libavcodec/cbrt_tablegen.h
> @@ -26,14 +26,13 @@
>  #include <stdint.h>
>  #include <math.h>
>  #include "libavutil/attributes.h"
> +#include "libavutil/intfloat.h"
>  #include "libavcodec/aac_defines.h"
>
> -#if USE_FIXED
> -#define CBRT(x) lrint((x).f * 8192)
> -#else
> -#define CBRT(x) x.i
> -#endif
> -
> +union ff_int32float64 {
> +    uint32_t i;
> +    double f;
> +};
>  #if CONFIG_HARDCODED_TABLES
>  #if USE_FIXED
>  #define cbrt_tableinit_fixed()
> @@ -43,20 +42,42 @@
>  #include "libavcodec/cbrt_tables.h"
>  #endif
>  #else
> -static uint32_t cbrt_tab[1 << 13];
> +static union ff_int32float64 cbrt_tab[1 << 13];

this doubles the size of the cpu cache needed at runtime to store the
same number of elements

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

it is not once nor twice but times without number that the same ideas make
their appearance in the world. -- Aristotle
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel