[FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-08 Thread Rostislav Pehlivanov
Performance improvements:
quant_bands:
with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
Around 42% for the function
Twoloop coder:
abs_pow34:
with/without: 7.82s/8.17s
Around 4% for the entire encoder
Both: w
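The quoted percentages can be checked against the raw figures. The following is a minimal sketch, assuming each percentage is the fraction of time saved relative to the unoptimized run; `time_saved` is a hypothetical helper name and is not part of the patch.

```c
/* Fraction of time saved when going from `without` to `with`.
 * Both arguments must use the same unit (decicycles, seconds, ...).
 * Hypothetical helper for illustration only. */
static double time_saved(double without, double with)
{
    return (without - with) / without;
}
```

With the numbers above, `time_saved(1190.0, 681.0)` is roughly 0.428 and `time_saved(8.17, 7.82)` is roughly 0.043, consistent with "around 42%" for the function and "around 4%" for the entire encoder.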

Re: [FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-08 Thread Michael Niedermayer
On Sat, Oct 08, 2016 at 06:42:28PM +0100, Rostislav Pehlivanov wrote:
> Performance improvements:
>
> quant_bands:
> with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> without: 1190 decicycles in quant_bands, 8388386 runs, 222 skips
> Around 42% for the function
>
> Twoloop

Re: [FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-09 Thread Rostislav Pehlivanov
On 9 October 2016 at 03:18, Michael Niedermayer wrote:
> On Sat, Oct 08, 2016 at 06:42:28PM +0100, Rostislav Pehlivanov wrote:
> > Performance improvements:
> >
> > quant_bands:
> > with: 681 decicycles in quant_bands, 8388453 runs, 155 skips
> > without: 1190 decicycles in quant_bands, 83

Re: [FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-09 Thread Henrik Gramner
On Sun, Oct 9, 2016 at 2:15 PM, Rostislav Pehlivanov wrote:
> +cglobal aac_quantize_bands, 6, 6, 6, out, in, scaled, size, is_signed,
> maxval, Q34, rounding
Now that this function is SSE2 you should explicitly use floating-point instructions to avoid bypass delays from transitioning between int
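The point about staying in the floating-point domain can be illustrated in C with SSE intrinsics rather than the patch's assembly. This is only a sketch under stated assumptions: `abs_float_domain` is a hypothetical helper, not code from the patch, and it shows a sign-bit clear done with a float-domain bitwise instruction (`andnps` via `_mm_andnot_ps`) instead of its integer-domain counterpart, so the data never crosses execution domains between the load and the store. On targets without SSE the scalar fallback is used.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1
#else
#define HAVE_SSE_SKETCH 0
#endif

/* Hypothetical helper: out[i] = |in[i]| for i in [0, n).
 * The SSE path uses only float-domain instructions. */
static void abs_float_domain(float *out, const float *in, size_t n)
{
    size_t i = 0;
#if HAVE_SSE_SKETCH
    /* -0.0f has only the sign bit set in each lane. */
    const __m128 sign_mask = _mm_set1_ps(-0.0f);
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(in + i);   /* unaligned float load */
        v = _mm_andnot_ps(sign_mask, v);   /* (~mask) & v: clears the sign bit */
        _mm_storeu_ps(out + i, v);         /* unaligned float store */
    }
#endif
    /* Scalar tail, and the whole loop when SSE is unavailable. */
    for (; i < n; i++)
        out[i] = in[i] < 0.0f ? -in[i] : in[i];
}
```

The integer-domain equivalent (`_mm_andnot_si128` on cast values) would produce the same bits; the difference is only in which execution unit handles the operation, which is where the bypass delay mentioned above can arise on some microarchitectures.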

Re: [FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-09 Thread Michael Niedermayer
On Sun, Oct 09, 2016 at 01:15:44PM +0100, Rostislav Pehlivanov wrote:
> On 9 October 2016 at 03:18, Michael Niedermayer wrote:
>
> > On Sat, Oct 08, 2016 at 06:42:28PM +0100, Rostislav Pehlivanov wrote:
> > > Performance improvements:
> > >
> > > quant_bands:
> > > with: 681 decicycles in q

Re: [FFmpeg-devel] [PATCH v2] aacenc: add SIMD optimizations for abs_pow34 and quantization

2016-10-09 Thread Henrik Gramner
On Sun, Oct 9, 2016 at 5:04 PM, Michael Niedermayer wrote:
> this segfaults on x86-32
I'm guessing due to unaligned local arrays in search_for_ms():
float M[128], S[128];
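Aligned SSE loads and stores fault when the address is not a multiple of 16 bytes, and a plain `float` array on the stack is only guaranteed 4-byte alignment, which would explain a crash that shows up on x86-32 only. The sketch below is an illustration, not the fix that was applied: it uses the standard C11 `_Alignas` specifier (FFmpeg has its own alignment macros for this purpose), and `is_aligned16` and `scratch_arrays_aligned` are hypothetical names. Only the array names `M` and `S` come from the message above.

```c
#include <stdint.h>

/* Returns 1 if p is a multiple of 16 bytes, i.e. usable with
 * aligned 128-bit loads and stores. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical stand-in for a function with local scratch arrays.
 * _Alignas(16) asks the compiler to place each array on a
 * 16-byte boundary regardless of the incoming stack alignment. */
static int scratch_arrays_aligned(void)
{
    _Alignas(16) float M[128];
    _Alignas(16) float S[128];

    M[0] = 0.0f;
    S[0] = 0.0f;
    return is_aligned16(M) && is_aligned16(S);
}
```

The alternative is to leave the arrays as they are and have the SIMD code use unaligned loads, at some cost on older CPUs.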