Re: [FFmpeg-devel] [PATCH] libavcodec Adding ff_v210_planar_unpack AVX2

2019-03-27 Thread James Darnley
On 2019-03-26 21:22, Mike Stoner via ffmpeg-devel wrote:
> Hello,
> I’ve accounted for all feedback on this so far, I’m wondering if it is ready
> to be pushed upstream?
>
> Here are my results from ‘checkasm’ (lower is better):
>
> v210_unpack_c: 1636
> v210_unpack_ssse3: 611
> v210_unpack_avx:

Re: [FFmpeg-devel] [PATCH] libavcodec Adding ff_v210_planar_unpack AVX2

2019-03-26 Thread Mike Stoner via ffmpeg-devel
Hello,
I’ve accounted for all feedback on this so far, I’m wondering if it is ready to be pushed upstream?
Here are my results from ‘checkasm’ (lower is better):
v210_unpack_c: 1636
v210_unpack_ssse3: 611
v210_unpack_avx: 601
v210_unpack_avx2: 423
I ran it 5 times and averaged the middle 3 resu
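As a rough illustration of how the figures in this message relate to each other, the sketch below computes speedup ratios from the four quoted checkasm numbers and shows one way to average the middle three of five runs. The helper names `speedup` and `middle_mean` are hypothetical, and the five sample values in the usage section are made up for demonstration; they are not measurements from the thread. The AVX to AVX2 ratio comes out at roughly 1.4, which agrees with the claim in the original patch description.

```python
# Cycle counts quoted in the message (lower is better).
checkasm_cycles = {
    "v210_unpack_c": 1636,
    "v210_unpack_ssse3": 611,
    "v210_unpack_avx": 601,
    "v210_unpack_avx2": 423,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


def middle_mean(samples, keep=3):
    """Average the `keep` middle values of `samples` after sorting.

    Hypothetical helper: with 5 samples and keep=3 it drops the single
    lowest and single highest value before averaging.
    """
    ordered = sorted(samples)
    drop = (len(ordered) - keep) // 2
    middle = ordered[drop:drop + keep]
    return sum(middle) / len(middle)


if __name__ == "__main__":
    c_cycles = checkasm_cycles["v210_unpack_c"]
    for name, cycles in checkasm_cycles.items():
        print(f"{name}: {speedup(c_cycles, cycles):.2f}x vs C")

    avx_vs_avx2 = speedup(
        checkasm_cycles["v210_unpack_avx"],
        checkasm_cycles["v210_unpack_avx2"],
    )
    print(f"AVX2 vs AVX: {avx_vs_avx2:.2f}x")

    # Made-up run values, only to show the averaging step.
    example_runs = [430, 419, 425, 460, 421]
    print(f"middle-3 mean: {middle_mean(example_runs):.1f}")
```

Dropping the extremes before averaging reduces the influence of a single noisy run, which is presumably why the poster reported the middle three rather than all five.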

Re: [FFmpeg-devel] [PATCH] libavcodec Adding ff_v210_planar_unpack AVX2

2019-03-16 Thread Mike Stoner
Hello, I resent my AVX2 patch for v210 unpacking. My first attempt didn't get picked up by the Patchwork list for some reason. I installed Linux on a Broadwell laptop to utilize James Darnley's checkasm patch for v210 decode. The results are below. AVX2 gets a nice boost from replacing SHUF

[FFmpeg-devel] [PATCH] libavcodec Adding ff_v210_planar_unpack AVX2

2019-03-16 Thread Michael Stoner
Replaced VSHUFPS with VPBLENDD to relieve port 5 bottleneck
AVX2 is 1.4x faster than AVX
---
 libavcodec/v210dec.c       | 10 +-
 libavcodec/x86/v210-init.c |  8 +
 libavcodec/x86/v210.asm    | 72 +-
 3 files changed, 73 insertions(+), 17 deletions(-)
di