Hello,
I have re-sent my AVX2 patch for v210 unpacking.  My first attempt didn't get picked 
up by the Patchwork list for some reason.

I installed Linux on a Broadwell laptop to utilize James Darnley's checkasm 
patch for v210 decode.  The results are below.  

AVX2 gets a nice boost from replacing SHUFPS instructions with VPBLENDD, which 
has more flexible port bindings.  BLENDPS (VBLENDPS when VEX-encoded) could also 
be substituted and is available from SSE4.1 onward; however, I found that only 
the AVX2 code received any measurable gain from that change.
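
To illustrate why a blend can stand in for a shuffle at all, here is a minimal 
scalar sketch in C of the two lane-selection rules.  The function names, the 
0xE4 immediate and the 0xC mask are hypothetical, chosen for illustration only; 
they are not taken from the patch, and the model says nothing about port usage 
or timing.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of SHUFPS on four 32-bit lanes (one 128-bit lane).
 * The two low result lanes are picked from a, the two high ones from b,
 * each by a 2-bit index taken from the immediate. */
static void model_shufps(uint32_t out[4], const uint32_t a[4],
                         const uint32_t b[4], unsigned imm)
{
    out[0] = a[imm & 3];
    out[1] = a[(imm >> 2) & 3];
    out[2] = b[(imm >> 4) & 3];
    out[3] = b[(imm >> 6) & 3];
}

/* Scalar model of VPBLENDD on four 32-bit lanes (one 128-bit lane).
 * Bit i of the mask picks lane i from b, otherwise from a.
 * Lanes never change position. */
static void model_blendd(uint32_t out[4], const uint32_t a[4],
                         const uint32_t b[4], unsigned mask)
{
    for (int i = 0; i < 4; i++)
        out[i] = ((mask >> i) & 1) ? b[i] : a[i];
}

/* Returns 1 when the illustrative shuffle (imm 0xE4, which keeps every
 * selected lane in place) and the blend (mask 0xC) give the same result. */
static int shuffle_matches_blend(const uint32_t a[4], const uint32_t b[4])
{
    uint32_t s[4], v[4];
    model_shufps(s, a, b, 0xE4);
    model_blendd(v, a, b, 0xC);
    return memcmp(s, v, sizeof s) == 0;
}
```

A blend can only replace a shuffle whose selected lanes stay in place, as in 
this example.  Where the shuffle also moves lanes, the data has to be 
pre-arranged by another instruction first.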

Any further comments are greatly appreciated.  

Thanks,
Mike


Tested on Broadwell CPU, Ubuntu 18.10 x86_64

~/FFmpeg$ tests/checkasm/checkasm --bench --test=v210dec
benchmarking with native FFmpeg timers
nop: 94.1
checkasm: using random seed 3963743306
SSSE3:
 - v210dec.v210_unpack [OK]
AVX:
 - v210dec.v210_unpack [OK]
AVX2:
 - v210dec.v210_unpack [OK]
checkasm: all 3 tests passed
v210_unpack_c: 1625.2
v210_unpack_ssse3: 604.2
v210_unpack_avx: 592.2
v210_unpack_avx2: 422.2
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
