[FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512

2022-10-20 Thread Kieran Kunhya
Hi, Please see attached an attempt to optimise the 8-bit input to v210enc to reduce the number of shuffles. This comes at the cost of having to extract the middle element and perform a DWORD shift on it and then reinserting it. I have added a few comments but any other ideas are welcome. Crude be
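For readers unfamiliar with the format, here is a minimal scalar sketch of what packing 8-bit input into v210 involves; it is not the attached patch. v210 stores three 10-bit components per 32-bit little-endian word, so each 8-bit sample is shifted left by 2 to reach 10 bits, and the middle component of each word lands at bit 12, which is presumably the element the DWORD shift above refers to. The names `clip8`, `pack3`, and `pack_v210_8bit` are hypothetical, and the 1..254 clamp is an assumption about how reserved code values are avoided.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: clamp 8-bit samples to 1..254 so the reserved codes are avoided. */
static uint32_t clip8(uint8_t v)
{
    return v < 1 ? 1u : (v > 254 ? 254u : (uint32_t)v);
}

/* Pack three 8-bit samples into one v210 word.
 * Each sample is widened to 10 bits (<< 2) and placed at bit 0, 10, or 20,
 * giving total shifts of 2, 12, and 22. */
static uint32_t pack3(uint8_t a, uint8_t b, uint8_t c)
{
    return (clip8(a) << 2) | (clip8(b) << 12) | (clip8(c) << 22);
}

/* Pack 6 pixels (6 luma, 3 Cb, 3 Cr samples) into 4 words (16 bytes).
 * Words are returned in host order; a real encoder would store them
 * little-endian. */
static void pack_v210_8bit(const uint8_t *y, const uint8_t *u,
                           const uint8_t *v, uint32_t dst[4])
{
    dst[0] = pack3(u[0], y[0], v[0]); /* Cb0 Y0 Cr0 */
    dst[1] = pack3(y[1], u[1], y[2]); /* Y1  Cb1 Y2 */
    dst[2] = pack3(v[1], y[3], u[2]); /* Cr1 Y3 Cb2 */
    dst[3] = pack3(y[4], v[2], y[5]); /* Y4  Cr2 Y5 */
}
```

A SIMD version has to gather the same three planes into this interleaved order, which is where the shuffles under discussion come from.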

Re: [FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512

2022-10-21 Thread Henrik Gramner
On Fri, Oct 21, 2022 at 5:41 AM Kieran Kunhya wrote:
>
> Hi,
>
> Please see attached an attempt to optimise the 8-bit input to v210enc to
> reduce the number of shuffles.
> This comes at the cost of having to extract the middle element and perform
> a DWORD shift on it and then reinserting it.
> I

Re: [FFmpeg-devel] [PATCH] RFC: v210enc optimisations and initial AVX-512

2022-10-26 Thread James Darnley
I guess it could also be scaled to ymm if you're a big Skylake fan :P (in which case you'd probably want to reorder the shuffle indices so