On Fri, Jun 25, 2021 at 1:26 PM Ronald S. Bultje <rsbul...@gmail.com> wrote:

> Hi Alan,
>
> On Fri, Jun 25, 2021 at 3:59 AM Alan Kelly <
> alankelly-at-google....@ffmpeg.org> wrote:
>
>> These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available.
>>
>
> Re-asking a question I asked before in the other thread:
>
> Also, what is the cycle count of ssse3/avx2 implementation for this
> specific function on Haswell? It would be good to note that in the
> respective patch so that we understand why the check was added.
>
> You should be able to find this in the checkasm --bench --test=X numbers
> for this relevant function.
>
> Ronald
>

Hi Ronald,

                                Skylake   Haswell
hscale_8_to_15_width4_ssse3       761.2     760
hscale_8_to_15_width4_avx2        468.7     957
hscale_8_to_15_width8_ssse3      1170.7    1032
hscale_8_to_15_width8_avx2        865.7    1979
hscale_8_to_15_width12_ssse3     2172.2    2472
hscale_8_to_15_width12_avx2      1245.7    2901
hscale_8_to_15_width16_ssse3     2244.2    2400
hscale_8_to_15_width16_avx2      1647.2    3681

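As you can see, avx2 is faster than ssse3 at every width on Skylake, but it
is catastrophic on Haswell: slower at every width, and close to 2x slower at
width 8 (1979 vs 1032). In the next iteration of the patch, I will update the
description with these numbers.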
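
For anyone who wants the ratios rather than the raw counts, here is a quick
sketch that computes avx2/ssse3 per width from the table above. It is not part
of the patch; the names `cycles` and `avx2_over_ssse3` are made up for
illustration, and the numbers are simply copied from the table.

```python
# Benchmark counts copied from the table above, as (Skylake, Haswell) pairs.
# The structure and names here are illustrative only, not from the patch.
cycles = {
    4:  {"ssse3": (761.2, 760.0),   "avx2": (468.7, 957.0)},
    8:  {"ssse3": (1170.7, 1032.0), "avx2": (865.7, 1979.0)},
    12: {"ssse3": (2172.2, 2472.0), "avx2": (1245.7, 2901.0)},
    16: {"ssse3": (2244.2, 2400.0), "avx2": (1647.2, 3681.0)},
}


def avx2_over_ssse3(width):
    """Return (skylake_ratio, haswell_ratio) for one filter width.

    A ratio above 1.0 means the avx2 version took longer than ssse3.
    """
    ssse3 = cycles[width]["ssse3"]
    avx2 = cycles[width]["avx2"]
    return tuple(a / s for a, s in zip(avx2, ssse3))


if __name__ == "__main__":
    for width in sorted(cycles):
        skylake, haswell = avx2_over_ssse3(width)
        print(f"width{width:<2}  Skylake {skylake:.2f}x  Haswell {haswell:.2f}x")
```

A ratio below 1.0 means avx2 wins for that width on that machine; above 1.0
means ssse3 wins.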

Thanks
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
