On Mon, 26 Sep 2022, Grzegorz Bernacki wrote:

Add a vectorized implementation of the nsse8 function.

Performance comparison tests are shown below.
- nsse_1_c: 256.0
- nsse_1_neon: 82.7

Benchmarks and tests were run with the checkasm tool on AWS Graviton 3.

Signed-off-by: Grzegorz Bernacki <g...@semihalf.com>
---
libavcodec/aarch64/me_cmp_init_aarch64.c | 15 ++++
libavcodec/aarch64/me_cmp_neon.S         | 99 ++++++++++++++++++++++++
2 files changed, 114 insertions(+)

Looks reasonable to me, but do check to make sure there are no tabs.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
