Re: [FFmpeg-devel] [PATCH v2 0/7] arm64 neon implementation for 8bits functions

Martin Storsjö Tue, 04 Oct 2022 03:57:15 -0700

On Mon, 3 Oct 2022, Grzegorz Bernacki wrote:

Changes since v1:
- changed tabs to spaces
- modified branch instruction in vsse8
- apply Martin's patches with improved instructions scheduling

Grzegorz Bernacki (4):
 lavc/aarch64: Add neon implementation for pix_abs8 functions.
 lavc/aarch64: Provide neon implementation of nsse8
 lavc/aarch64: Provide optimized implementation of vsse8 for arm64.
 lavc/aarch64: Add neon implementation for vsse_intra8

Martin Storsjö (3):
 aarch64: me_cmp: Improve scheduling in ff_pix_abs8_y2_neon
 aarch64: me_cmp: Fix up the prologue of ff_pix_abs8_xy2_neon
 aarch64: me_cmp: Improve scheduling in vsse_intra8

libavcodec/aarch64/me_cmp_init_aarch64.c |  33 ++
libavcodec/aarch64/me_cmp_neon.S         | 414 +++++++++++++++++++++++
2 files changed, 447 insertions(+)


Thanks! This mostly looked good to me.

I had actually meant that you would squash my fixes into your patches, instead of keeping them as separate ones.

After squashing such changes, it might have been interesting to get updated benchmarks in those commit messages (the ones that you have from Graviton 3). However, in this case, these changes didn't really make much difference on out-of-order cores, only on in-order cores, so I guess there's not that much value in getting updated benchmarks from Graviton 3 in this case.

So I went ahead and squashed those patches (and added co-authored-by lines where relevant), and pushed them now. Thanks for your contribution!


// Martin
