On Thu, 23 Jun 2022, J. Dekker wrote:

hevc_add_res_4x4_12_c: 46.0
hevc_add_res_4x4_12_neon: 18.7
hevc_add_res_8x8_12_c: 194.7
hevc_add_res_8x8_12_neon: 25.2
hevc_add_res_16x16_12_c: 716.0
hevc_add_res_16x16_12_neon: 69.7
hevc_add_res_32x32_12_c: 3820.7
hevc_add_res_32x32_12_neon: 261.0

Signed-off-by: J. Dekker <j...@itanimul.li>
---
libavcodec/aarch64/hevcdsp_idct_neon.S    | 148 ++++++++++++----------
libavcodec/aarch64/hevcdsp_init_aarch64.c |  34 ++---
2 files changed, 97 insertions(+), 85 deletions(-)

LGTM. The patch is a bit hard to inspect thoroughly (to see exactly how little has changed) due to the functions being moved around at the same time as they're modified, but I checked and the changes do look fine.

By splitting things up into individual macros for each function (e.g. add_res_4x4, add_res_8x8, etc., with add_res then setting the mask and calling the others), you could keep the code in place and make the diff even easier to read, but it's not strictly necessary.
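
To illustrate the kind of layering I mean, here is a rough sketch in plain C rather than in the actual NEON macros. It is not taken from the patch: the names sketch_add_res_*, ADD_RES_BLOCK, ADD_RES and clip_pixel are made up for the example, the stride is assumed to be in pixels rather than bytes, and a single per-size macro stands in for what would be separate add_res_4x4, add_res_8x8, etc. macros in the assembly. The clipping limit plays the role of the mask and is derived from the bit depth.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a sum to the valid pixel range [0, max_val]. */
static inline uint16_t clip_pixel(int v, int max_val)
{
    return (uint16_t)(v < 0 ? 0 : v > max_val ? max_val : v);
}

/* Per-size body (hypothetical); stands in for add_res_4x4, add_res_8x8, ...
 * stride is assumed to be in pixels here, not bytes. */
#define ADD_RES_BLOCK(size, bitdepth)                                       \
static void sketch_add_res_##size##x##size##_##bitdepth(uint16_t *dst,      \
                                                        const int16_t *res, \
                                                        ptrdiff_t stride)   \
{                                                                           \
    const int max_val = (1 << (bitdepth)) - 1;                              \
    for (int y = 0; y < (size); y++) {                                      \
        for (int x = 0; x < (size); x++)                                    \
            dst[x] = clip_pixel(dst[x] + res[x], max_val);                  \
        dst += stride;                                                      \
        res += (size);                                                      \
    }                                                                       \
}

/* Top-level macro (hypothetical): one invocation per bit depth
 * instantiates every block size. */
#define ADD_RES(bitdepth)        \
    ADD_RES_BLOCK(4,  bitdepth)  \
    ADD_RES_BLOCK(8,  bitdepth)  \
    ADD_RES_BLOCK(16, bitdepth)  \
    ADD_RES_BLOCK(32, bitdepth)

ADD_RES(10)
ADD_RES(12)
```

With this shape, each per-size body stays where it was in the file and adding a bit depth is a one-line change at the bottom, which is what would keep the diff small.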

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
