Hi,

On 2021-01-08 21:36, reimar.doeffin...@gmx.de wrote:
From: Reimar Döffinger <reimar.doeffin...@gmx.de>

Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
available on aarch64.
For a UHD HDR (10 bit) sample video these were consuming the most time
and this optimization reduced overall decode time from 19.4s to 16.4s,
approximately 15% speedup.
Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
running on Apple M1.
---
  libavcodec/aarch64/Makefile               |   2 +
  libavcodec/aarch64/hevcdsp_idct_neon.S    | 426 ++++++++++++++++++++++
  libavcodec/aarch64/hevcdsp_init_aarch64.c |  45 +++
  libavcodec/hevcdsp.c                      |   2 +
  libavcodec/hevcdsp.h                      |   1 +
  5 files changed, 476 insertions(+)
  create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
  create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c

[...]

AS      libavcodec/aarch64/hevcdsp_idct_neon.o
libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:    other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info:       mov v29.16b, v28.16b

This doesn't build with the GNU assembler (GNU Binutils for Ubuntu) 2.34 on aarch64.

Thanks for porting this. I was in the process of writing HEVC assembly myself (see my set on the ML) and would be interested in rebasing this on top of that set.
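
On the build error: as the assembler's hints indicate, the vector form of mov (an alias of orr) only takes the byte arrangements .8b and .16b. A possible fix, assuming line 418 is meant to copy the whole 128-bit register (which is what a .4S operand covers), is to use the .16b arrangement. This is an untested sketch using the register numbers from the error message, not the actual source line from the patch:

```asm
        // rejected by GNU as 2.34: .4S is not a valid arrangement for vector mov
        // mov             v29.4S, v28.4S

        // same 128-bit copy, written with a byte arrangement
        mov             v29.16b, v28.16b
```

Since the copy is a plain bitwise move, the element size makes no difference to the result, so the change should not affect behaviour.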

--
Josh