Hi,
On 2021-01-08 21:36, reimar.doeffin...@gmx.de wrote:
From: Reimar Döffinger <reimar.doeffin...@gmx.de>
Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
available on aarch64.
For a UHD HDR (10 bit) sample video these were consuming the most time
and this optimization reduced overall decode time from 19.4s to 16.4s,
approximately 15% speedup.
Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
running on Apple M1.
---
libavcodec/aarch64/Makefile | 2 +
libavcodec/aarch64/hevcdsp_idct_neon.S | 426 ++++++++++++++++++++++
libavcodec/aarch64/hevcdsp_init_aarch64.c | 45 +++
libavcodec/hevcdsp.c | 2 +
libavcodec/hevcdsp.h | 1 +
5 files changed, 476 insertions(+)
create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
[...]
AS libavcodec/aarch64/hevcdsp_idct_neon.o
libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
This doesn't build with the GNU assembler (GNU Binutils for Ubuntu) 2.34
on aarch64: it rejects `mov v29.4S,v28.4S' and only accepts the .8b and
.16b arrangements for a vector register mov.

Thanks for porting this. I was in the process of writing HEVC assembly
(see my set on the ML) and would be interested in rebasing this on top
of that set.
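
A minimal sketch of a possible fix, assuming the instruction at line 418
is meant as a full 128-bit register copy. The vector form of mov is an
alias of orr with both sources equal, which only has byte arrangements,
so the .16b spelling suggested by the assembler copies the same 128 bits
that the .4s spelling was intended to. This is an isolated fragment, not
the surrounding code from the patch:

```
        // was: mov v29.4S, v28.4S   (rejected by GNU as 2.34)
        // Same full-register copy, written with the byte arrangement
        // that the vector mov alias (orr Vd.16b, Vn.16b, Vn.16b) defines.
        mov             v29.16b, v28.16b
```

If the line comes from a macro expansion, the change would belong in the
macro body, which would also explain the same error being reported four
times for one source line.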
--
Josh