subject:"[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit."

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-02-11 Thread Reimar Döffinger

Hi Martin!

> On 10 Feb 2021, at 22:53, Martin Storsjö wrote:
>
> +.macro idct_16x16 bitdepth
> +function ff_hevc_idct_16x16_\bitdepth\()_neon, export=1
> +//r0 - coeffs
> +mov x15, lr
> +
> Binutils doesn't recognize "lr" as alias for x30
>>> It didn’t

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-02-10 Thread Martin Storsjö

Hi Reimar,
On Sat, 16 Jan 2021, Martin Storsjö wrote:
+.macro idct_16x16 bitdepth
+function ff_hevc_idct_16x16_\bitdepth\()_neon, export=1
+//r0 - coeffs
+mov x15, lr
+
Binutils doesn't recognize "lr" as alias for x30
It didn’t have an issue in the Debian unstable VM?
Th

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-16 Thread Martin Storsjö

On Sat, 16 Jan 2021, Reimar Döffinger wrote:
On 15 Jan 2021, at 23:55, Martin Storsjö wrote:
On Tue, 12 Jan 2021, reimar.doeffin...@gmx.de wrote:
create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
This patch fails checkasm

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-15 Thread Reimar Döffinger

> On 15 Jan 2021, at 23:55, Martin Storsjö wrote:
>
> On Tue, 12 Jan 2021, reimar.doeffin...@gmx.de wrote:
>
>> create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
>> create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
>
> This patch fails checkasm

Fixed, one mis-translated co

[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-15 Thread Reimar . Doeffinger

From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19.4s to 16.4s, approximately 15% speedup. Test sample was the

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-15 Thread Martin Storsjö

On Tue, 12 Jan 2021, reimar.doeffin...@gmx.de wrote: From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19.4s

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-13 Thread Martin Storsjö

On Tue, 12 Jan 2021, Reimar Döffinger wrote:
On 12 Jan 2021, at 13:24, Josh Dekker wrote:
Hi,
AS libavcodec/aarch64/hevcdsp_idct_neon.o
libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch -- `mov v29.4S,v28.4S'

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-12 Thread Reimar Döffinger

> On 12 Jan 2021, at 13:24, Josh Dekker wrote:
>
> Hi,
>
> On 2021-01-08 21:36, reimar.doeffin...@gmx.de wrote:
>> From: Reimar Döffinger
>> Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
>> available on aarch64.
>> For a UHD HDR (10 bit) sample video these were consuming the

[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-12 Thread Reimar . Doeffinger

From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19.4s to 16.4s, approximately 15% speedup. Test sample was the

Re: [FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-12 Thread Josh Dekker

Hi, On 2021-01-08 21:36, reimar.doeffin...@gmx.de wrote: From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19

[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.

2021-01-08 Thread Reimar . Doeffinger

From: Reimar Döffinger Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth available on aarch64. For a UHD HDR (10 bit) sample video these were consuming the most time and this optimization reduced overall decode time from 19.4s to 16.4s, approximately 15% speedup. Test sample was the

11 matches
