from:"Stone Chen"

[FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (sad) to (sad[6]) to prepare for AVX2 funcs

2024-05-01 Thread Stone Chen

To prepare for adding AVX2 functions for different block widths, change VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also default initializes the pointer array with the scalar function and updates the calling sites to jump to the correct function based on block width. There's no chang
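The dispatch described in this snippet can be sketched roughly as follows. This is only an illustration, not the patch itself: the `InterDSP` struct, the simplified `sad_fn` signature, and the mapping of widths 4 to 128 onto indices 0 to 5 are all assumptions made for the example, and FFmpeg's real prototype and indexing may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified SAD signature; FFmpeg's real prototype differs. */
typedef int (*sad_fn)(const int16_t *a, const int16_t *b,
                      int stride, int w, int h);

/* Stand-in for the DSP context: one function slot per block width.
 * Assumed mapping: widths 4, 8, 16, 32, 64, 128 -> indices 0..5. */
typedef struct InterDSP {
    sad_fn sad[6];
} InterDSP;

/* Plain scalar sum of absolute differences over a w x h block. */
static int sad_scalar(const int16_t *a, const int16_t *b,
                      int stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(a[x] - b[x]);
        a += stride;
        b += stride;
    }
    return sum;
}

/* Map a power-of-two width (4..128) to a table index (0..5). */
static int width_to_index(int w)
{
    int idx = 0;
    while (w > 4) {
        w >>= 1;
        idx++;
    }
    return idx;
}

/* Default-initialize every slot with the scalar function; an
 * architecture-specific init could later overwrite individual slots. */
static void inter_dsp_init(InterDSP *c)
{
    for (int i = 0; i < 6; i++)
        c->sad[i] = sad_scalar;
}

/* Call site: pick the function by block width, then call it. */
static int block_sad(const InterDSP *c, const int16_t *a, const int16_t *b,
                     int stride, int w, int h)
{
    return c->sad[width_to_index(w)](a, b, stride, w, h);
}
```

With every slot pointing at the scalar function, behavior is unchanged; the table only becomes useful once a SIMD init replaces selected entries.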

[FFmpeg-devel] [PATCH 2/3][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-01 Thread Stone Chen

ull +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,193 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify it under the terms of the G

[FFmpeg-devel] [PATCH 3/3][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-01 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8_16bpc_c: 112.5 vvc_sad_8_16bpc_avx2: 2.5 vvc_sad_16_16bpc_c: 232.5 vvc_sad_16_16bpc_avx2: 22.5 vvc_sad_32_16bpc_c: 912.5 vvc_sad_32_16bpc_avx2: 82.5 vvc_sad_64_16bpc_c: 3582.5 vvc_sad_64_16bpc_avx2: 392.5 vvc_sad_1

Re: [FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (sad) to (sad[6]) to prepare for AVX2 funcs

2024-05-06 Thread Stone Chen

On Wed, May 1, 2024 at 6:59 PM Andreas Rheinhardt < andreas.rheinha...@outlook.com> wrote: > Stone Chen: > > To prepare for adding AVX2 functions for different block widths, change > VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also > default initializes t

[FFmpeg-devel] [PATCH v2 1/2][GSoC] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-11 Thread Stone Chen

1184c731c --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,155 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify it

[FFmpeg-devel] [PATCH v2 2/2][GSoC 2024] Terminal tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-11 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 63.0 vvc_sad_8x8_avx2: 3.0 vvc_sad_16x16_c: 263.0 vvc_sad_16x16_avx2: 23.0 vvc_sad_32x32_c: 1003.0 vvc_sad_32x32_avx2: 83.0 vvc_sad_64x64_c: 3923.0 vvc_sad_64x64_avx2: 373.0 vvc_sad_128x128_c: 17533.0 vvc_sad_

[FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-14 Thread Stone Chen

c/x86/vvc/vvc_sad.asm new file mode 100644 index 00..530142ad35 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,157 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

[FFmpeg-devel] [PATCH v3 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-14 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 63.0 vvc_sad_8x8_avx2: 3.0 vvc_sad_16x16_c: 263.0 vvc_sad_16x16_avx2: 23.0 vvc_sad_32x32_c: 1003.0 vvc_sad_32x32_avx2: 83.0 vvc_sad_64x64_c: 3923.0 vvc_sad_64x64_avx2: 373.0 vvc_sad_128x128_c: 17533.0 vvc_sad_

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-18 Thread Stone Chen

On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD >> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h >

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen

On Sat, May 18, 2024 at 11:33 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> +vvc_sad_8: >> +.loop_height: >> +movu xm0, [src1q] >> +movu xm1, [src2q] >

[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen

codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..58a24635d2 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,138 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

[FFmpeg-devel] [PATCH v4 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-19 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 70.0 vvc_sad_8x8_avx2: 10.0 vvc_sad_16x16_c: 280.0 vvc_sad_16x16_avx2: 20.0 vvc_sad_32x32_c: 1020.0 vvc_sad_32x32_avx2: 70.0 vvc_sad_64x64_c: 3560.0 vvc_sad_64x64_avx2: 270.0 vvc_sad_128x128_c: 13760.0 vvc_sad

[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen

codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..58a24635d2 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,138 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

[FFmpeg-devel] [PATCH v4 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-19 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 70.0 vvc_sad_8x8_avx2: 10.0 vvc_sad_16x16_c: 280.0 vvc_sad_16x16_avx2: 20.0 vvc_sad_32x32_c: 1020.0 vvc_sad_32x32_avx2: 70.0 vvc_sad_64x64_c: 3560.0 vvc_sad_64x64_avx2: 270.0 vvc_sad_128x128_c: 13760.0 vvc_sad

[FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen

codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..9766446b11 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,130 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is

[FFmpeg-devel] [PATCH v5 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-21 Thread Stone Chen

Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 50.3 vvc_sad_8x8_avx2: 0.3 vvc_sad_16x16_c: 250.3 vvc_sad_16x16_avx2: 10.3 vvc_sad_32x32_c: 1020.3 vvc_sad_32x32_avx2: 60.3 vvc_sad_64x64_c: 3850.3 vvc_sad_64x64_avx2: 220.3 vvc_sad_128x128_c: 14100.3 vvc_sad_

Re: [FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen

On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje wrote: > Hi, > > This is mostly good, the following is tiny nitpicks. > > On Sun, May 19, 2024 at 8:46 PM Stone Chen > wrote: > >> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2 >> > > The macro is

Re: [FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-23 Thread Stone Chen

On Thu, May 23, 2024 at 9:18 AM Nuo Mi wrote: > On Thu, May 23, 2024 at 7:38 AM James Almer wrote: > > > On 5/21/2024 10:01 PM, Ronald S. Bultje wrote: > > > Hi, > > > > > > On Tue, May 21, 2024 at 8:01 PM Stone Chen > > wrote: > > > &

[FFmpeg-devel] [PATCH v1 1/2][GSoC 2024] libavcode/x86/vvc: change label to vvc_sad_16 to reflect block sizes

2024-05-28 Thread Stone Chen

According to the VVC specification (section 8.5.1), the maximum width/height of a subblock passed for DMVR SAD is 16. This along with previous constraint requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only allowed sizes. This re-labels vvc_sad_16_128 to vvc_sad_16 to r
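The size argument in this snippet can be checked with a few lines of C. The sketch below combines the two constraints stated here (maximum dimension 16, width * height >= 128) with the w >= 8, h >= 8 condition quoted from the v3 commit message elsewhere in this listing; the function names are made up for the example.

```c
#include <stdbool.h>
#include <stdio.h>

/* True if a w x h subblock satisfies all the constraints quoted in the
 * commit messages: w >= 8, h >= 8, w <= 16, h <= 16, w * h >= 128. */
static bool dmvr_sad_size_ok(int w, int h)
{
    return w >= 8 && h >= 8 && w <= 16 && h <= 16 && w * h >= 128;
}

/* Enumerate power-of-two sizes from 4 to 128, print the valid ones,
 * and return how many were found. */
static int list_valid_sizes(void)
{
    int count = 0;
    for (int w = 4; w <= 128; w <<= 1) {
        for (int h = 4; h <= 128; h <<= 1) {
            if (dmvr_sad_size_ok(w, h)) {
                printf("%dx%d\n", w, h);
                count++;
            }
        }
    }
    return count;
}
```

8x8 drops out because 64 < 128, which leaves exactly the three sizes named in the commit message.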

[FFmpeg-devel] [PATCH v1 2/2][GSoC 2024] tests/checkasm/vvc_mc: for SAD, only test valid subblock sizes

2024-05-28 Thread Stone Chen

According to the VVC specification (section 8.5.1), the maximum width/height of a subblock passed for DMVR SAD is 16. This along with previous constraint requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only allowed sizes. This changes check_vvc_sad() to only test and b

[FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen

In commit 6c45d34, a line was added that always sets rdiv to 0, overriding any user input. This removes that line allowing user set values for 0rdiv, 1rdiv, 2rdiv, 3rdiv to apply as expected. This fixes ticket #10294. Signed-off-by: Stone Chen --- libavfilter/vf_convolution.c | 1 - 1 file

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen

Sorry I think I didn't correctly attach the patch the first time. On Sun, Feb 18, 2024 at 2:21 PM Stone Chen wrote: > In commit 6c45d34, a line was added that always sets rdiv to 0, overriding > any user input. This removes that line allowing user set values for 0rdiv, > 1rdiv, 2

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen

Hi Marton, Thanks for the feedback! I'm not sure what dynamic reconfiguration is, from some searching I think it might be related to commands? On Sun, Feb 18, 2024 at 7:08 PM Marton Balint wrote: > > > On Sun, 18 Feb 2024, Stone Chen wrote: > > > In commit 6c45d34

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen

Hi Marton, Thanks for the feedback! I'm not sure what dynamic reconfiguration is, from some searching I think it might be related to commands? On Sun, Feb 18, 2024 at 7:08 PM Marton Balint wrote: > > > On Sun, 18 Feb 2024, Stone Chen wrote: > > > In commit 6c45d34

[FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen

Previously to support dynamic reconfigurations of the matrix string (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv to be recalculated based on the new filter. This however had the side effect of always ignoring user specified rdiv values. Instead float user_rdiv[0]
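One way to read the idea in this snippet is to keep the user's value separate from the effective multiplier, so that re-parsing the matrix can recompute the default without discarding what the user asked for. The sketch below is an assumption about that approach, not the patch: the `ConvPlane` struct and `apply_matrix` helper are hypothetical, as is the convention that 0 means "not set by the user" and that a zero matrix sum falls back to 1.

```c
#include <stddef.h>

/* Hypothetical per-plane state for the example. */
typedef struct ConvPlane {
    float user_rdiv; /* value given by the user; 0 is taken to mean "unset" */
    float rdiv;      /* effective multiplier actually used by the filter */
} ConvPlane;

/* Called whenever the matrix changes (initial setup or reconfiguration).
 * The effective rdiv is recomputed every time, but user_rdiv is never
 * cleared, so an explicit user value survives a matrix change. */
static void apply_matrix(ConvPlane *p, const int *matrix, size_t n)
{
    float sum = 0.f;
    for (size_t i = 0; i < n; i++)
        sum += (float)matrix[i];
    if (sum == 0.f)
        sum = 1.f; /* assumption: avoid dividing by zero */

    p->rdiv = (p->user_rdiv != 0.f) ? p->user_rdiv : 1.f / sum;
}
```

Calling `apply_matrix` twice with different matrices changes `rdiv` only when `user_rdiv` is 0; otherwise the user's value is used both times.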

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen

Sorry I just realized I messed up my git commit (new to git), I've attached a patch file with that correction. On Sat, Feb 24, 2024 at 10:49 AM Stone Chen wrote: > Previously to support dynamic reconfigurations of the matrix string (e.g. > 0m), the rdiv values would always be cle

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen

On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > Previously to support dynamic reconfigurations of the matrix string > (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv > to be recalculated

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen

On Sat, Feb 24, 2024 at 6:34 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > >> > >> > >> On Sat, 24 Feb 2024, Stone Chen wrote: > >> > >>

[FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-12 Thread Stone Chen

The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. --- doc/filters.texi | 2 +- 1 file changed, 1 insertion(+),
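The distinction between multiplying by the sum and by 1/sum is easy to see in a small worked example. The helper below is a sketch of the default described here; the function name is made up, and treating a zero sum as 1 is an assumption added so the example never divides by zero.

```c
#include <stddef.h>

/* Default multiplier for a convolution matrix: 1 / (sum of elements).
 * Assumption: a zero sum is treated as 1 to avoid division by zero. */
static double default_rdiv(const int *matrix, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += matrix[i];
    if (sum == 0.0)
        sum = 1.0;
    return 1.0 / sum;
}
```

For a 3x3 matrix of all ones the sum is 9, so the default multiplier is 1/9: each output pixel is the average of its neighborhood. Multiplying by 9 instead would scale the result 81 times too large.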

Re: [FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-13 Thread Stone Chen

On Wed, Mar 13, 2024 at 4:26 AM Marton Balint wrote: > > > On Tue, 12 Mar 2024, Stone Chen wrote: > > > The documentation correctly states that the rdiv is a multiplier but > incorrectly states the default behavior is to multiply by the sum of all > matrix elements

[FFmpeg-devel] [PATCH] doc/filters: Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-14 Thread Stone Chen

The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. Address trac #10889 --- doc/filters.texi | 2 +- 1 file chan

31 matches
