Re: [FFmpeg-devel] [PATCH v2 1/4] avfilter/af_volumedetect.c: Move logdb function

2024-06-29 Thread Rémi Denis-Courmont
, AVFrame > *samples) return ff_filter_frame(inlink->dst->outputs[0], samples); > } > > -#define MAX_DB 91 > - > -static inline double logdb(uint64_t v) > -{ > -double d = v / (double)(0x8000 * 0x8000); > -if (!v) > - return MAX_DB; > -

[FFmpeg-devel] [PATCH 2/2] lavc/h264dsp: R-V V 8-bit luma loop filter

2024-06-30 Thread Rémi Denis-Courmont
_filter_luma_8_rvv; +} dsp->startcode_find_candidate = ff_startcode_find_candidate_rvv; +} # endif #endif } diff --git a/libavcodec/riscv/h264dsp_rvv.S b/libavcodec/riscv/h264dsp_rvv.S new file mode 100644 index 00..ea9dfb1a7e --- /dev/null +++ b/libavcodec/riscv/h264dsp_rvv.S @@ -0,0 +1,136 @@

[FFmpeg-devel] [PATCH 1/2] lavc/vc1dsp: fix potential overflow in R-V V inv_trans_4

2024-06-30 Thread Rémi Denis-Courmont
Judging by the coefficients, the last round of add/sub can overflow to 17 bits with a very small probability just as with the 8-point transform. This is not observed under FATE, but better safe than sorry. --- libavcodec/riscv/vc1dsp_rvv.S | 15 --- 1 file changed, 8 insertions(+), 7 d
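The 17-bit concern can be illustrated in scalar C. This is a minimal sketch under assumptions, not the patched assembly: the helper names are hypothetical and the VC-1 coefficients are not reproduced. It only shows that adding two values that each fit in 16 bits may need 17 bits, so the last add/sub has to be done at a wider element width (or otherwise guarded) to avoid wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to hold x as a signed two's-complement value. */
int signed_bits_needed(int32_t x)
{
    int bits = 1; /* sign bit */
    /* Map negative values onto non-negative ones with the same bit need. */
    uint32_t mag = x < 0 ? ~(uint32_t)x : (uint32_t)x;

    while (mag) {
        bits++;
        mag >>= 1;
    }
    return bits;
}

/* Final add performed in 32 bits: the result can never wrap. */
int32_t butterfly_add_wide(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;

    /* The sum of two 16-bit values always fits in 17 bits. */
    assert(signed_bits_needed(sum) <= 17);
    return sum;
}

/*
 * The same add truncated to 16 bits, as a narrow vector add would do.
 * The final conversion to int16_t wraps modulo 2^16 on common compilers
 * (formally implementation-defined before C23).
 */
int16_t butterfly_add_narrow(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}
```

With inputs of 20000 and 20000, the wide version yields 40000 (17 bits) while the narrow version wraps to a negative value, which is the kind of rare corruption the patch guards against.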

Re: [FFmpeg-devel] [PATCH 2/2] lavc/h264dsp: R-V V 8-bit luma loop filter

2024-06-30 Thread Rémi Denis-Courmont
Disregard, botched send-email. -- レミ・デニ-クールモン http://www.remlab.net/ ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with

[FFmpeg-devel] [PATCH 1/2] lavc/h264dsp: R-V V 8-bit luma loop filter

2024-06-30 Thread Rémi Denis-Courmont
_filter_luma_8_rvv; +} dsp->startcode_find_candidate = ff_startcode_find_candidate_rvv; +} # endif #endif } diff --git a/libavcodec/riscv/h264dsp_rvv.S b/libavcodec/riscv/h264dsp_rvv.S new file mode 100644 index 00..ea9dfb1a7e --- /dev/null +++ b/libavcodec/riscv/h264dsp_rvv.S @@ -0,0 +1,136 @@

[FFmpeg-devel] [PATCH 2/2] lavc/h264dsp: R-V V 8-bit MBAFF loop filter

2024-06-30 Thread Rémi Denis-Courmont
Performance is (unfortunately) the same as with non-MBAFF, since the hardware under test does not short-circuit vector tail calculations. (IMO, a generic solution or work-around should be agreed on, rather than bespoke approaches all over the place.) --- libavcodec/riscv/h264dsp_init.c | 4

[FFmpeg-devel] [PATCH 1/2] lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_4

2024-06-30 Thread Rémi Denis-Courmont
--- libavcodec/riscv/vc1dsp_rvv.S | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S index 9d85377cec..8c127c7644 100644 --- a/libavcodec/riscv/vc1dsp_rvv.S +++ b/libavcodec/riscv/vc1dsp_rvv.S @@ -194,14 +194

[FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_8

2024-06-30 Thread Rémi Denis-Courmont
--- libavcodec/riscv/vc1dsp_rvv.S | 63 +++ 1 file changed, 27 insertions(+), 36 deletions(-) diff --git a/libavcodec/riscv/vc1dsp_rvv.S b/libavcodec/riscv/vc1dsp_rvv.S index 8c127c7644..d8b62579aa 100644 --- a/libavcodec/riscv/vc1dsp_rvv.S +++ b/libavcodec/riscv/v

Re: [FFmpeg-devel] [PATCH 07/11] doc/examples/vaapi_encode: Try to check fwrite() for failure

2024-07-01 Thread Rémi Denis-Courmont
On 1 July 2024 02:12:46 GMT+03:00, Michael Niedermayer wrote: >Fixes: CID1604548 Unused value > >Sponsored-by: Sovereign Tech Fund >Signed-off-by: Michael Niedermayer >--- > doc/examples/vaapi_encode.c | 4 > 1 file changed, 4 insertions(+) > >diff --git a/doc/examples/vaapi_encode.c

Re: [FFmpeg-devel] [PATCH 06/13] avcodec/mpv_reconstruct_mb_template: Merge template into its users

2024-07-01 Thread Rémi Denis-Courmont
On 1 July 2024 15:16:03 GMT+03:00, Andreas Rheinhardt wrote: >A large part of this template is decoder-only. This makes >the complexity of the IS_ENCODER-checks not worth it. >So simply merge the template into both its users. > >Signed-off-by: Andreas Rheinhardt >--- > libavcodec/mpegvid

Re: [FFmpeg-devel] [PATCH v5] lavc/vvc_mc: R-V V avg w_avg

2024-07-01 Thread Rémi Denis-Courmont
to be a tail call instead. > > Will this cause any issues? It will execute at a label, and after > executing, there is a ret at the label. Yes. Tail calls should incur no Return Address Stack action. But this incurs a pop, as per the "Unconditional Jumps" termino

Re: [FFmpeg-devel] [PATCH 1/2] lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_4

2024-07-01 Thread Rémi Denis-Courmont
T-Head C908 (cycles):              before  after
vc1dsp.vc1_inv_trans_4x4_rvv_i32:   128.0  120.0
vc1dsp.vc1_inv_trans_4x8_rvv_i32:   244.0  240.0
vc1dsp.vc1_inv_trans_8x4_rvv_i32:   239.2  235.2
-- レミ・デニ-クールモン http://www.remlab.net/

Re: [FFmpeg-devel] [PATCH 2/2] lavc/vc1dsp: fuse multiply-adds in R-V V inv_trans_8

2024-07-01 Thread Rémi Denis-Courmont
T-Head C908 (cycles):              before  after
vc1dsp.vc1_inv_trans_4x8_rvv_i32:   240.0  228.0
vc1dsp.vc1_inv_trans_8x4_rvv_i32:   235.2  224.2
vc1dsp.vc1_inv_trans_8x8_rvv_i32:   340.7  327.2
-- Rémi Denis-Courmont http://www.remlab.net/

[FFmpeg-devel] [PATCH 2/2] lavc/h264dsp: R-V V 8-bit MBAFF loop filter

2024-07-01 Thread Rémi Denis-Courmont
Performance is (unfortunately) the same as with non-MBAFF, since the hardware under test does not short-circuit vector tail calculations. (IMO, a generic solution or work-around should be agreed on, rather than bespoke approaches all over the place.) --- libavcodec/riscv/h264dsp_init.c | 4

[FFmpeg-devel] [PATCHv2 1/2] lavc/h264dsp: R-V V 8-bit luma loop filter

2024-07-01 Thread Rémi Denis-Courmont
S b/libavcodec/riscv/h264dsp_rvv.S new file mode 100644 index 00..77bf40db1f --- /dev/null +++ b/libavcodec/riscv/h264dsp_rvv.S @@ -0,0 +1,140 @@ +/* + * Copyright © 2024 Rémi Denis-Courmont. + * + * Redistribution and use in source and binary forms, with or without + * modification,

[FFmpeg-devel] [RFC] [PATCH 1/4] lavc/h264_loopfilter: expose tc0_table (for checkasm)

2024-07-01 Thread Rémi Denis-Courmont
--- libavcodec/h264_loopfilter.c | 50 ++-- libavcodec/h264dsp.h | 2 ++ 2 files changed, 27 insertions(+), 25 deletions(-) diff --git a/libavcodec/h264_loopfilter.c b/libavcodec/h264_loopfilter.c index c164a289b7..9481882dd0 100644 --- a/libavcodec/h264_l

[FFmpeg-devel] [PATCH 2/4] lavc/h264_loopfilter: align TC and bS tables

2024-07-01 Thread Rémi Denis-Courmont
--- libavcodec/h264_loopfilter.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/libavcodec/h264_loopfilter.c b/libavcodec/h264_loopfilter.c index 9481882dd0..96f572c1d2 100644 --- a/libavcodec/h264_loopfilter.c +++ b/libavcodec/h264_loopfilter.c @@ -66,7 +66,7 @@ static

[FFmpeg-devel] [PATCH 3/4] WIP: lavc/h264dsp: take over looking up TC values

2024-07-01 Thread Rémi Denis-Courmont
This moves the look-up of TC values from bS from the generic C loop filter code to the DSP functions. This (potentially) eliminates a round-trip to the stack for the looked-up values. This is work-in-progress. 8 functions need to be updated and this only updates one of them. Also updating the plat

[FFmpeg-devel] [PATCH 4/4] lavc/h264dsp: update R-V V intra luma loop filter

2024-07-01 Thread Rémi Denis-Courmont
Note that the performance reported by checkasm is slightly worse. This is expected since the assembler is now doing more work. --- libavcodec/riscv/h264dsp_init.c | 3 ++- libavcodec/riscv/h264dsp_rvv.S | 6 -- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/libavcodec/riscv/h2

[FFmpeg-devel] [PATCH 1/4] lavc/h263dsp: add DCT dequantisation functions

2024-07-01 Thread Rémi Denis-Courmont
Note that optimised implementations of these functions will be taken into actual use only if MpegEncContext.dct_unquantize_h263_{inter,intra} are *not* overloaded by existing optimisations. --- libavcodec/h263dsp.c | 25 + libavcodec/h263dsp.h | 4 2 files changed, 29

[FFmpeg-devel] [PATCH 2/4] lavc/mpegvideo: use H263DSP dequant function

2024-07-01 Thread Rémi Denis-Courmont
--- configure | 4 ++-- libavcodec/mpegvideo.c | 40 +--- 2 files changed, 11 insertions(+), 33 deletions(-) diff --git a/configure b/configure index fed4c44cd1..42b9a72d5a 100755 --- a/configure +++ b/configure @@ -2954,8 +2954,8 @@ ftr_decoder_s

[FFmpeg-devel] [PATCH 3/4] checkasm/h263dsp: test dct_unquantize_{intra, inter}

2024-07-01 Thread Rémi Denis-Courmont
--- tests/checkasm/h263dsp.c | 57 +++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/tests/checkasm/h263dsp.c b/tests/checkasm/h263dsp.c index 2d0957a90b..26020211dc 100644 --- a/tests/checkasm/h263dsp.c +++ b/tests/checkasm/h263dsp.c @@ -18,13

[FFmpeg-devel] [PATCH 4/4] lavc/h263dsp: R-V V dct_unquantize_{intra, inter}

2024-07-01 Thread Rémi Denis-Courmont
T-Head C908:
h263dsp.dct_unquantize_inter_c:       3.7
h263dsp.dct_unquantize_inter_rvv_i32: 1.7
h263dsp.dct_unquantize_intra_c:       4.0
h263dsp.dct_unquantize_intra_rvv_i32: 1.5
SpacemiT X60:
h263dsp.dct_unquantize_inter_c:       3.5
h263dsp.dct_unquantize_inter_rvv_i32: 1.2
h263dsp.dct_unquant

Re: [FFmpeg-devel] SWS cleanup / SPI Funding Suggestion

2023-10-17 Thread Rémi Denis-Courmont
part of the GA, and I have neither the expertise and credibility nor the time and motivation to take up this project, so that's just my free advice. -- Rémi Denis-Courmont http://www.remlab.net/

Re: [FFmpeg-devel] SWS cleanup / SPI Funding Suggestion

2023-10-18 Thread Rémi Denis-Courmont
On Wednesday 18 October 2023, 0.57.45 EEST, Michael Niedermayer wrote: > On Tue, Oct 17, 2023 at 09:50:41PM +0300, Rémi Denis-Courmont wrote: > > On Friday 13 October 2023, 22.19.34 EEST, Michael Niedermayer wrote: > > > But some goals would proba

Re: [FFmpeg-devel] [ANNOUNCE] upcoming GA vote

2023-10-25 Thread Rémi Denis-Courmont
Hi, I am not on the GA, but there are probably people with my locale on the GA. And it seems that the voting system hits a Perl syntax error if your browser locale is set to French.

Re: [FFmpeg-devel] [ANNOUNCE] upcoming GA vote

2023-10-25 Thread Rémi Denis-Courmont
Hi, On 25 October 2023 18:52:31 GMT+03:00, Thilo Borgmann via ffmpeg-devel wrote: >On 25.10.23 at 16:23, Thilo Borgmann via ffmpeg-devel wrote: >> On 25.10.23 at 16:22, Rémi Denis-Courmont wrote: >>> Hi, >>> >>> I am not on the GA, but there are prob

[FFmpeg-devel] [PATCH] lavu/riscv: fix typo

2023-10-26 Thread Rémi Denis-Courmont
--- libavutil/riscv/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c index fa45c0ad83..460d3e9f91 100644 --- a/libavutil/riscv/cpu.c +++ b/libavutil/riscv/cpu.c @@ -67,7 +67,7 @@ int ff_get_cpu_flags_riscv(void) #endif

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
On 26 October 2023 18:45:23 GMT+03:00, Michael Niedermayer wrote: >This is financial sustainability Plan A (SPI) >ATM SPI has like 150k $, we do not actively seek donations, we do not currently >use SPI money to fund any development. SPI money is ultimately controlled by >the FFmpeg community

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
Hi, On 27 October 2023 14:10:15 GMT+03:00, Thilo Borgmann via ffmpeg-devel wrote: >> On 26 October 2023 18:45:23 GMT+03:00, Michael Niedermayer >> wrote: >>> This is financial sustainability Plan A (SPI) >>> ATM SPI has like 150k $, we do not actively seek donations, we do not >>> curren

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
Hi, On Friday 27 October 2023, 15.24.38 EEST, Thilo Borgmann via ffmpeg-devel wrote: > >>> Why should it be via SPI? What's the benefit of that hypothetical future additional funding going via SPI, as opposed to: > >> obviously transparency and community control. None of which is gi

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
has for the wishful thinking that it would kickstart sustainable financing for FFmpeg development. -- Rémi Denis-Courmont http://www.remlab.net/

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
while business contracts have business secrecy. Even here, I know at least one of my colleagues has applied to have their taxable income delisted on the basis of GDPR. -- Rémi Denis-Courmont http://www.remlab.net/

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-27 Thread Rémi Denis-Courmont
Then that is nowhere near the level of labour-intensive (for the GA) and privacy-intrusive (for the consultants) that SPI funding would involve, more or less making my point. -- Rémi Denis-Courmont http://www.remlab.net/

[FFmpeg-devel] [PATCH 2/6] lavc/pixblockdsp: aligned R-V V 8-bit functions

2023-10-27 Thread Rémi Denis-Courmont
If the scan lines are aligned, we can load each row as a 64-bit value, thus avoiding segmentation. And then we can factor the conversion or subtraction. In principle, the same optimisation should be possible for high depth, but would require 128-bit elements, for which no FFmpeg CPU flag exists. -
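The idea described above, reading each aligned 8-pixel row as a single 64-bit value, can be modelled in scalar C. This is only a sketch under assumptions: the function name `get_pixels_aligned_sketch` is hypothetical, the target is assumed to be little-endian, and every row is assumed to be 8-byte aligned. The actual patch is RISC-V vector assembly and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar model of an "aligned get_pixels": copy an 8x8 block
 * of 8-bit pixels into 16-bit coefficients, reading each row as one 64-bit
 * word. Assumes a little-endian target and 8-byte aligned rows.
 */
void get_pixels_aligned_sketch(int16_t block[64], const uint8_t *pixels,
                               ptrdiff_t stride)
{
    assert(((uintptr_t)pixels & 7) == 0 && (stride & 7) == 0);

    for (int y = 0; y < 8; y++) {
        uint64_t row;

        /* One 64-bit load per row; memcpy keeps this strict-aliasing safe. */
        memcpy(&row, pixels + y * stride, sizeof(row));

        /* Zero-extend each byte to 16 bits (the "conversion" step). */
        for (int x = 0; x < 8; x++)
            block[y * 8 + x] = (int16_t)((row >> (8 * x)) & 0xff);
    }
}
```

In a vector implementation the eight rows would presumably be fetched with a single strided 64-bit load, after which the widening is applied to all 64 bytes at once. An unaligned variant cannot take this shortcut, which is why the series keeps separate aligned and unaligned functions.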

[FFmpeg-devel] [PATCH 1/6] lavc/pixblockdsp: rename unaligned R-V V functions

2023-10-27 Thread Rémi Denis-Courmont
--- libavcodec/riscv/pixblockdsp_init.c | 26 +++--- libavcodec/riscv/pixblockdsp_rvv.S | 6 +++--- 2 files changed, 18 insertions(+), 14 deletions(-) diff --git a/libavcodec/riscv/pixblockdsp_init.c b/libavcodec/riscv/pixblockdsp_init.c index aa39a8a665..8f24281217 100644

[FFmpeg-devel] [PATCH 3/6] lavc/idctdsp: require Zve64x for R-V V functions

2023-10-27 Thread Rémi Denis-Courmont
This will be required for the following changesets. --- libavcodec/riscv/idctdsp_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavcodec/riscv/idctdsp_init.c b/libavcodec/riscv/idctdsp_init.c index e6e616a555..4106d90c55 100644 --- a/libavcodec/riscv/idctdsp_init.c ++

[FFmpeg-devel] [PATCH 4/6] lavc/idctdsp: improve R-V V put_signed_pixels_clamped

2023-10-27 Thread Rémi Denis-Courmont
This follows the same idea as with pixblockdsp, but applied at the other end, whilst writing data at the end of the function. --- libavcodec/riscv/idctdsp_rvv.S | 27 +-- 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavco

[FFmpeg-devel] [PATCH 5/6] lavc/idctdsp: improve R-V V add_pixels_clamped

2023-10-27 Thread Rémi Denis-Courmont
--- libavcodec/riscv/idctdsp_rvv.S | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S index 4ff72f48d2..fafdddb174 100644 --- a/libavcodec/riscv/idctdsp_rvv.S +++ b/libavcodec/riscv/idct

[FFmpeg-devel] [PATCH 6/6] lavc/idctdsp: improve R-V V put_pixels_clamped

2023-10-27 Thread Rémi Denis-Courmont
--- libavcodec/riscv/idctdsp_rvv.S | 25 + 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/libavcodec/riscv/idctdsp_rvv.S b/libavcodec/riscv/idctdsp_rvv.S index fafdddb174..e93e6b5e7a 100644 --- a/libavcodec/riscv/idctdsp_rvv.S +++ b/libavcodec/riscv/idctdsp_

Re: [FFmpeg-devel] [PATCH 1/6] lavc/pixblockdsp: rename unaligned R-V V functions

2023-10-28 Thread Rémi Denis-Courmont
P.S.: It took some additional effort to get some benchmarks with proto-RVV. But here they are:
idctdsp.add_pixels_clamped_c:       259.5
idctdsp.add_pixels_clamped_rvv_i64:  90.5
idctdsp.put_pixels_clamped_c:       186.5
idctdsp.put_pixels_clamped_rvv_i64:  65.5
idctdsp.put_signed_pixels_clamped_c: 209.5
idct

[FFmpeg-devel] [PATCH 1/2] lavc/utvideodsp: R-V V restore_rgb_planes

2023-10-28 Thread Rémi Denis-Courmont
/utvideodsp_init.c b/libavcodec/riscv/utvideodsp_init.c new file mode 100644 index 00..dfaa16692a --- /dev/null +++ b/libavcodec/riscv/utvideodsp_init.c @@ -0,0 +1,38 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute

[FFmpeg-devel] [PATCH 2/2] lavc/utvideodsp: R-V V restore_rgb_planes10

2023-10-28 Thread Rémi Denis-Courmont
restore_rgb_planes10_c: 185852.2 restore_rgb_planes10_rvv_i32: 90130.5 --- libavcodec/riscv/utvideodsp_init.c | 9 +++- libavcodec/riscv/utvideodsp_rvv.S | 35 ++ 2 files changed, 43 insertions(+), 1 deletion(-) diff --git a/libavcodec/riscv/utvideodsp_init.

[FFmpeg-devel] [PATCH] lavc/huffyuvdsp: R-V V add_int16

2023-10-28 Thread Rémi Denis-Courmont
/riscv/huffyuvdsp_init.c new file mode 100644 index 00..0f7bc4d692 --- /dev/null +++ b/libavcodec/riscv/huffyuvdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it und

Re: [FFmpeg-devel] [PATCH] lavc/huffyuvdsp: R-V V add_int16

2023-10-28 Thread Rémi Denis-Courmont
On Saturday 28 October 2023, 16.56.40 EEST, Rémi Denis-Courmont wrote: > +#include "config.h" > +#include "libavutil/attributes.h" > +#include "libavutil/cpu.h" > +#include "libavcodec/huffyuvdsp.h" > + > +void ff_add_int16_r

[FFmpeg-devel] [PATCH 1/3] lavc/jpeg2000dsp: make coefficients extern

2023-10-28 Thread Rémi Denis-Courmont
This is so that they can be loaded from assembler, rather than duplicated. --- libavcodec/jpeg2000dsp.c | 3 ++- libavcodec/jpeg2000dsp.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/libavcodec/jpeg2000dsp.c b/libavcodec/jpeg2000dsp.c index b1bff6d5b1..50bc1ecee6 100644 --

[FFmpeg-devel] [PATCH 2/3] lavc/jpeg2000dsp: R-V V ict_float

2023-10-28 Thread Rémi Denis-Courmont
ECODER) += riscv/huffyuvdsp_init.o diff --git a/libavcodec/riscv/jpeg2000dsp_init.c b/libavcodec/riscv/jpeg2000dsp_init.c new file mode 100644 index 00..9415a22f79 --- /dev/null +++ b/libavcodec/riscv/jpeg2000dsp_init.c @@ -0,0 +1,36 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This f

[FFmpeg-devel] [PATCH 3/3] lavc/jpeg2000dsp: R-V V rct_int

2023-10-28 Thread Rémi Denis-Courmont
jpeg2000_rct_int_c: 2592.2 jpeg2000_rct_int_rvv_i32: 1154.2 --- libavcodec/riscv/jpeg2000dsp_init.c | 8 ++-- libavcodec/riscv/jpeg2000dsp_rvv.S | 23 +++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/libavcodec/riscv/jpeg2000dsp_init.c b/libavcod

[FFmpeg-devel] [PATCH] lavc/pixblockdsp: remove R-V V get_pixels_16

2023-10-29 Thread Rémi Denis-Courmont
In the aligned case, the existing RVI assembler is actually much faster. In the unaligned case, there is nothing much to gain over C. --- libavcodec/riscv/pixblockdsp_init.c | 7 +-- libavcodec/riscv/pixblockdsp_rvv.S | 7 --- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-29 Thread Rémi Denis-Courmont
Hi, On 28 October 2023 21:01:57 GMT+03:00, Michael Niedermayer wrote: >On Sat, Oct 28, 2023 at 07:21:03PM +0200, Michael Niedermayer wrote: >> Hi ronald >> >> On Sat, Oct 28, 2023 at 12:43:15PM -0400, Ronald S. Bultje wrote: >> > Hi Thilo, >> > >> > On Sat, Oct 28, 2023 at 11:31 AM Thilo Bo

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-29 Thread Rémi Denis-Courmont
On Sunday 29 October 2023, 18.12.58 EET, Michael Niedermayer wrote: > On Sun, Oct 29, 2023 at 04:35:35PM +0200, Rémi Denis-Courmont wrote: > > Hi, > > > > On 28 October 2023 21:01:57 GMT+03:00, Michael Niedermayer wrote: > > >On Sat, Oct 28, 202

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-29 Thread Rémi Denis-Courmont
On Sunday 29 October 2023, 18.47.34 EET, Nicolas George wrote: > Rémi Denis-Courmont (12023-10-29): > > And unfortunately, I do believe that Ronald is correct in pointing out > > that big companies will want oversight in exchange for money. > > And this is why the o

[FFmpeg-devel] [PATCH 1/3] lavc/sbrdsp: R-V V sum64x5

2023-10-29 Thread Rémi Denis-Courmont
/libavcodec/riscv/sbrdsp_init.c @@ -0,0 +1,37 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software

[FFmpeg-devel] [PATCH 2/3] lavc/sbrdsp: R-V V sum_square

2023-10-29 Thread Rémi Denis-Courmont
sum_square_c: 803.5 sum_square_rvv_f32: 283.2 --- libavcodec/riscv/sbrdsp_init.c | 2 ++ libavcodec/riscv/sbrdsp_rvv.S | 19 +++ 2 files changed, 21 insertions(+) diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c index 837f24e1e0..e0e62278b0 1006

[FFmpeg-devel] [PATCH 3/3] lavc/sbrdsp: R-V V neg_odd_64

2023-10-29 Thread Rémi Denis-Courmont
With 128-bit vectors, this is mostly pointless but also harmless. Performance gains should be more noticeable with larger vector sizes. neg_odd_64_c: 76.2 neg_odd_64_rvv_i64: 74.7 --- libavcodec/riscv/sbrdsp_init.c | 5 + libavcodec/riscv/sbrdsp_rvv.S | 17 + 2 files c

Re: [FFmpeg-devel] [RFC] financial sustainability Plan A (SPI)

2023-10-31 Thread Rémi Denis-Courmont
through would probably not be newsworthy. And it seems unlikely that major ones like Kodi, mpv, VLC, etc, would let this slip through in the first place. > All these news articles are free amplification of the message ;) That most probably will not happen, and if it does, it will most

[FFmpeg-devel] [PATCH] lavc/sbrdsp: R-V V sbr_hf_g_filt

2023-11-01 Thread Rémi Denis-Courmont
hf_g_filt_c: 1552.5 hf_g_filt_rvv_f32: 679.5 --- libavcodec/riscv/sbrdsp_init.c | 3 +++ libavcodec/riscv/sbrdsp_rvv.S | 20 2 files changed, 23 insertions(+) diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c index 1b85b2cae9..71de681185 1006

[FFmpeg-devel] [PATCH] lavc/pixblockdsp: rework R-V V get_pixels_unaligned

2023-11-01 Thread Rémi Denis-Courmont
As in the aligned case, we can use VLSE64.V, though the way of doing so gets more convoluted, so the performance gains are more modest: get_pixels_unaligned_c: 126.7 get_pixels_unaligned_rvv_i32: 145.5 (before) get_pixels_unaligned_rvv_i64: 62.2 (after) For the reference, those are the ali

Re: [FFmpeg-devel] FFmpeg at NAB 2024

2023-11-02 Thread Rémi Denis-Courmont
Hi, FWIW, FFmpeg will most probably be granted a free community booth at the next SCaLE (21x), a month earlier and also in the South-Western USA. If this unfolds as it usually does, we will get confirmation in January. There are no hidden costs *there*. But of course it's a very different crowd of vis

[FFmpeg-devel] [PATCH] lavc/opusdsp: rewrite R-V V postfilter

2023-11-02 Thread Rémi Denis-Courmont
This uses a more traditional approach allowing processing of up to period minus two elements per iteration. This also allows the algorithm to work for all and any vector length. As the T-Head C908 device under test can load 16 elements per loop, there is unsurprisingly a little performance drop whe

Re: [FFmpeg-devel] [PATCH] lavc/opusdsp: rewrite R-V V postfilter

2023-11-02 Thread Rémi Denis-Courmont
On Thursday 2 November 2023, 23.07.03 EET, Rémi Denis-Courmont wrote: > This uses a more traditional approach allowing processing of up to > period minus two elements per iteration. This also allows the algorithm > to work for all and any vector length. > > As the T-H

Re: [FFmpeg-devel] [PATCH 2/6] libavformat/sdp: remove whitespaces in fmtp

2023-11-06 Thread Rémi Denis-Courmont
On Monday 6 November 2023, 17.36.18 EET, Kieran Kunhya wrote: > On Mon, 6 Nov 2023 at 15:19, Michael Riedl > > wrote: > > Whitespaces after semicolon breaks some servers > > Are you sure this patch doesn't break other servers? SDP is a painfully > fragile format. AFAIK, you're not su

[FFmpeg-devel] [PATCH 2/2] lavc/aacpsdsp: rework R-V V hybrid_synthesis_deint

2023-11-08 Thread Rémi Denis-Courmont
Given the size of the data set, strided memory accesses cannot be avoided. We can still do better than the current code. ps_hybrid_synthesis_deint_c: 12065.5 ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before) ps_hybrid_synthesis_deint_rvv_i64: 8181.0 (after) --- libavcodec/riscv/aacpsdsp_

[FFmpeg-devel] [PATCH 1/2] lavc/aacpsdsp: rework R-V V add_squares

2023-11-08 Thread Rémi Denis-Courmont
--- a/libavcodec/riscv/aacpsdsp_rvv.S +++ b/libavcodec/riscv/aacpsdsp_rvv.S @@ -1,5 +1,5 @@ /* - * Copyright © 2022 Rémi Denis-Courmont. + * Copyright © 2022-2023 Rémi Denis-Courmont. * * This file is part of FFmpeg. * @@ -20,13 +20,16 @@ #include "libavutil/riscv/asm.S" -f

[FFmpeg-devel] [PATCH] lavc/sbrdsp: R-V V autocorrelate

2023-11-08 Thread Rémi Denis-Courmont
With 5 accumulator vectors and 6 inputs, this can only use LMUL=2. Also the number of vector loop iterations is small, just 5 on 128-bit vector hardware. The vector loop is somewhat unusual in that it processes data in descending memory order, in order to save on vector slides: in descending order

Re: [FFmpeg-devel] [PATCH v28 1/2] avcodec/evc_encoder: Provided support for EVC encoder

2023-11-09 Thread Rémi Denis-Courmont
Hi, On 9 November 2023 12:16:28 GMT+02:00, "Dawid Kozinski/Multimedia (PLT) /SRPOL/Staff Engineer/Samsung Electronics" wrote: >Hi, > >Both, the implementation of the EVC encoder and decoder for FFmpeg depend on >external libraries (at least for now). They are just wrappers using external >

Re: [FFmpeg-devel] [ANNOUNCE] upcoming vote: extra members for GA

2023-11-09 Thread Rémi Denis-Courmont
On Thursday 9 November 2023, 18.50.52 EET, Michael Niedermayer wrote: > that said, i checked ML subscribers and found > 16 of the people above to be currently subscribed with email addresses > that i found by greping their name. (not posting the list due to privacy > concerns) Thing is, if

Re: [FFmpeg-devel] [ANNOUNCE] upcoming vote: extra members for GA

2023-11-09 Thread Rémi Denis-Courmont
On Thursday 9 November 2023, 19.41.53 EET, Michael Niedermayer wrote: > On Thu, Nov 09, 2023 at 07:04:00PM +0200, Rémi Denis-Courmont wrote: > > On Thursday 9 November 2023, 18.50.52 EET, Michael Niedermayer wrote: > > > that said, i checked ML subscribers and fou

[FFmpeg-devel] [PATCH 1/2] sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar

2023-11-09 Thread Rémi Denis-Courmont
This saves three scratch registers and three instructions per line. The performance gains are mostly negligible. The main point is to free up registers for further rework. --- libswscale/riscv/rgb2rgb_rvv.S | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --g

[FFmpeg-devel] [PATCH 2/2] sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p

2023-11-09 Thread Rémi Denis-Courmont
In my personal opinion, we should not need to support unaligned YUY2 pixel maps. They should always be aligned to at least 32 bits, and the current code assumes just 16 bits. However checkasm does test for unaligned input bitmaps. QEMU accepts it, but real hardware does not. In this particular cas
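For context, the conversion at issue splits packed YUY2 (bytes ordered Y0 U Y1 V) into three planes. The following scalar sketch is an illustration under assumptions, not the code from the patch: the function name is hypothetical and the width is assumed to be even. Because it uses only byte accesses, it works for any source alignment, which is the property checkasm exercises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar reference: split one line of packed YUY2
 * (Y0 U Y1 V) into planar 4:2:2. Only byte loads and stores are used,
 * so src may have any alignment. width counts luma samples.
 */
void yuyv_line_to_422p_sketch(uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                              const uint8_t *src, size_t width)
{
    assert(width % 2 == 0);

    for (size_t i = 0; i < width / 2; i++) {
        ydst[2 * i]     = src[4 * i];     /* first luma sample    */
        udst[i]         = src[4 * i + 1]; /* shared chroma (Cb)   */
        ydst[2 * i + 1] = src[4 * i + 2]; /* second luma sample   */
        vdst[i]         = src[4 * i + 3]; /* shared chroma (Cr)   */
    }
}
```

A faster implementation that treats each pixel pair as a 16-bit or 32-bit element implicitly requires the source pointer to be aligned to that element size on hardware without misaligned vector access support.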

Re: [FFmpeg-devel] [ANNOUNCE] upcoming vote: extra members for GA

2023-11-09 Thread Rémi Denis-Courmont
On Thursday 9 November 2023, 20.11.12 EET, Cosmin Stejerean via ffmpeg-devel wrote: > > On Nov 9, 2023, at 9:53 AM, Rémi Denis-Courmont wrote: > > > > The point is that, whether or not they are on the mailing list, people > > should > > not be volunteer

Re: [FFmpeg-devel] [PATCH 2/2] sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p

2023-11-09 Thread Rémi Denis-Courmont
On Thursday 9 November 2023, 20.34.53 EET, Rémi Denis-Courmont wrote: > In my personal opinion, we should not need to support unaligned YUY2 > pixel maps. They should always be aligned to at least 32 bits, and the > current code assumes just 16 bits. However checkasm does

[FFmpeg-devel] [PATCH] checkasm: test with random bw value

2023-11-09 Thread Rémi Denis-Courmont
With a value of zero, the function is a glorified memory copy. --- tests/checkasm/sbrdsp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/checkasm/sbrdsp.c b/tests/checkasm/sbrdsp.c index 2fb14d5bf8..5cc3b33215 100644 --- a/tests/checkasm/sbrdsp.c +++ b/tests/checkas

Re: [FFmpeg-devel] [PATCH] avcodec/mpegvideo: Remove spec-incompliant inverse quantisation

2023-11-09 Thread Rémi Denis-Courmont
On Thursday 9 November 2023, 22.45.35 EET, Alexander Strasser wrote: > I can't see how the reason for the presence of code can be ultimately > defined objectively and non-arbitrary. Ultimately, this was discussed and decided in a meeting, which Michael attended (albeit remotely) and for wh

[FFmpeg-devel] [PATCH] lavc/sbrdsp: R-V V hf_gen

2023-11-09 Thread Rémi Denis-Courmont
hf_gen_c: 2922.7 hf_gen_rvv_f32: 731.5 --- libavcodec/riscv/sbrdsp_init.c | 4 +++ libavcodec/riscv/sbrdsp_rvv.S | 50 ++ 2 files changed, 54 insertions(+) diff --git a/libavcodec/riscv/sbrdsp_init.c b/libavcodec/riscv/sbrdsp_init.c index c1ed5b639c..e573645

Re: [FFmpeg-devel] [ANNOUNCE] upcoming vote: extra members for GA

2023-11-10 Thread Rémi Denis-Courmont
On 10 November 2023 12:54:30 GMT+02:00, Hendrik Leppkes wrote: >On Thu, Nov 9, 2023 at 6:04 PM Rémi Denis-Courmont wrote: >> >> On Thursday 9 November 2023, 18.50.52 EET, Michael Niedermayer wrote: >> > that said, i checked ML subscribers and found >>

[FFmpeg-devel] [PATCHv2 1/2] sws/rgb2rgb: rework R-V V YUY2 to 4:2:2 planar

2023-11-10 Thread Rémi Denis-Courmont
This saves three scratch registers and three instructions per line. The performance gains are mostly negligible. The main point is to free up registers for further rework. --- libswscale/riscv/rgb2rgb_rvv.S | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --g

[FFmpeg-devel] [PATCHv2 2/2] sws/rgb2rgb: fix unaligned accesses in R-V V YUYV to I422p

2023-11-10 Thread Rémi Denis-Courmont
In my personal opinion, we should not need to support unaligned YUY2 pixel maps. They should always be aligned to at least 32 bits, and the current code assumes just 16 bits. However checkasm does test for unaligned input bitmaps. QEMU accepts it, but real hardware does not. In this particular cas

[FFmpeg-devel] [PATCH] checkasm: test the noise case of sbrdsp.hf_apply_noise

2023-11-10 Thread Rémi Denis-Courmont
The tested functions treat s_m[i] == 0 as a special case. Other than that, the functions are slightly complicated vector additions. This actually makes the zero case happen pseudorandomly. --- tests/checkasm/sbrdsp.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/ch

[FFmpeg-devel] [PATCH] lavc/sbrdsp: R-V V hf_apply_noise functions

2023-11-10 Thread Rémi Denis-Courmont
This is restricted to 128-bit vectors as larger vector sizes could read past the end of the noise array. Support for future hardware with larger vector sizes is left for some other time. hf_apply_noise_0_c: 2319.7 hf_apply_noise_0_rvv_f32: 1229.0 hf_apply_noise_1_c: 2539.0 hf_apply_noi

Re: [FFmpeg-devel] [ANNOUNCE] Repeat vote: GA voters list updates

2023-11-11 Thread Rémi Denis-Courmont
It should go without spelling it out but such a community-hostile attitude seems very ill-advised to me for somebody who is running for CC election or reelection. -- Rémi Denis-Courmont http://www.remlab.net/

Re: [FFmpeg-devel] [ANNOUNCE] Repeat vote: GA voters list updates

2023-11-11 Thread Rémi Denis-Courmont
On Saturday 11 November 2023, 13.15.37 EET, Nicolas George wrote: > Rémi Denis-Courmont (12023-11-11): > > 1) As far as was communicated, the total of alleged discrepancies in the > > voter list could not affect the result. That makes the vote valid in my > > book, and

Re: [FFmpeg-devel] [ANNOUNCE] upcoming vote: TC/CC elections

2023-11-11 Thread Rémi Denis-Courmont
On Sunday 5 November 2023, 12.02.05 EET, Anton Khirnov wrote: > Anyone else wishing to volunteer for TC or CC, please reply to this > email. I hereby "volunteer" for the CC. For those who don't know me, I am a research engineer in system software working for a large telecommunication

[FFmpeg-devel] [PATCH] lavc/exrdsp: unroll predictor

2023-11-11 Thread Rémi Denis-Courmont
With explicit unrolling, we can skip half of the sign bit flips, and the compiler is then better able to optimise the scalar loop: predictor_c: 31376.0 (before) predictor_c: 23703.0 (after) --- libavcodec/exrdsp.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git
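Why unrolling saves half of the sign bit flips can be checked with a small sketch. It assumes the scalar predictor is the running sum src[i] += src[i-1] - 128 on bytes; the function names are hypothetical and this is not the patch itself. Subtracting 128 modulo 256 is the same as flipping the top bit, and two consecutive subtractions add up to 256, which vanishes in 8-bit arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference loop (assumed form): one bias removal per byte. */
static void predictor_ref(uint8_t *src, size_t size)
{
    for (size_t i = 1; i < size; i++)
        src[i] = (uint8_t)(src[i] + src[i - 1] - 128);
}

/* Unrolled by two: only every other byte needs the sign bit flip. */
static void predictor_unrolled(uint8_t *src, size_t size)
{
    size_t i = 1;

    for (; i + 1 < size; i += 2) {
        /* Sum without the bias. */
        uint8_t t = (uint8_t)(src[i] + src[i - 1]);

        /* First byte: t - 128 (mod 256), i.e. flip the top bit. */
        src[i] = t ^ 0x80;
        /* Second byte: the two -128 terms cancel (mod 256). */
        src[i + 1] = (uint8_t)(src[i + 1] + t);
    }
    /* Leftover byte when the count of remaining bytes is odd. */
    if (i < size)
        src[i] = (uint8_t)(src[i] + src[i - 1] - 128);
}
```

Both functions produce identical output for any buffer; the second simply does one XOR per pair of bytes instead of one subtraction per byte.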

[FFmpeg-devel] [PATCH] lavc/opusdsp: R-V V deemphasis function

2023-11-11 Thread Rémi Denis-Courmont
Considering the marginality of the measured performance gains (3-4%), I suppose that we should not merge this. Furthermore those measurements are not expected to improve with large vector sizes, since the code uses only 32 bits per vector no matter what. deemphasis_c: 7703.2 deemphasis_rvv_f32: 74

[FFmpeg-devel] [PATCH] checkasm/huffyuvdsp: test for add_hfyu_left_pred_bgr32

2023-11-12 Thread Rémi Denis-Courmont
--- tests/checkasm/huffyuvdsp.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/tests/checkasm/huffyuvdsp.c b/tests/checkasm/huffyuvdsp.c index 6ba27e267f..a08f5a8391 100644 --- a/tests/checkasm/huffyuvdsp.c +++ b/tests/checkasm/huffyuvdsp.c @@ -64,6 +64,34 @@ s

[FFmpeg-devel] [PATCH 1/1] lavc/huffyuvdsp: basic R-V V add_hfyu_left_pred_bgr32

2023-11-12 Thread Rémi Denis-Courmont
Better performance can probably be achieved with a more intricate unrolled loop, but this is a start: add_hfyu_left_pred_bgr32_c: 15084.0 add_hfyu_left_pred_bgr32_rvv_i32: 10280.2 This would actually be cleaner with the RISC-V P extension, but that is not ratified yet (I think?) and usually not s
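For readers unfamiliar with the function, left prediction on 32-bit pixels can be modelled as a per-channel running sum over the four bytes of each pixel. The sketch below is a plain scalar model under that assumption, with a hypothetical name and signature; it is neither the FFmpeg C reference nor the R-V V code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar model: each of the 4 channels accumulates
 * independently, modulo 256. left[] carries the running sums in
 * and out so that a caller can continue on the next chunk.
 */
static void left_pred_bgr32_sketch(uint8_t *dst, const uint8_t *src,
                                   size_t w, uint8_t left[4])
{
    for (size_t i = 0; i < w; i++) {
        for (int c = 0; c < 4; c++) {
            left[c] = (uint8_t)(left[c] + src[4 * i + c]);
            dst[4 * i + c] = left[c];
        }
    }
}
```

Each output pixel depends on the previous one, so the loop is a serial dependency chain across pixels. That is one reason a straightforward vector version gains less here than on fully independent element-wise operations.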

[FFmpeg-devel] [PATCH] checkasm: add lossless audio DSP

2023-11-12 Thread Rémi Denis-Courmont
--- tests/checkasm/Makefile | 1 + tests/checkasm/checkasm.c | 3 + tests/checkasm/checkasm.h | 1 + tests/checkasm/llauddsp.c | 115 ++ 4 files changed, 120 insertions(+) create mode 100644 tests/checkasm/llauddsp.c diff --git a/tests/checkasm/Makefil

[FFmpeg-devel] [PATCH 2/2] lavc/llauddsp: R-V V scalarproduct_and_madd_int32

2023-11-12 Thread Rémi Denis-Courmont
scalarproduct_and_madd_int32_c: 10899.7 scalarproduct_and_madd_int32_rvv_i32: 1749.0 --- libavcodec/riscv/llauddsp_init.c | 4 libavcodec/riscv/llauddsp_rvv.S | 26 ++ 2 files changed, 30 insertions(+) diff --git a/libavcodec/riscv/llauddsp_init.c b/libavcodec/
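As the name suggests, the operation fuses a dot product with a multiply-add update of one of its inputs. The sketch below is a scalar model of that idea only; the function name, the element types and the wrap-around behaviour are assumptions for illustration, not the FFmpeg reference.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: returns sum(v1[i] * v2[i]) computed on the
 * values of v1 before the update, while updating v1[i] += mul * v3[i]
 * in the same pass. The accumulator is unsigned so that overflow
 * wraps instead of being undefined.
 */
static int32_t dot_and_madd_sketch(int16_t *v1, const int32_t *v2,
                                   const int16_t *v3, size_t n, int mul)
{
    uint32_t res = 0;

    for (size_t i = 0; i < n; i++) {
        res += (uint32_t)v1[i] * (uint32_t)v2[i];
        v1[i] = (int16_t)(v1[i] + mul * v3[i]);
    }
    return (int32_t)res;
}
```

Every iteration is independent apart from the final sum, so the loop maps naturally onto vector multiply-accumulate followed by a single reduction.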

[FFmpeg-devel] [PATCH 1/2] lavc/llauddsp: R-V V scalarproduct_and_madd_int16

2023-11-12 Thread Rémi Denis-Courmont
) += riscv/pixblockdsp_init.o \ diff --git a/libavcodec/riscv/llauddsp_init.c b/libavcodec/riscv/llauddsp_init.c new file mode 100644 index 00..ea023f73e6 --- /dev/null +++ b/libavcodec/riscv/llauddsp_init.c @@ -0,0 +1,40 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This file is

Re: [FFmpeg-devel] [PATCH] checkasm: add lossless audio DSP

2023-11-12 Thread Rémi Denis-Courmont
Hi, This seems to show that the SSSE3 optimisation is no better than the SSE2, at least on my AMD Ryzen. Does anyone know why it's there? Should it be purged? Br,

Re: [FFmpeg-devel] [PATCH] checkasm: add lossless audio DSP

2023-11-13 Thread Rémi Denis-Courmont
On 13 November 2023 11:07:21 GMT+02:00, Paul B Mahol wrote: >On Mon, Nov 13, 2023 at 7:42 AM Rémi Denis-Courmont wrote: > >> Hi, >> >> This seems to show that the SSSE3 optimisation is no better than the SSE2, >> at least on my AMD Ryzen. Does anyone kno

Re: [FFmpeg-devel] [PATCH] af_afir: RISC-V V fcmul_add

2023-11-13 Thread Rémi Denis-Courmont
Hi, On Monday, 13 November 2023, 11.43.01 EET, flow gg wrote: > Sorry for the long delay in responding. No problem. Working with T-Head C910 (or C920?) cores is very tedious. I gave up on that and switched over to Kendryte K230 (based on C908) now. > How is the modified patch now?

Re: [FFmpeg-devel] [PATCH] checkasm: add lossless audio DSP

2023-11-13 Thread Rémi Denis-Courmont
On Monday, 13 November 2023, 11.17.57 EET, Rémi Denis-Courmont wrote: > On 13 November 2023 11:07:21 GMT+02:00, Paul B Mahol wrote: > >On Mon, Nov 13, 2023 at 7:42 AM Rémi Denis-Courmont wrote: > >> Hi, > >> > >> This seems to show that the

[FFmpeg-devel] [PATCH] checkasm/flacdsp: fix ls/rs/ms tests

2023-11-13 Thread Rémi Denis-Courmont
decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively. The code ended up testing indep ([0]) twice, ms never, and misnaming the other two. --- tests/checkasm/flacdsp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/checkasm/flacdsp.c b/tests/checkas

[FFmpeg-devel] [PATCH 1/2] lavc/flacdsp: R-V V packed decorrelate_{l, r}s

2023-11-13 Thread Rémi Denis-Courmont
/libavcodec/riscv/flacdsp_init.c b/libavcodec/riscv/flacdsp_init.c new file mode 100644 index 00..a3415d6d55 --- /dev/null +++ b/libavcodec/riscv/flacdsp_init.c @@ -0,0 +1,55 @@ +/* + * Copyright © 2023 Rémi Denis-Courmont. + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can

[FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: R-V V decorrelate_ms packed

2023-11-13 Thread Rémi Denis-Courmont
flac_decorrelate_ms_16_c: 585.5 flac_decorrelate_ms_16_rvv_i32: 263.0 flac_decorrelate_ms_32_c: 584.7 flac_decorrelate_ms_32_rvv_i32: 250.0 --- libavcodec/riscv/flacdsp_init.c | 6 libavcodec/riscv/flacdsp_rvv.S | 49 + 2 files changed, 55 inserti

Re: [FFmpeg-devel] [ANNOUNCE] Repeat vote: GA voters list updates

2023-11-14 Thread Rémi Denis-Courmont
On Tuesday, 14 November 2023, 17.56.24 EET, Tomas Härdin wrote: > Ballots should be public IMO, secret voting is cowardice. The French (XIXth century) Empire used notoriously public ballots, and the results were skewed to say the least. There is a good reason why ballots are supposed to b

Re: [FFmpeg-devel] [PATCH v2 2/6] libavformat/sdp: remove whitespaces in fmtp

2023-11-14 Thread Rémi Denis-Courmont
peg do send them, or just by implementation accident. -- Rémi Denis-Courmont http://www.remlab.net/

[FFmpeg-devel] [PATCH] lavc/flacdsp: R-V V decorrelate_indep 32-bit

2023-11-14 Thread Rémi Denis-Courmont
flac_decorrelate_indep2_32_c: 981.7 flac_decorrelate_indep2_32_rvv_i32: 183.7 flac_decorrelate_indep4_32_c: 1749.7 flac_decorrelate_indep4_32_rvv_i32: 362.5 flac_decorrelate_indep6_32_c: 2517.7 flac_decorrelate_indep6_32_rvv_i32: 715.2 flac_decorrelate_indep8_32_c: 3285.7 flac_
