Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-10 Thread flow gg
Happy new year ~

Yes, I've tried reordering.

On Sat, Feb 10, 2024 at 17:18, Rémi Denis-Courmont wrote:
> Happy new year,
>
> The gains are -unsurprisingly- modest here. Did you try to reorder
> instructions to improve scheduling?
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-10 Thread Rémi Denis-Courmont
Happy new year,

The gains are -unsurprisingly- modest here. Did you try to reorder instructions to improve scheduling?

--
Rémi Denis-Courmont
http://www.remlab.net/
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-09 Thread flow gg
Okay, I have updated them in the response.

On Sat, Feb 10, 2024 at 05:14, Rémi Denis-Courmont wrote:
> On Wednesday, 7 February 2024, 2:12:22 EET, flow gg wrote:
> > My carelessness.. fixed it in the reply.
>
> I know I said to avoid scalar multiplications, but this may be taking it a
> little too far.

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-09 Thread Rémi Denis-Courmont
On Wednesday, 7 February 2024, 2:12:22 EET, flow gg wrote:
> My carelessness.. fixed it in the reply.

I know I said to avoid scalar multiplications, but this may be taking it a little too far. Either this works:

    slli t1, t0, 9
    sh2add t0, t0, t0
    sub t0, t1, t0

or just:

    li t1,
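For readers following the arithmetic, here is a minimal C sketch of what the three-instruction sequence quoted above computes. It assumes the Zba semantics of `sh2add rd, rs1, rs2` (that is, `rd = (rs1 << 2) + rs2`). The function name `mul_shift_add` is hypothetical, and the constant 507 is derived from the instructions themselves, not taken from the truncated message.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the quoted sequence on a 64-bit register:
 *   slli   t1, t0, 9     ->  t1 = t0 << 9        (512 * x)
 *   sh2add t0, t0, t0    ->  t0 = (t0 << 2) + t0 (  5 * x)
 *   sub    t0, t1, t0    ->  t0 = t1 - t0        (507 * x)
 * Shifts are done on unsigned values to avoid undefined behaviour
 * for negative inputs in C. */
static int64_t mul_shift_add(int64_t x)
{
    uint64_t ux = (uint64_t)x;
    uint64_t t1 = ux << 9;          /* slli   t1, t0, 9  */
    uint64_t t0 = (ux << 2) + ux;   /* sh2add t0, t0, t0 */
    return (int64_t)(t1 - t0);      /* sub    t0, t1, t0 */
}
```

Under these assumptions the result equals `507 * x` for any input small enough not to overflow, so the sequence stands in for a single scalar multiply by a constant.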

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-06 Thread flow gg
My carelessness.. fixed it in the reply.

On Wed, Feb 7, 2024 at 01:26, Rémi Denis-Courmont wrote:
> Hi,
>
> I'm not sure why you're mixing element sizes this way, but the code should
> not even compile due to mismatched extensions.
>
> --
> Rémi Denis-Courmont
> http://www.remlab.net/

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-02-06 Thread Rémi Denis-Courmont
Hi,

I'm not sure why you're mixing element sizes this way, but the code should not even compile due to mismatched extensions.

--
Rémi Denis-Courmont
http://www.remlab.net/

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-01-31 Thread flow gg
> Also fractional multiplier should never be smaller than the ratio of the
> specified element size to the largest element size used in the function. Here
> it is largely inconsequential, but for instance "e32, mf4" and "e64, mf2" are
> invalid.

Thanks, I indeed almost forgot about this part

> I
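The rule quoted above can be written as a one-line check. The following is a sketch only: the function name `frac_lmul_ok` and its parameters are hypothetical, and using 64 as the largest element size is an assumption for illustration (it matches ELEN on a typical RV64 vector implementation), not something stated in the thread.

```c
#include <stdbool.h>

/* Hypothetical check of the rule from the review comment:
 * the fractional multiplier 1/lmul_denom must not be smaller than
 * sew / max_sew, i.e. sew * lmul_denom <= max_sew.
 *   sew        : element size selected in vsetvli (8, 16, 32, 64)
 *   lmul_denom : 2 for mf2, 4 for mf4, 8 for mf8
 *   max_sew    : largest element size used (assumed 64 in the examples) */
static bool frac_lmul_ok(unsigned sew, unsigned lmul_denom, unsigned max_sew)
{
    return sew * lmul_denom <= max_sew;
}
```

With `max_sew` taken as 64, `frac_lmul_ok(32, 4, 64)` and `frac_lmul_ok(64, 2, 64)` are both false, matching the two combinations the review calls invalid, while `frac_lmul_ok(32, 2, 64)` and `frac_lmul_ok(8, 8, 64)` are true.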

Re: [FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-01-31 Thread Rémi Denis-Courmont
Hi,

I think this breaks the build for RV32, and it lacks checks for the vector length.

Also fractional multiplier should never be smaller than the ratio of the specified element size to the largest element size used in the function. Here it is largely inconsequential, but for instance "e32,

[FFmpeg-devel] [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

2024-01-31 Thread flow gg
From 7e1c8d6b73afad9885222c0c9012543aface5397 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Wed, 31 Jan 2024 19:03:20 +0800
Subject: [PATCH 2/4] lavc/rv34dsp: R-V V rv34_inv_transform_dc

C908:
rv34_inv_transform_dc_c: 35.5
rv34_inv_transform_dc_rvv_i32: 27.0
---
libavcodec/riscv/Makefile