On Thursday, 15 June 2023, 17:58:37 EEST, Arnie Chang wrote:
> Since these functions are frequently called, I prefer instantiating similar
> code many times rather than calling another internal function, as it may
> introduce additional function call overhead.
This works both ways.
On Wed, Jun 14, 2023 at 11:57 PM Rémi Denis-Courmont wrote:
> It looks like \width is only ever used as AVL. You could advantageously
> pass it as a run-time argument to an internal function, and spare the
> instruction cache, instead of instantiating otherwise identical code thrice.
>
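The trade-off under discussion can be sketched in plain C instead of RVV assembly. This is only an illustration, not the patch's code: `put_chroma_mc_internal` and the `_sketch` wrappers are hypothetical names, and the kernel is the usual H.264 bilinear chroma filter written as a scalar loop. It shows the variant Rémi suggests, where the block width is a run-time argument of one shared routine, so the three block sizes share a single copy of the code at the cost of one extra call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared kernel: the block width is a run-time argument,
 * so one copy of the filter serves the 8xH, 4xH and 2xH cases.
 * The source must provide width + 1 columns and h + 1 rows. */
static void put_chroma_mc_internal(uint8_t *dst, const uint8_t *src,
                                   ptrdiff_t stride, int width, int h,
                                   int x, int y)
{
    /* Bilinear weights of H.264 chroma motion compensation; they sum to 64. */
    const int A = (8 - x) * (8 - y);
    const int B = x * (8 - y);
    const int C = (8 - x) * y;
    const int D = x * y;

    assert(x >= 0 && x < 8 && y >= 0 && y < 8);

    for (int i = 0; i < h; i++) {
        for (int j = 0; j < width; j++)
            dst[j] = (uint8_t)((A * src[j] + B * src[j + 1] +
                                C * src[j + stride] +
                                D * src[j + stride + 1] + 32) >> 6);
        dst += stride;
        src += stride;
    }
}

/* Thin entry points (hypothetical names): each only supplies its width. */
void put_chroma_mc8_sketch(uint8_t *dst, const uint8_t *src,
                           ptrdiff_t stride, int h, int x, int y)
{
    put_chroma_mc_internal(dst, src, stride, 8, h, x, y);
}

void put_chroma_mc4_sketch(uint8_t *dst, const uint8_t *src,
                           ptrdiff_t stride, int h, int x, int y)
{
    put_chroma_mc_internal(dst, src, stride, 4, h, x, y);
}

void put_chroma_mc2_sketch(uint8_t *dst, const uint8_t *src,
                           ptrdiff_t stride, int h, int x, int y)
{
    put_chroma_mc_internal(dst, src, stride, 2, h, x, y);
}
```

The alternative Arnie prefers would expand the loop body once per width (for example through a macro), trading a larger code footprint for the absence of the inner call.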
On Friday, 9 June 2023, 10:17:27 EEST, Arnie Chang wrote:
> Optimize the put and avg filtering for 4xH and 2xH blocks
>
> Signed-off-by: Arnie Chang
> ---
> checkasm: using random seed 3475799765
> RVVi32:
> - h264chroma.chroma_mc [OK]
> checkasm: all 6 tests passed
>
On Monday, 12 June 2023, 18:28:34 EEST, Arnie Chang wrote:
> On Mon, Jun 12, 2023 at 10:59 PM Rémi Denis-Courmont wrote:
> > It would seem simpler and more intuitive to just use `.if` here.
> > (Ditto below.)
>
> hi,
> Do you mean using .if to modify this line of code?
On Mon, Jun 12, 2023 at 10:59 PM Rémi Denis-Courmont wrote:
> It would seem simpler and more intuitive to just use `.if` here.
> (Ditto below.)
>
hi,
Do you mean using .if to modify this line of code?
+        vsetivli        t3, \width, e8, m1, ta, mu
On Friday, 9 June 2023, 10:17:27 EEST, Arnie Chang wrote:
> Optimize the put and avg filtering for 4xH and 2xH blocks
>
> Signed-off-by: Arnie Chang
> diff --git a/libavcodec/riscv/h264_mc_chroma.S b/libavcodec/riscv/h264_mc_chroma.S
> index 364bc3156e..c97cdbad86 100644
> ---
On Sat, Jun 10, 2023 at 10:55 PM Lynne wrote:
> Why do they all have the same timing?
>
The processing procedure for these workloads is the same,
except for the difference in block width. (8xH, 4xH, 2xH)
So, the number of instructions remains constant.
Since these workloads handle a small
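The point about a constant instruction count can be made concrete with a small counting model. This is a sketch under stated assumptions, not a measurement: `vector_ops_per_block`, the figure of 16 lanes (e8, m1 with VLEN = 128) and the per-row operation count are all hypothetical. When a whole row fits in one vector register, the number of vector instructions depends on H only, not on the block width.

```c
#include <assert.h>

/* Toy model: number of vector instructions issued for a width x h block.
 * vlmax       : elements one vector instruction can process (assumed value)
 * ops_per_row : vector instructions needed per strip of a row (assumed value) */
static int vector_ops_per_block(int width, int h, int vlmax, int ops_per_row)
{
    assert(width > 0 && h > 0 && vlmax > 0 && ops_per_row > 0);

    /* Strips needed to cover one row; 1 whenever width <= vlmax. */
    int strips = (width + vlmax - 1) / vlmax;

    return h * strips * ops_per_row;
}
```

With `vlmax = 16`, widths 8, 4 and 2 all need a single strip per row, so the model returns the same count for 8xH, 4xH and 2xH blocks of equal height.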
Jun 9, 2023, 09:17 by arnie.chang-at-sifive@ffmpeg.org:
> Optimize the put and avg filtering for 4xH and 2xH blocks
>
> Signed-off-by: Arnie Chang
> ---
> checkasm: using random seed 3475799765
> RVVi32:
> - h264chroma.chroma_mc [OK]
> checkasm: all 6 tests passed
> avg_h264_chroma_mc1_8_c:
Optimize the put and avg filtering for 4xH and 2xH blocks
Signed-off-by: Arnie Chang
---
checkasm: using random seed 3475799765
RVVi32:
- h264chroma.chroma_mc [OK]
checkasm: all 6 tests passed
avg_h264_chroma_mc1_8_c: 1821.5
avg_h264_chroma_mc1_8_rvv_i32: 466.5
avg_h264_chroma_mc2_8_c: 939.2
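The checkasm figures above (lower is faster) can be turned into a speedup ratio with a one-line helper. This is a minimal sketch; `speedup` is a hypothetical name, and only the one pair of figures that appears in full above is used.

```c
#include <assert.h>

/* Ratio of the C reference timing to the optimized timing. */
static double speedup(double c_time, double opt_time)
{
    assert(c_time > 0.0 && opt_time > 0.0);
    return c_time / opt_time;
}
```

For avg_h264_chroma_mc1_8, `speedup(1821.5, 466.5)` is roughly 3.9.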