Re: [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp

2023-06-15 Thread Lynne
Jun 15, 2023, 12:37 by shenpeit...@eswincomputing.com:

> From: Shen Peiting 
>
> We optimized the six interfaces of AC3 init with RVV. Performance was
> measured on Spike, the RISC-V ISA simulator, and the results are
> attached to each commit.
>
> shenpeiting (6):
>  lavc/ac3dsp: RISC-V V ac3_exponent_min
>  lavc/ac3dsp: RISC-V V float_to_fixed24
>  lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
>  lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
>  lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
>  lavc/ac3dsp: RISC-V B ac3_extract_exponents
>
>  libavcodec/ac3dsp.c            |   2 +
>  libavcodec/ac3dsp.h            |   1 +
>  libavcodec/riscv/Makefile      |   3 +
>  libavcodec/riscv/ac3dsp_init.c |  60 +
>  libavcodec/riscv/ac3dsp_rvb.S  |  42 ++
>  libavcodec/riscv/ac3dsp_rvv.S  | 225 +
>  6 files changed, 333 insertions(+)
>  create mode 100644 libavcodec/riscv/ac3dsp_init.c
>  create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
>  create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
>

Could you implement checkasm for this? It shouldn't
be more than a hundred lines, and there are examples,
tests/checkasm/aacpsdsp.c being the most similar.
Since CPUs with the needed extensions aren't released,
we're not doing any FATE runs, and so if the results don't
match the C version, we'll end up with broken code once
they do exist. And no one wants to debug someone else's
assembly.
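
The kind of check being asked for can be sketched in plain C. This is not
the real checkasm harness: it only illustrates the principle of running a
reference version and an optimized version on the same random input and
comparing the output buffers. The function names, the 256-byte block
stride, the block count and the 0..24 exponent range are assumptions made
for the illustration, and exponent_min_opt is a stand-in for an assembly
routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_STRIDE 256 /* assumed distance between blocks, in bytes */
#define MAX_BLOCKS   6   /* assumed: first block plus up to 5 reuse blocks */

/* Reference: for each coefficient, keep the minimum exponent found in
 * the first block and in the num_reuse_blocks blocks that follow it. */
static void exponent_min_ref(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    assert(nb_coefs <= BLOCK_STRIDE);
    if (!num_reuse_blocks)
        return;
    for (int i = 0; i < nb_coefs; i++) {
        uint8_t min_exp = exp[i];
        for (int blk = 1; blk <= num_reuse_blocks; blk++) {
            uint8_t next = exp[i + blk * BLOCK_STRIDE];
            if (next < min_exp)
                min_exp = next;
        }
        exp[i] = min_exp;
    }
}

/* Stand-in for the optimized routine: same result, block-major loops. */
static void exponent_min_opt(uint8_t *exp, int num_reuse_blocks, int nb_coefs)
{
    assert(nb_coefs <= BLOCK_STRIDE);
    for (int blk = 1; blk <= num_reuse_blocks; blk++) {
        const uint8_t *src = exp + blk * BLOCK_STRIDE;
        for (int i = 0; i < nb_coefs; i++)
            if (src[i] < exp[i])
                exp[i] = src[i];
    }
}

/* Returns 0 if both versions agree for every reuse count, 1 otherwise. */
static int check_exponent_min(unsigned seed)
{
    static uint8_t buf_ref[MAX_BLOCKS * BLOCK_STRIDE];
    static uint8_t buf_opt[MAX_BLOCKS * BLOCK_STRIDE];

    srand(seed);
    for (int reuse = 0; reuse < MAX_BLOCKS; reuse++) {
        for (size_t i = 0; i < sizeof(buf_ref); i++)
            buf_ref[i] = (uint8_t)(rand() % 25); /* assumed range 0..24 */
        memcpy(buf_opt, buf_ref, sizeof(buf_ref));
        exponent_min_ref(buf_ref, reuse, BLOCK_STRIDE);
        exponent_min_opt(buf_opt, reuse, BLOCK_STRIDE);
        /* Compare whole buffers so stray writes are caught as well. */
        if (memcmp(buf_ref, buf_opt, sizeof(buf_ref)))
            return 1;
    }
    return 0;
}
```

Calling check_exponent_min() with a few different seeds from a test driver
and failing on any nonzero return gives the same kind of guarantee: the
optimized code is only trusted as far as it matches the C version.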

Those results look far too optimistic, and I'm guessing
it's because they're using a theoretical huge vector size
limit. Could you re-test with something more realistic,
like 256-bit vectors, using checkasm --bench?
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 0/6] RISC-V initial ac3dsp

2023-06-15 Thread Rémi Denis-Courmont
On Thursday, 15 June 2023, 16:57:18 EEST, Lynne wrote:
> Jun 15, 2023, 12:37 by shenpeit...@eswincomputing.com:
> > From: Shen Peiting 
> > 
> > We optimized the six interfaces of AC3 init with RVV. Performance was
> > measured on Spike, the RISC-V ISA simulator, and the results are
> > attached to each commit.
> > 
> > shenpeiting (6):
> >  lavc/ac3dsp: RISC-V V ac3_exponent_min
> >  lavc/ac3dsp: RISC-V V float_to_fixed24
> >  lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
> >  lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
> >  lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
> >  lavc/ac3dsp: RISC-V B ac3_extract_exponents
> >  
> >  libavcodec/ac3dsp.c            |   2 +
> >  libavcodec/ac3dsp.h            |   1 +
> >  libavcodec/riscv/Makefile      |   3 +
> >  libavcodec/riscv/ac3dsp_init.c |  60 +
> >  libavcodec/riscv/ac3dsp_rvb.S  |  42 ++
> >  libavcodec/riscv/ac3dsp_rvv.S  | 225 +
> >  6 files changed, 333 insertions(+)
> >  create mode 100644 libavcodec/riscv/ac3dsp_init.c
> >  create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
> >  create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
> 
> Could you implement checkasm for this? It shouldn't
> be more than a hundred lines, and there are examples,
> tests/checkasm/aacpsdsp.c being the most similar.
> Since CPUs with the needed extensions aren't released,
> we're not doing any FATE runs,

Well... I accept hardware donations (with regular USB-C power supply and 
passive cooling) to back what would be the third generation of RISC-V FATE 
instances.

Until RVV 1.0 hardware is made of silicon rather than unobtainium, I also
accept Lichee Pi4A or equivalent hardware bundles, which would be able to
run most (but definitely not all) of FFmpeg's RVV functions with a sizable
amount of kludging.

> and so if the results don't
> match the C version, we'll end up with broken code once
> they do exist. And no one wants to debug someone else's
> assembly.
> 
> Those results look far too optimistic, and I'm guessing
> it's because they're using a theoretical huge vector size
> limit. Could you re-test with something more realistic,
> like 256-bit vectors, using checkasm --bench?

It could also be that Spike counts every instruction as one cycle, regardless
of the group multiplier, so the vector size is not (the only) explanation.
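
To illustrate why that would inflate the numbers, here is a rough sketch.
It assumes integer group multipliers only, and modeled_cycles() is a
hypothetical cost model in which an instruction occupies a VLEN-wide
datapath for LMUL cycles. Neither Spike's actual accounting nor any real
core's timing is claimed here.

```c
#include <assert.h>

/* Elements handled by one vector instruction: VLEN * LMUL / SEW. */
static unsigned elems_per_insn(unsigned vlen_bits, unsigned lmul,
                               unsigned sew_bits)
{
    assert(sew_bits > 0 && vlen_bits * lmul >= sew_bits);
    return vlen_bits * lmul / sew_bits;
}

/* Vector instructions needed for n elements (rounded up). This is what
 * a simulator charging one cycle per instruction would report. */
static unsigned insn_count(unsigned n, unsigned vlen_bits, unsigned lmul,
                           unsigned sew_bits)
{
    unsigned per = elems_per_insn(vlen_bits, lmul, sew_bits);
    return (n + per - 1) / per;
}

/* Hypothetical model: each instruction costs LMUL cycles. */
static unsigned modeled_cycles(unsigned n, unsigned vlen_bits, unsigned lmul,
                               unsigned sew_bits)
{
    return insn_count(n, vlen_bits, lmul, sew_bits) * lmul;
}
```

For 256 32-bit elements with 256-bit vectors, insn_count() gives 32 with
LMUL=1 and 4 with LMUL=8, an apparent 8x gain, while modeled_cycles()
gives 32 in both cases. Under that assumption, raising the group
multiplier makes instruction-count results look better without any real
speedup.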

-- 
Rémi Denis-Courmont
http://www.remlab.net/


