On Thursday, 15 June 2023, 16:57:18 EEST, Lynne wrote:
> Jun 15, 2023, 12:37 by shenpeit...@eswincomputing.com:
>
> > From: Shen Peiting <shenpeit...@eswincomputing.com>
> >
> > We optimized the six interfaces of AC3 init by RVV, the optimized
> > performance was tested on the RISC-V ISA simulator--Spike, and the
> > results were attached to each commit.
> >
> > shenpeiting (6):
> >   lavc/ac3dsp: RISC-V V ac3_exponent_min
> >   lavc/ac3dsp: RISC-V V float_to_fixed24
> >   lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_int32
> >   lavc/ac3dsp: RISC-V V ac3_sum_square_butterfly_float
> >   lavc/ac3dsp: RISC-V V ac3_compute_mantissa_size
> >   lavc/ac3dsp: RISC-V B ac3_extract_exponents
> >
> >  libavcodec/ac3dsp.c            |   2 +
> >  libavcodec/ac3dsp.h            |   1 +
> >  libavcodec/riscv/Makefile      |   3 +
> >  libavcodec/riscv/ac3dsp_init.c |  60 +++++++++
> >  libavcodec/riscv/ac3dsp_rvb.S  |  42 ++++++
> >  libavcodec/riscv/ac3dsp_rvv.S  | 225 +++++++++++++++++++++++++++++++++
> >  6 files changed, 333 insertions(+)
> >  create mode 100644 libavcodec/riscv/ac3dsp_init.c
> >  create mode 100644 libavcodec/riscv/ac3dsp_rvb.S
> >  create mode 100644 libavcodec/riscv/ac3dsp_rvv.S
>
> Could you implement checkasm for this? It shouldn't
> be more than a hundred lines, and there are examples,
> tests/checkasm/aacpsdsp.c being the most similar.
> Since CPUs with the needed extensions aren't released,
> we're not doing any FATE runs,

Well... I accept hardware donations (with regular USB-C power supply and
passive cooling) to back what would be the third generation of RISC-V FATE
instances. Until R-V-V 1.0 hardware production replaces unobtainium with
silicon, I also accept Lichee Pi 4A or equivalent hardware bundles, which
would be able to run most (but definitely not all) of FFmpeg's RVV functions
with a sizable amount of kludging.

> and so if the results don't
> match the C version, we'll end up with broken code once
> they do exist. And no one wants to debug someone else's
> assembly.
>
> Those results look far too optimistic, and I'm guessing
> it's because they're using a theoretical huge vector size
> limit. Could you re-test with something more realistic,
> like 256-bit vectors, using checkasm --bench?

It could also be that Spike counts everything as one cycle, regardless of the
group multiplier, not (just) the vector size.

-- 
Rémi Denis-Courmont
http://www.remlab.net/