Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-07 Thread Sean McGovern
Hi,
On Sat, Jul 6, 2024, 16:18 Rémi Denis-Courmont wrote:
> On Saturday, 6 July 2024, 23:00:47 EEST, Sean McGovern wrote:
> > Does wasted32 (and I guess wasted33 by proxy) not have to worry about loop tails?
> > I noticed the other vectorized versions don't do anything special in

Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-06 Thread Rémi Denis-Courmont
On Saturday, 6 July 2024, 23:00:47 EEST, Sean McGovern wrote:
> Does wasted32 (and I guess wasted33 by proxy) not have to worry about loop tails?
> I noticed the other vectorized versions don't do anything special in that regard.
Frankly, RISC-V vectors (like Arm SVE's) are scalable s
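The loop-tail question quoted above can be illustrated with a minimal, portable C sketch. It is not the code from the patch: it assumes the operation is a per-sample left shift by `wasted` bits over `len` samples, and the names `wasted32_scalar` and `wasted32_blocked` are hypothetical. A fixed-width kernel that consumes four samples per step either needs a guarantee about `len` from its caller or has to finish the remainder separately, as shown here:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference (hypothetical name): shift each sample left by `wasted`.
 * The shift is done on the unsigned value to avoid undefined behaviour
 * when shifting negative samples. */
static void wasted32_scalar(int32_t *decoded, int wasted, int len)
{
    assert(wasted >= 0 && wasted < 32);
    for (int i = 0; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}

/* Stand-in for a fixed 128-bit vector loop (hypothetical name):
 * four 32-bit samples per step, followed by a scalar tail. */
static void wasted32_blocked(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    assert(wasted >= 0 && wasted < 32);

    /* Main body: only runs while a full block of four samples remains. */
    for (; i + 4 <= len; i += 4)
        for (int lane = 0; lane < 4; lane++)
            decoded[i + lane] =
                (int32_t)((uint32_t)decoded[i + lane] << wasted);

    /* Tail: the 0 to 3 samples left when len is not a multiple of four. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

If the caller always passes a suitably padded buffer, the tail loop can be dropped; whether that holds for this function is exactly what the question is asking.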

Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-06 Thread Sean McGovern
Hi,
On Thu, Jul 4, 2024, 13:54 Rémi Denis-Courmont wrote:
> On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> > Is that correlated with the comment above re: len? Or is it more general
> > that I should unroll until I've exhausted the available vector registers?
>
> You s

Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-04 Thread Rémi Denis-Courmont
On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> Is that correlated with the comment above re: len? Or is it more general
> that I should unroll until I've exhausted the available vector registers?
You should unroll if it improves bandwidth.
--
レミ・デニ-クールモン http://www.reml
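As a rough illustration of what unrolling means here, the sketch below processes two independent four-sample groups per iteration. It is portable C, not the VSX code from the patch; it assumes the operation is a per-sample left shift by `wasted` bits, and `shl32` and `wasted32_unroll2` are hypothetical names. Whether this form is actually faster than a non-unrolled loop can only be settled by measuring on the target hardware, which is the point of the reply above.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift performed on the unsigned value, so negative samples
 * do not trigger undefined behaviour (hypothetical helper). */
static inline int32_t shl32(int32_t v, int wasted)
{
    return (int32_t)((uint32_t)v << wasted);
}

/* Unrolled by two groups of four samples (hypothetical name).
 * The two groups do not depend on each other, so a CPU with spare
 * load/store and shift capacity may overlap them. */
static void wasted32_unroll2(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    assert(wasted >= 0 && wasted < 32);

    for (; i + 8 <= len; i += 8) {
        /* Group A: samples i .. i+3 */
        decoded[i + 0] = shl32(decoded[i + 0], wasted);
        decoded[i + 1] = shl32(decoded[i + 1], wasted);
        decoded[i + 2] = shl32(decoded[i + 2], wasted);
        decoded[i + 3] = shl32(decoded[i + 3], wasted);
        /* Group B: samples i+4 .. i+7 */
        decoded[i + 4] = shl32(decoded[i + 4], wasted);
        decoded[i + 5] = shl32(decoded[i + 5], wasted);
        decoded[i + 6] = shl32(decoded[i + 6], wasted);
        decoded[i + 7] = shl32(decoded[i + 7], wasted);
    }

    /* Remaining 0 to 7 samples are handled one at a time. */
    for (; i < len; i++)
        decoded[i] = shl32(decoded[i], wasted);
}
```

Unrolling further uses more registers and code space for each step, so the useful factor depends on the machine rather than on how many vector registers happen to be free.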

Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-04 Thread Sean McGovern
Hi Rémi,
First of all, thanks for the review.
On Thu, Jul 4, 2024, 07:15 Rémi Denis-Courmont wrote:
> On 4 July 2024 04:23:30 GMT+03:00, Sean McGovern wrote:
> >RaptorCS POWER9 (8c4t) @ 2.2GHz:
> >flac_wasted_32_c: 50.1
> >flac_wasted_32_vsx: 17.3
> >---
> > libavcodec/flacdsp.c

Re: [FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-04 Thread Rémi Denis-Courmont
On 4 July 2024 04:23:30 GMT+03:00, Sean McGovern wrote:
>RaptorCS POWER9 (8c4t) @ 2.2GHz:
>flac_wasted_32_c: 50.1
>flac_wasted_32_vsx: 17.3
>---
> libavcodec/flacdsp.c | 2 ++
> libavcodec/flacdsp.h | 1 +
> libavcodec/ppc/Makefile | 2 ++
> libavcodec/ppc/flacdsp_

[FFmpeg-devel] [PATCH 2/2] lavc/flacdsp: implement wasted32 DSP function for VSX on POWER

2024-07-03 Thread Sean McGovern
RaptorCS POWER9 (8c4t) @ 2.2GHz:
flac_wasted_32_c: 50.1
flac_wasted_32_vsx: 17.3
---
 libavcodec/flacdsp.c | 2 ++
 libavcodec/flacdsp.h | 1 +
 libavcodec/ppc/Makefile | 2 ++
 libavcodec/ppc/flacdsp_init.c | 38
 libavcodec/ppc/flacdsp_vsx.c |