Hi,
On Sat, Jul 6, 2024, 16:18 Rémi Denis-Courmont wrote:
> On Saturday, 6 July 2024, 23:00:47 EEST, Sean McGovern wrote:
> > Does wasted32 (and I guess wasted33 by proxy) not have to worry about loop
> > tails? I noticed the other vectorized versions don't do anything special in
On Saturday, 6 July 2024, 23:00:47 EEST, Sean McGovern wrote:
> Does wasted32 (and I guess wasted33 by proxy) not have to worry about loop
> tails? I noticed the other vectorized versions don't do anything special in
> that regard.
Frankly, RISC-V vectors (like Arm SVE's) are scalable s
Hi,
On Thu, Jul 4, 2024, 13:54 Rémi Denis-Courmont wrote:
> On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> > Is that correlated with the comment above re: len? Or is it more general
> > that I should unroll until I've exhausted the available vector registers?
>
> You s
On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> Is that correlated with the comment above re: len? Or is it more general
> that I should unroll until I've exhausted the available vector registers?
You should unroll if it improves bandwidth.
--
レミ・デニ-クールモン
http://www.reml
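
The unrolling and loop-tail questions in this thread can be illustrated with
a plain C sketch. This is not the VSX code from the patch: the function names
wasted32_ref and wasted32_unrolled, their signatures, and the unroll factor of
4 are assumptions made for illustration, loosely modelled on the
shift-left-by-wasted-bits operation under discussion. The main loop handles
four samples per iteration and a scalar tail picks up the remaining len % 4.

```c
#include <stdint.h>

/* Scalar reference: shift every sample left by `wasted` bits.
 * The cast to uint32_t avoids undefined behaviour when shifting
 * negative values. */
void wasted32_ref(int32_t *decoded, int wasted, int len)
{
    for (int i = 0; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}

/* Hypothetical 4x-unrolled variant with an explicit scalar tail. */
void wasted32_unrolled(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    /* Main loop: four independent shifts per iteration. */
    for (; i + 4 <= len; i += 4) {
        decoded[i + 0] = (int32_t)((uint32_t)decoded[i + 0] << wasted);
        decoded[i + 1] = (int32_t)((uint32_t)decoded[i + 1] << wasted);
        decoded[i + 2] = (int32_t)((uint32_t)decoded[i + 2] << wasted);
        decoded[i + 3] = (int32_t)((uint32_t)decoded[i + 3] << wasted);
    }

    /* Tail: the remaining len % 4 samples, one at a time. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

Whether the tail loop can be dropped depends on what the caller guarantees
about len and buffer padding, which this sketch does not assume. Whether the
unrolled form is actually faster has to be measured on the target, in line
with the advice to unroll only if it improves bandwidth.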
Hi Rémi,
First of all, thanks for the review.
On Thu, Jul 4, 2024, 07:15 Rémi Denis-Courmont wrote:
>
>
> On 4 July 2024 04:23:30 GMT+03:00, Sean McGovern
> wrote:
> >RaptorCS POWER9 (8c4t) @ 2.2GHz:
> >flac_wasted_32_c: 50.1
> >flac_wasted_32_vsx: 17.3
> >---
> > libavcodec/flacdsp.c
On 4 July 2024 04:23:30 GMT+03:00, Sean McGovern
wrote:
>RaptorCS POWER9 (8c4t) @ 2.2GHz:
>flac_wasted_32_c: 50.1
>flac_wasted_32_vsx: 17.3
>---
> libavcodec/flacdsp.c | 2 ++
> libavcodec/flacdsp.h | 1 +
> libavcodec/ppc/Makefile | 2 ++
> libavcodec/ppc/flacdsp_
RaptorCS POWER9 (8c4t) @ 2.2GHz:
flac_wasted_32_c: 50.1
flac_wasted_32_vsx: 17.3
---
libavcodec/flacdsp.c | 2 ++
libavcodec/flacdsp.h | 1 +
libavcodec/ppc/Makefile | 2 ++
libavcodec/ppc/flacdsp_init.c | 38
libavcodec/ppc/flacdsp_vsx.c |