Re: [FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

2023-12-22 Thread Rémi Denis-Courmont
On Friday, 22 December 2023 at 3:34:39 EET, flow gg wrote:
> func ff_decorrelate_sm_rvv, zve32x
> 1:
>         vsetvli t0, a2, e32, m8, ta, ma
>         vle32.v v8, (a1)
>         sub     a2, a2, t0
>         vle32.v v0, (a0)
>         vssra.vi v8, v8, 1
>         vsub.vv v16, v0, v8
>

Re: [FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

2023-12-21 Thread flow gg
func ff_decorrelate_sm_rvv, zve32x
1:
        vsetvli t0, a2, e32, m8, ta, ma
        vle32.v v8, (a1)
        sub     a2, a2, t0
        vle32.v v0, (a0)
        vssra.vi v8, v8, 1
        vsub.vv v16, v0, v8
        vse32.v v16, (a0)
        sh2add  a0, t0, a0
        vadd.vv v16, v0, v8

Re: [FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

2023-12-21 Thread Rémi Denis-Courmont
On Thursday, 21 December 2023 at 18:07:55 EET, Rémi Denis-Courmont wrote:
> You can use VSSRA, and then VADD won't need to depend on the output of VSUB.

P.S.: I have NOT checked which approach is actually faster.

--
Rémi Denis-Courmont
http://www.remlab.net/
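One way to read the VSSRA suggestion, sketched in scalar C. This is an illustration and not code from the patch: the function names `decorrelate_sm_ref` and `decorrelate_sm_indep` are made up here, and the reference loop is assumed to compute `a -= b >> 1; p1 = a; p2 = a + b`. The sketch also assumes `>>` on a negative `int32_t` is an arithmetic shift. Since `b - (b >> 1)` equals `(b >> 1) + (b & 1)`, which is the value a rounding shift by one produces in round-to-nearest-up mode, the second output can be computed from the original `a` and `b` without waiting for the subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed scalar reference: the second output depends on the first. */
static void decorrelate_sm_ref(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i];
        int32_t b = p2[i];
        a -= b >> 1;          /* arithmetic shift assumed */
        p1[i] = a;
        p2[i] = a + b;        /* needs the updated a */
    }
}

/* Hypothetical variant: both outputs come from the original a and b. */
static void decorrelate_sm_indep(int32_t *p1, int32_t *p2, int length)
{
    for (int i = 0; i < length; i++) {
        int32_t a = p1[i];
        int32_t b = p2[i];
        int32_t half_down = b >> 1;              /* truncating shift */
        int32_t half_up   = half_down + (b & 1); /* rounding shift, round-to-nearest-up */
        p1[i] = a - half_down;
        p2[i] = a + half_up;  /* no dependency on the subtraction */
    }
}

/* Runs both variants on copies of the same input and checks they agree. */
static void check_equivalence(const int32_t *in1, const int32_t *in2, int length)
{
    int32_t r1[16], r2[16], s1[16], s2[16];
    assert(length <= 16);
    for (int i = 0; i < length; i++) {
        r1[i] = s1[i] = in1[i];
        r2[i] = s2[i] = in2[i];
    }
    decorrelate_sm_ref(r1, r2, length);
    decorrelate_sm_indep(s1, s2, length);
    for (int i = 0; i < length; i++) {
        assert(r1[i] == s1[i]);
        assert(r2[i] == s2[i]);
    }
}
```

Calling `check_equivalence` on a few arrays with mixed signs and odd values exercises the rounding case. Whether the independent form is actually faster on a given core is a separate question, as the P.S. above notes.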

Re: [FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

2023-12-21 Thread Rémi Denis-Courmont
On Monday, 18 December 2023 at 17:16:27 EET, flow gg wrote:
> C908:
> decorrelate_sm_c: 130.0
> decorrelate_sm_rvv_i32: 43.7

+
+func ff_decorrelate_sm_rvv, zve32x
+1:
+        vsetvli t0, a2, e32, m8, ta, ma
+        vle32.v v0, (a0)
+        sub     a2, a2, t0
+        vle32.v v8, (a1)
+

[FFmpeg-devel] [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

2023-12-18 Thread flow gg
C908:
decorrelate_sm_c: 130.0
decorrelate_sm_rvv_i32: 43.7

From 3dc613feaa6c38a7df47a3fc385e2140716e0ae2 Mon Sep 17 00:00:00 2001
From: sunyuechi
Date: Mon, 18 Dec 2023 22:53:39 +0800
Subject: [PATCH 6/6] lavc/takdsp: R-V V decorrelate_sm

C908:
decorrelate_sm_c: 130.0
decorrelate_sm_rvv_i32: 43.7