On Thu, Dec 19, 2024 at 12:15 AM Craig Blackmore
<[email protected]> wrote:
>
> Changes since v7:
> - Fixed typo `bits` -> `bytes`
> - Tuned threshold for applying the optimization
> - Provided results for larger sizes requested by Max Chou
>
> This patch provides up to 60% speedup on the `memcpy` benchmark from:
>
>   https://github.com/embecosm/rise-rvv-tcg-qemu-tooling/tree/main/strmem-benchmarks
>
> There is some variation in the measurements, so results are attached for six
> runs on a single thread on an Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz.
>
> The three graphs are:
>
>   memcpy-594c0cb1ab-128-speedup.pdf: VLEN 128
>
>   memcpy-594c0cb1ab-1024-speedup.pdf: VLEN 1024
>
>   memcpy-594c0cb1ab-stdlib-speedup.pdf: Scalar (to further illustrate
>   measurement variation, as this version will not touch the function
>   modified by this patch)
>
> Previous versions:
> - v1: https://lore.kernel.org/all/[email protected]/
> - v2: https://lore.kernel.org/all/[email protected]/
> - v3: https://lore.kernel.org/all/[email protected]/
> - v4: https://lore.kernel.org/all/[email protected]/
> - v5: https://lore.kernel.org/all/[email protected]/
> - v6: https://lore.kernel.org/all/[email protected]/
> - v7: https://lore.kernel.org/all/[email protected]/
>
> Cc: Richard Henderson <[email protected]>
> Cc: Palmer Dabbelt <[email protected]>
> Cc: Alistair Francis <[email protected]>
> Cc: Bin Meng <[email protected]>
> Cc: Weiwei Li <[email protected]>
> Cc: Daniel Henrique Barboza <[email protected]>
> Cc: Liu Zhiwei <[email protected]>
> Cc: Helene Chelin <[email protected]>
> Cc: Nathan Egge <[email protected]>
> Cc: Max Chou <[email protected]>
> Cc: Paolo Savini <[email protected]>
>
> Craig Blackmore (2):
>   target/riscv: rvv: fix typo in vext continuous ldst function names
>   target/riscv: rvv: speed up small unit-stride loads and stores

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  target/riscv/vector_helper.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> --
> 2.43.0
>
>
