Thanks for the review, Richard.
On 10/30/24 11:40, Richard Henderson wrote:
On 10/29/24 19:43, Paolo Savini wrote:
This patch optimizes the emulation of unit-stride load/store RVV
instructions
when the data being loaded/stored per iteration amounts to 16 bytes
or more.
The optimization
Henrique Barboza
Cc: Liu Zhiwei
Cc: Helene Chelin
Cc: Nathan Egge
Cc: Max Chou
Helene CHELIN (1):
target/riscv: rvv: reduce the overhead for simple RISC-V vector
unit-stride loads and stores
Paolo Savini (1):
target/riscv: rvv: improve performance of RISC-V vector loads and
stores on
and the
destination memory address and vice versa.
This is done only if we have direct access to the RAM of the host machine,
if the host is little endian and if it supports atomic 128-bit memory
operations.
Signed-off-by: Paolo Savini
---
target/riscv/vector_helper.c | 17
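
The three conditions named in the commit message above can be illustrated with a small standalone sketch. This is not the code from the patch: try_copy_16, host_is_little_endian and the have_atomic128 flag are hypothetical names, a NULL host pointer stands in for "no direct access to host RAM", and a plain memcpy stands in for the 128-bit access, so the sketch makes no atomicity guarantee.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: detect host byte order at run time. */
static bool host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/*
 * Hypothetical fast path: move 16 bytes in one step, but only when
 *   - host_src is a direct host pointer (NULL means "no direct access"),
 *   - the host is little endian,
 *   - the caller reports 128-bit atomic support (assumed flag).
 * Returns true if the fast path was taken; otherwise dst is untouched
 * and the caller is expected to fall back to a slower path.
 */
static bool try_copy_16(void *dst, const void *host_src, bool have_atomic128)
{
    if (host_src == NULL || !host_is_little_endian() || !have_atomic128) {
        return false;
    }
    /* Stand-in for a real 16-byte atomic access; memcpy is NOT atomic. */
    memcpy(dst, host_src, 16);
    return true;
}
```

A caller would try this path first and keep the existing element-by-element path as the fallback whenever it returns false.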
sters (LMUL=1).
The optimization consists of avoiding the overhead of probing the RAM of the
host machine and doing a loop load/store on the input data grouped in chunks
of as many bytes as possible (8,4,2,1 bytes).
Co-authored-by: Helene CHELIN
Co-authored-by: Paolo Savini
Signed-off-by: Helene C
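
As a rough illustration of the "chunks of as many bytes as possible (8,4,2,1 bytes)" idea described above, here is a minimal standalone sketch. It is not the code from the patch: copy_in_chunks is a hypothetical name, and memcpy into fixed-width temporaries is used only to keep the example free of alignment assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not QEMU code): copy len bytes from src to dst,
 * always using the widest chunk (8, then 4, 2, 1 bytes) that still fits.
 */
static void copy_in_chunks(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t off = 0;

    /* As many 8-byte chunks as fit. */
    while (len - off >= 8) {
        uint64_t v;
        memcpy(&v, src + off, sizeof(v));
        memcpy(dst + off, &v, sizeof(v));
        off += sizeof(v);
    }
    /* The remainder is below 8 bytes: at most one 4-, 2- and 1-byte chunk. */
    if (len - off >= 4) {
        uint32_t v;
        memcpy(&v, src + off, sizeof(v));
        memcpy(dst + off, &v, sizeof(v));
        off += sizeof(v);
    }
    if (len - off >= 2) {
        uint16_t v;
        memcpy(&v, src + off, sizeof(v));
        memcpy(dst + off, &v, sizeof(v));
        off += sizeof(v);
    }
    if (len - off >= 1) {
        dst[off] = src[off];
    }
}
```

For a 15-byte request this performs one 8-byte, one 4-byte, one 2-byte and one 1-byte copy, instead of fifteen single-byte accesses.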
f the
vector registers (LMUL=1).
The optimization consists of avoiding the overhead of probing the RAM of the
host machine and doing a loop load/store on the input data grouped in chunks
of as many bytes as possible (8,4,2,1 bytes).
Co-authored-by: Helene CHELIN
Co-authored-by: Paolo Savini
S
register and the
destination memory address and vice versa.
This is done only if we have direct access to the RAM of the host machine,
if the host is little endian and if it supports atomic 128-bit memory
operations.
Signed-off-by: Paolo Savini
---
target/riscv/vector_helper.c | 14 +-
1
Cc: Palmer Dabbelt
Cc: Alistair Francis
Cc: Bin Meng
Cc: Weiwei Li
Cc: Daniel Henrique Barboza
Cc: Liu Zhiwei
Cc: Helene Chelin
Cc: Nathan Egge
Cc: Max Chou
Helene CHELIN (1):
target/riscv: rvv: reduce the overhead for simple RISC-V vector
unit-stride loads and stores
Paolo Savini
The simplified emulation of vector loads and stores that bypasses the memory
probing in the vext_ldst_us helper function seems to benefit only user mode.
We therefore limit this approach to the user mode configuration.
Signed-off-by: Paolo Savini
---
target/riscv/vector_helper.c | 3 ++-
1
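
One way such a restriction could be expressed is a build-time switch. The sketch below is only an assumption about the shape of the change, not the actual diff: use_simplified_loop is a hypothetical name, and CONFIG_USER_ONLY is the macro QEMU defines for user-mode builds.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: report whether the simplified load/store loop
 * (the one that bypasses memory probing) is enabled in this build.
 * Only user-mode builds, which define CONFIG_USER_ONLY, enable it.
 */
static bool use_simplified_loop(void)
{
#ifdef CONFIG_USER_ONLY
    return true;
#else
    return false;
#endif
}
```

Because the decision is made at compile time, a system-mode build carries no extra run-time check for the disabled path.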
f the
vector registers (LMUL=1).
The optimization consists of avoiding the overhead of probing the RAM of the
host machine and doing a loop load/store on the input data grouped in chunks
of as many bytes as possible (8,4,2,1 bytes).
Co-authored-by: Helene CHELIN
Co-authored-by: Paolo Savini
S
Meng
Cc: Weiwei Li
Cc: Daniel Henrique Barboza
Cc: Liu Zhiwei
Cc: Helene Chelin
Cc: Nathan Egge
Cc: Max Chou
Helene CHELIN (1):
target/riscv: rvv: reduce the overhead for simple RISC-V vector
unit-stride loads and stores
Paolo Savini (1):
target/riscv: use a simplified loop to emulate
The simplified emulation of vector loads and stores that bypasses the memory
probing in the vext_ldst_us helper function seems to benefit only user mode.
We therefore limit this approach to the user mode configuration.
Signed-off-by: Paolo Savini
---
target/riscv/vector_helper.c | 3 ++-
1
load/store loop for small
vector and data sizes when QEMU is in system mode.
Cc: Richard Henderson
Cc: Palmer Dabbelt
Cc: Alistair Francis
Cc: Bin Meng
Cc: Weiwei Li
Cc: Daniel Henrique Barboza
Cc: Liu Zhiwei
Cc: Helene Chelin
Cc: Nathan Egge
Cc: Max Chou
Paolo Savini (1):
target/riscv
Thanks for the feedback, Richard. I'm working on the endianness. Could
you please give me more details about the atomicity issues you are
referring to?
Best wishes
Paolo
On 7/27/24 08:15, Richard Henderson wrote:
On 7/18/24 01:30, Paolo Savini wrote:
This patch optimizes the emulati
f the
vector registers (LMUL=1).
The optimization consists of avoiding the overhead of probing the RAM of the
host machine and doing a loop load/store on the input data grouped in chunks
of as many bytes as possible (8,4,2,1 bytes).
Co-authored-by: Helene CHELIN
Co-authored-by: Paolo Savini
S
erhead for simple RISC-V vector
unit-stride loads and stores
Paolo Savini (1):
target/riscv: rvv: improve performance of RISC-V vector loads and
stores on large amounts of data.
target/riscv/vector_helper.c | 63 +++-
1 file changed, 62 insertions(+),
register and
the destination memory address and vice versa.
This is done only if we have direct access to the RAM of the host machine.
Signed-off-by: Paolo Savini
---
target/riscv/vector_helper.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/target/riscv