This patch aims to speed up the emulation of whole register loads/stores by generating TCG operations rather than calling a helper function.

The proposed implementation uses atomic 16-byte loads and stores when possible, and it updates the value of vstart in order to keep the state of the CPU consistent. Other vector operations that use TCG ops generation don't seem to take this precaution, so it might be redundant. Also, the atomicity requirements of QEMU loads and stores are dropped when running in serial mode (!CF_PARALLEL).

In light of this, I wonder whether exceptions are a concern in the context of TCG ops generation, especially when it comes to keeping the state of the CPU (vstart) consistent. Any feedback is welcome.

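To make the vstart question concrete, here is a minimal standalone C model of the idea. It is only a sketch and not QEMU code or the patch itself: ModelCPU, model_whole_reg_load and the fault_at parameter are hypothetical names invented for illustration, and a plain memcpy stands in for the 16-byte access. It assumes vstart is only ever advanced at 16-byte chunk boundaries, so that a trap in the middle of the instruction leaves vstart pointing at the first element that has not been written yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not QEMU code: all names here are illustrative. */
enum { CHUNK = 16 };        /* bytes moved per access */

typedef struct {
    uint8_t  vreg[64];      /* destination register group */
    uint32_t vstart;        /* first element not yet written */
} ModelCPU;

/*
 * Copy len bytes from mem into the register group in 16-byte chunks,
 * resuming from vstart. fault_at is the byte offset at which the model
 * pretends the memory access traps (pass len or more for "no fault").
 * Returns true if the instruction completed, false if it "trapped".
 */
static bool model_whole_reg_load(ModelCPU *cpu, const uint8_t *mem,
                                 size_t len, size_t elt_size,
                                 size_t fault_at)
{
    assert(len % CHUNK == 0 && len <= sizeof(cpu->vreg));
    assert(elt_size != 0 && CHUNK % elt_size == 0);

    size_t off = (size_t)cpu->vstart * elt_size;
    /* Assumption of this model: vstart always sits on a chunk boundary. */
    assert(off % CHUNK == 0);

    for (; off < len; off += CHUNK) {
        if (fault_at >= off && fault_at < off + CHUNK) {
            /* Trap: vstart already names the first element of this chunk. */
            return false;
        }
        memcpy(cpu->vreg + off, mem + off, CHUNK);
        /* Publish progress only after the whole chunk has been written. */
        cpu->vstart = (uint32_t)((off + CHUNK) / elt_size);
    }

    cpu->vstart = 0;        /* vstart is reset once the instruction completes */
    return true;
}
```

In this model, re-executing the instruction after a trap resumes from vstart and skips the chunks that were already copied. Whether the generated TCG ops need the same per-chunk update is exactly the open question above.
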
The proposed implementation is meant to replace the corresponding helper function, which will be removed in the final version of the patch unless there are corner cases where it is still needed.

Cc: Richard Henderson <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Alistair Francis <[email protected]>
Cc: Bin Meng <[email protected]>
Cc: Weiwei Li <[email protected]>
Cc: Daniel Henrique Barboza <[email protected]>
Cc: Liu Zhiwei <[email protected]>
Cc: Helene Chelin <[email protected]>
Cc: Nathan Egge <[email protected]>
Cc: Max Chou <[email protected]>
Cc: Jeremy Bennett <[email protected]>
Cc: Craig Blackmore <[email protected]>

Paolo Savini (1):
  target/riscv: use tcg ops generation to emulate whole reg rvv loads/stores.

 target/riscv/insn_trans/trans_rvv.c.inc | 104 +++++++++++++-----------
 1 file changed, 56 insertions(+), 48 deletions(-)

--
2.34.1
