Hi Rémi,
Thanks for your reply.
> It was faster on what was the best approximation of real hardware available at
> the time, i.e. a Sipeed Lichee Pi4A board. There are no benchmarks in the
> commit because I don't like to publish benchmarks collected from prototypes.
> Nevertheless I think the
Hi,
Commit 446b0090cbb66ee614dcf6ca79c78dc8eb7f0e37 by Remi Denis-Courmont
replaced negative-stride RISC-V vector loads and stores with vrgather
(generalized permutation within vector registers) instructions in order to
reverse the elements in a vector register. The commit message