https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109
--- Comment #3 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Robin Dapp from comment #2)
> It is vectorized with a higher zvl, e.g. zvl512b, refer
> https://godbolt.org/z/vbfjYn5Kd.

OK, I see. But Clang generates many slide instructions, which are expensive on real hardware, and vluxei64 is expensive as well.

I am not sure which is better. This should be benchmarked on real RISC-V hardware rather than judged only by dynamic instruction counts from SPIKE/QEMU.