https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751
--- Comment #5 from JuzheZhong <juzhe.zhong at rivai dot ai> --- (In reply to Andrew Pinski from comment #4) > The issue for aarch64 with SVE is that MASK_LOAD is not optimized: > > ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > vect__1.7_9 = .MASK_LOAD (&ib, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > ... }); > vect__2.10_35 = .MASK_LOAD (&ic, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > ... }); I don't ARM SVE has issues ... If we can choose fixed length vector mode to vectorize it, it will be well optimized. I think this is RISC-V target dependent issue.