https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122558
--- Comment #2 from Zhongyao Chen <chenzhongyao.hit at gmail dot com> ---
The dump log shows RVVM1QI and RVVMF2QI receiving identical costs, so the cost model chooses the RVVM1QI rvv_mode and we end up using RVVM2HI. I suspect this is the core issue: when the loop is fully unrolled, larger LMULs no longer gain any iteration-count advantage, so the stmt_cost should be adjusted for LMUL.
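
To illustrate the tie, here is a toy model of the reasoning above. It is not GCC's cost model: the function names, the VLEN of 128 bits, the 8-element trip count, and the "multiply stmt_cost by LMUL" weighting are all assumptions chosen only to show how the two modes can cost the same once the iteration count stops mattering.

```python
import math

VLEN_BITS = 128  # assumed vector register width, not taken from the report


def lanes(lmul, elem_bits=8, vlen=VLEN_BITS):
    """Number of QI (8-bit) elements one vector group holds at this LMUL."""
    return int(vlen * lmul) // elem_bits


def iterations(n_elems, lmul):
    """Vector iterations needed to cover n_elems elements."""
    return math.ceil(n_elems / lanes(lmul))


def loop_cost(n_elems, lmul, stmt_cost=1.0, scale_by_lmul=False):
    """Toy cost: iterations times per-iteration statement cost.

    scale_by_lmul=True applies a hypothetical LMUL weighting to stmt_cost.
    """
    per_iter = stmt_cost * (lmul if scale_by_lmul else 1.0)
    return iterations(n_elems, lmul) * per_iter


MF2, M1 = 0.5, 1.0
n = 8  # hypothetical short, fully unrolled loop

# Both modes need a single iteration, so the unscaled costs tie.
print(iterations(n, MF2), iterations(n, M1))
print(loop_cost(n, MF2), loop_cost(n, M1))

# With an LMUL-aware stmt_cost the smaller mode is no longer tied.
print(loop_cost(n, MF2, scale_by_lmul=True), loop_cost(n, M1, scale_by_lmul=True))
```

In this sketch the larger LMUL only pays off when it reduces the iteration count, which happens for long loops but not for a loop that already fits in one iteration at the smaller LMUL.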
