https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111048
--- Comment #8 from prathamesh3492 at gcc dot gnu.org ---
(In reply to rsand...@gcc.gnu.org from comment #7)
> = ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0)
>   : VECTOR_CST_NPATTERNS (arg1);
>
> should be q1 & 1 :)

Oops, sorry for the typo :/
And yes, that fixes the issue.

For more context, we have the following inputs to VEC_PERM_EXPR:

  arg0 (1, 1): { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
  arg1 (4, 1): { 255, 63, 15, 3, 255, 63, 15, 3, 255, 63, 15, 3, 255, 63, 15, 3 }
  sel (2, 3): { 0, 16, 1, 17, 2, 18, ... }
  arg0 len: 16
  sel nelts: 16

In valid_mask_for_fold_vec_perm_cst_p, for the pattern {16, 17, 18, ...},
arg_npatterns is erroneously set to VECTOR_CST_NPATTERNS (arg0), and we have:

  step = 1, arg_npatterns = 1

Thus, step becomes a "multiple" of arg_npatterns and we (wrongly) return true
for this case.

So in the loop below in fold_vec_perm_cst, we have res with the following
encoding:

  res (4, 3): { 0, 255, 0, 63, 0, 15, 0, 3, 0, 255, 0, 63, ... }

Since len = 16, it has to compute the remaining elements. For index 13, it
comes as "a3" in the pattern { 255, 15, 255, ... }
So the step gets computed as: 255 - 15 = 240
And IIUC the next element thus becomes: (255 + 240) % 256 = 239.

By correctly setting arg_npatterns = VECTOR_CST_NPATTERNS (arg1) for this
case, arg_npatterns becomes 4. Since step == 1 is not a multiple of
arg_npatterns, we return false and use the fallback:

  res_npatterns = 16, res_nelts_per_pattern = 1

and the loop below correctly encodes the elements.

I will shortly send a patch after validating it.

Thanks,
Prathamesh
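For anyone following along, the arithmetic in comment #8 can be reproduced with a small stand-alone Python model. This is only a sketch and not GCC code: `decode` and `step_ok` are hypothetical helpers. It assumes the usual (npatterns, nelts_per_pattern) reading of the encoding, with 8-bit wrap-around for elements, and it models only the "step is a multiple of arg_npatterns" condition, not the other checks the real validity function performs.

```python
def decode(encoded, npatterns, nelts_per_pattern, length, modulus=256):
    """Expand a pattern-encoded vector to `length` elements (simplified model)."""
    out = []
    for i in range(length):
        pattern, k = i % npatterns, i // npatterns
        if k < nelts_per_pattern:
            # Element is stored explicitly in the encoding.
            out.append(encoded[k * npatterns + pattern])
            continue
        last = encoded[(nelts_per_pattern - 1) * npatterns + pattern]
        if nelts_per_pattern == 3:
            # Stepped pattern: extrapolate from the last two encoded elements.
            step = last - encoded[npatterns + pattern]
            last = (last + (k - 2) * step) % modulus
        out.append(last)
    return out


def step_ok(step, q1, npatterns0, npatterns1, buggy=False):
    """Model of the step check only; `buggy` uses `q1 & 0` instead of `q1 & 1`."""
    mask = 0 if buggy else 1
    arg_npatterns = npatterns0 if (q1 & mask) == 0 else npatterns1
    return step % arg_npatterns == 0


LEN = 16
arg0 = [0] * LEN
arg1 = [255, 63, 15, 3] * 4
sel = decode([0, 16, 1, 17, 2, 18], 2, 3, LEN)

# Element-by-element permute: the reference result.
expected = [(arg0 + arg1)[s] for s in sel]

# Pattern {16, 17, 18, ...}: a1 = 17, so q1 = 17 // 16 = 1, and step = 1.
q1 = 17 // LEN
accepted_buggy = step_ok(1, q1, npatterns0=1, npatterns1=4, buggy=True)
accepted_fixed = step_ok(1, q1, npatterns0=1, npatterns1=4)

# Buggy path: res is encoded as (4, 3) from its first 12 elements.
buggy_res = decode(expected[:12], 4, 3, LEN)
# Fallback path: res is encoded as (16, 1), i.e. every element is explicit.
fallback_res = decode(expected, 16, 1, LEN)

print(accepted_buggy, accepted_fixed)
print(buggy_res[12:], expected[12:])
```

In this model the (4, 3) encoding yields 239 at index 13 where the element-wise result is 15, which matches the value worked out in the comment, while the (16, 1) fallback reproduces the element-wise result exactly.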