https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121867
--- Comment #1 from Jeevitha <jeevitha at gcc dot gnu.org> ---

The modulo reduction for shift amounts in AltiVec's vec_sl is already
implemented in the GIMPLE folding pass for PowerPC, within the
rs6000_gimple_fold_builtin function. However, this folding is restricted by a
type check that excludes non-overflow-wrapping types:

    if (INTEGRAL_TYPE_P (TREE_TYPE (arg0_type))
        && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0_type)))
      return false;

This check prevents signed types for the first argument (arg0, the vector to
be shifted) from reaching the modulo reduction logic. For unsigned types, the
folding should apply the modulo reduction, as shown below:

    _1 = { 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35 };
    _2 = _1 % { 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8 };
    D.4059 = in << _2;
    return D.4059;

However, in our case, even though arg0 is a vector unsigned char (which
satisfies TYPE_OVERFLOW_WRAPS), the modulo reduction is not applied. This is
because the input vector 'in' is unexpectedly cast to vector signed char in
the GIMPLE representation, as shown below:

    {
      # DEBUG BEGIN STMT;
      return VIEW_CONVERT_EXPR<__vector unsigned char>(
        __builtin_altivec_vslb(
          VIEW_CONVERT_EXPR<__vector signed char>(in),
          {35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35}));
    }

The root cause lies in how the AltiVec built-ins are defined in
rs6000-builtins.def. The prototype for vslb is defined as:

    const vsc __builtin_altivec_vslb (vsc, vuc);
      VSLB vashlv16qi3 {}

Here, vsc (vector signed char) is used for the first argument, while vuc
(vector unsigned char) is used for the second (shift amount). Despite
overloads defined in rs6000-overload.def for unsigned cases:

    [VEC_SL, vec_sl, __builtin_vec_sl]
      vsc __builtin_vec_sl (vsc, vuc);
        VSLB  VSLB_VSC
      vuc __builtin_vec_sl (vuc, vuc);
        VSLB  VSLB_VUC

GCC prioritizes the rs6000-builtins.def definition, which casts the input
(in) to vector signed char when processing __builtin_vec_sl. As a result, the
unsigned overload (vuc) is ignored, the type check in
rs6000_gimple_fold_builtin fails, and the modulo optimization is not invoked.
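
For reference, the effect of the expected folded form (_2 = _1 % 8; in << _2) can be sketched as a scalar model in plain C. This is only an illustration and not GCC or AltiVec code: the names model_vslb, model_vslb_example and LANES are hypothetical, and the model assumes 16 unsigned 8-bit lanes, as in the vector unsigned char case above.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16  /* assumed: one 128-bit vector of 16 byte elements */

/* Scalar model of the folded form: each lane of `in` is shifted left by
   the matching lane of `amount`, reduced modulo the element width (8). */
static void model_vslb(const uint8_t in[LANES], const uint8_t amount[LANES],
                       uint8_t out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        unsigned reduced = amount[i] % 8u;      /* _2 = _1 % { 8, ... } */
        out[i] = (uint8_t)(in[i] << reduced);   /* D.4059 = in << _2    */
    }
}

/* Worked example: a shift amount of 35 behaves like a shift by 35 % 8. */
static void model_vslb_example(void)
{
    uint8_t in[LANES], by35[LANES], by3[LANES], r35[LANES], r3[LANES];

    for (int i = 0; i < LANES; i++) {
        in[i] = (uint8_t)(i * 17 + 1);  /* arbitrary test pattern */
        by35[i] = 35;
        by3[i] = 3;
    }
    model_vslb(in, by35, r35);
    model_vslb(in, by3, r3);
    for (int i = 0; i < LANES; i++)
        assert(r35[i] == r3[i]);
}
```

In this model, the all-35 shift vector from the test case gives the same result as an all-3 shift vector, which is the behaviour the fold is meant to make explicit in GIMPLE.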