https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110214
            Bug ID: 110214
           Summary: x86 backend lacks support for vec_pack_ssat_m and
                    vec_pack_usat_m
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

This is from PR108410:

> and the key thing to optimize is
>
> ivtmp_78 = ivtmp_77 + 4294967232; // -64
> _79 = MIN_EXPR <ivtmp_78, 255>;
> _80 = (unsigned char) _79;
> _81 = {_80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
> _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
> _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
> _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
> _80, _80, _80, _80, _80, _80};
>
> that is we want to broadcast a saturated (to vector element precision) value.

Yes, the backend needs to support vec_pack_ssat_m and vec_pack_usat_m.
But I didn't find an optab for ss_truncate or us_truncate, which might be used
by the BB vectorizer.

AVX512 supports vpmov{u,}sqd, vpmov{u,}sdw, and vpmov{u,}swb for demotion with
signed/unsigned saturation.
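
For reference, here is a minimal scalar sketch in C of the per-element semantics being discussed. It is not code from GCC or from PR108410. The function names are hypothetical, and it assumes the unsigned case takes an unsigned input, as in the MIN_EXPR <ivtmp_78, 255> sequence above.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned saturating truncation (us_truncate-style), 32 -> 8 bits:
   values above 255 clamp to 255.  Hypothetical helper name. */
static uint8_t us_truncate_u32_to_u8(uint32_t x)
{
    return (uint8_t)(x > UINT8_MAX ? UINT8_MAX : x);
}

/* Signed saturating truncation (ss_truncate-style), 32 -> 16 bits:
   values clamp to [-32768, 32767].  Hypothetical helper name. */
static int16_t ss_truncate_s32_to_s16(int32_t x)
{
    if (x > INT16_MAX)
        return INT16_MAX;
    if (x < INT16_MIN)
        return INT16_MIN;
    return (int16_t)x;
}

/* Broadcast one saturated value to n bytes, mirroring
   _79 = MIN_EXPR <...>; _80 = (unsigned char) _79; _81 = {_80, ...}.  */
static void broadcast_saturated_u8(uint8_t *dst, size_t n, uint32_t value)
{
    uint8_t sat = us_truncate_u32_to_u8(value);
    for (size_t i = 0; i < n; i++)
        dst[i] = sat;
}
```

A saturating-demotion instruction such as those listed above would apply this kind of clamp to every lane at once. The scalar form only illustrates what each lane is expected to compute.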