https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
--- Comment #21 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 16 Oct 2020, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
>
> --- Comment #20 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Richard Biener from comment #18)
> > (In reply to Hongtao.liu from comment #16)
> > > (In reply to Hongtao.liu from comment #15)
> > > > I'm working on add the expander, i encounter a problem.
> > > >
> > > > for V32HI vec_set with constant index, the expander existed under
> > > > TARGET_AVX512F, but for variable index, the expander should be existed
> > > > under TARGET_AVX512BW, since vpcmpw zmm only existed in
> > > > TARGET_AVX512BW, we need to restricted the expander under
> > > > TARGET_AVX512BW, unfortunately operands is unvisible in condition
> > > > scope, any solution to handle such issue?
> > >
> > > Or break V32HI into V16HI and V8HI when TARGET_AVX512BW is not existed.
> >
> > That sounds like a good implementation strategy, I suppose AVX512F implies
> > AVX2 even when AVX512VL is not available.
>
> Yes, but not sure performance impact, maybe better generate code "like"
> expander not existed.

I'm not so sure - the STLF penalty is quite large and the AVX2 code
should only require an extra vextract + vpcmp + vinsert (and the extra
vpcmp uses another constant pool entry).  The broadcasts and blend
only require AVX512F.
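
For readers following along, here is a minimal sketch of the broadcast + compare + blend idea for a variable-index vec_set. It is an illustration written with GCC's generic vector extensions on an 8-lane vector of shorts, not the i386 expander under discussion. The names `v8hi`, `vec_set_var` and `vec_set_var_selfcheck` are hypothetical, and the instructions a compiler actually emits for it depend on the target.

```c
#include <assert.h>

/* Eight 16-bit lanes, using the GCC/Clang generic vector extension.
   Hypothetical stand-in for the wider V16HI/V32HI modes in the thread. */
typedef short v8hi __attribute__((vector_size(16)));

/* Hypothetical helper: return VEC with lane IDX replaced by VAL,
   without going through memory (so no store-to-load forwarding stall). */
static v8hi vec_set_var(v8hi vec, short val, short idx)
{
    /* Constant lane numbers; this plays the role of the constant pool entry. */
    const v8hi lane_ids = {0, 1, 2, 3, 4, 5, 6, 7};

    /* Broadcast the index and the new value to every lane. */
    v8hi idx_bcast = {idx, idx, idx, idx, idx, idx, idx, idx};
    v8hi val_bcast = {val, val, val, val, val, val, val, val};

    /* Compare: all-ones in the lane whose number equals idx, zero elsewhere. */
    v8hi mask = (lane_ids == idx_bcast);

    /* Blend: take val where the mask is set, keep vec everywhere else. */
    return (val_bcast & mask) | (vec & ~mask);
}

/* Check the vector version against the obvious scalar definition. */
static void vec_set_var_selfcheck(void)
{
    for (short idx = 0; idx < 8; idx++) {
        v8hi vec = {10, 11, 12, 13, 14, 15, 16, 17};
        v8hi out = vec_set_var(vec, -1, idx);
        for (int lane = 0; lane < 8; lane++)
            assert(out[lane] == (lane == idx ? -1 : vec[lane]));
    }
}
```

If V32HI were split into halves when AVX512BW is unavailable, the compare step would presumably run once per half against its own lane-number constant, which matches the extra vextract + vpcmp + vinsert and the additional constant pool entry mentioned above.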