https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194

--- Comment #21 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 16 Oct 2020, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194
> 
> --- Comment #20 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Richard Biener from comment #18)
> > (In reply to Hongtao.liu from comment #16)
> > > (In reply to Hongtao.liu from comment #15)
> > > > I'm working on adding the expander and I've encountered a problem.
> > > > 
> > > > For V32HI vec_set with a constant index, the expander exists under
> > > > TARGET_AVX512F, but for a variable index the expander should exist
> > > > only under TARGET_AVX512BW, since vpcmpw zmm exists only in
> > > > TARGET_AVX512BW.  We need to restrict the expander to TARGET_AVX512BW,
> > > > but unfortunately the operands are not visible in the condition scope.
> > > > Is there any solution to handle such an issue?
> > > 
> > > Or break V32HI into V16HI and V8HI when TARGET_AVX512BW is not available.
> > 
> > That sounds like a good implementation strategy, I suppose AVX512F implies
> > AVX2 even when AVX512VL is not available.
> 
> Yes, but I'm not sure about the performance impact; it may be better to
> generate code as if the expander did not exist.

I'm not so sure - the store-to-load forwarding (STLF) penalty is quite
large and the AVX2 code should only require an extra vextract + vpcmp +
vinsert (and the extra vpcmp uses another constant pool entry).  The
broadcasts and blend only require AVX512F.
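
To make the two strategies concrete, here is a rough sketch using GNU C
generic vectors rather than the actual i386 expander.  The function names
vec_set_var and vec_set_var_split are hypothetical, and which instructions
the compiler picks for each step (vpbroadcast, vpcmp, vpblend, vextract,
vinsert) is an assumption that depends on the target flags.  The first
function is the full-width broadcast + compare + blend sequence; the second
does the compare in two V16HI halves, each against its own lane-number
constant, and keeps the broadcast of the value and the blend full-width.

```c
#include <assert.h>
#include <string.h>

/* Generic vector types (GNU C extension); lane 0 is at the lowest address. */
typedef short v32hi __attribute__((vector_size(64)));
typedef short v16hi __attribute__((vector_size(32)));

/* Full-width variant: broadcast index and value, compare the broadcast
   index against the lane numbers, then blend with the resulting mask.  */
void vec_set_var(v32hi *v, short val, int idx)
{
    assert(idx >= 0 && idx < 32);
    v32hi lanes = {0}, bidx = {0}, bval = {0};
    for (int i = 0; i < 32; i++) {
        lanes[i] = (short)i;      /* constant {0, 1, ..., 31} */
        bidx[i] = (short)idx;     /* broadcast of the index */
        bval[i] = val;            /* broadcast of the new value */
    }
    v32hi mask = (lanes == bidx); /* all-ones only in lane idx */
    *v = (bval & mask) | (*v & ~mask);
}

/* Split variant: the compare runs on two 16-lane halves, each with its
   own lane-number constant; the two half masks are then joined.  */
void vec_set_var_split(v32hi *v, short val, int idx)
{
    assert(idx >= 0 && idx < 32);
    v16hi lo_lanes = {0}, hi_lanes = {0}, bidx = {0};
    for (int i = 0; i < 16; i++) {
        lo_lanes[i] = (short)i;        /* {0, ..., 15} */
        hi_lanes[i] = (short)(i + 16); /* {16, ..., 31}: the second constant */
        bidx[i] = (short)idx;
    }
    v16hi mlo = (lo_lanes == bidx);
    v16hi mhi = (hi_lanes == bidx);

    /* Join the half masks into one 32-lane mask.  */
    v32hi mask;
    memcpy(&mask, &mlo, sizeof mlo);
    memcpy((char *)&mask + sizeof mlo, &mhi, sizeof mhi);

    v32hi bval = {0};
    for (int i = 0; i < 32; i++)
        bval[i] = val;
    *v = (bval & mask) | (*v & ~mask);
}
```

Neither variant stores the vector to the stack and reloads it around a
scalar store, which is the pattern that would run into the STLF penalty.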
