On Mon, Feb 11, 2019 at 04:56:45PM +0100, Uros Bizjak wrote:
> > Let's first define what MODE_XI means in standard_sse_constant_opcode
> > as well as in all these mov patterns, with and without AVX512VL.  Without
> > a clear definition, we can't get out of this mess.
> 
> INT_MODE (OI, 32);
> INT_MODE (XI, 64);
> 
> So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
> in case of const_1, all 512 bits set.
> 
> We can load zeros with narrower instruction, (e.g. 256 bit by inherent
> zeroing of highpart in case of 128 bit xor), so TImode in this case.
> 
> Some targets prefer V4SF mode, so they will emit float xorps for zeroing
> 
> Then the introduction of AVX512F fubared everything by overloading the
> meaning of insn mode.

I don't see many changes from AVX512F here; most of the behavior was
already there in AVX.
Most of the SSE/AVX/AVX512 instructions affect the whole register,
usually there is DEST[MAX_VL-1:VL] <- 0 at the end of each instruction.
But using MAX_VL to determine get_attr_mode doesn't seem really useful,
because that changes dynamically at runtime based on the actual hw, not on
what we've been compiled for.
So, I believe we want to use that VL value to determine the bitsize of the
mode corresponding to get_attr_mode.  And in that case, for
*movoi_internal_avx and *movti_internal, I believe the right mode is MODE_OI
resp. MODE_TI for AVX512VL, because e.g.
vmovdqa32 %ymm12, %ymm23
is a VL = 256 instruction, not VL = 512.  Similarly, if we want to set
%ymm25 to all ones, i.e. movoi_internal_avx, we use
vpternlogd      $0xFF, %ymm25, %ymm25, %ymm25
which is again a VL = 256 instruction, so it should use MODE_OI.
We'd need to use
vmovdqa32 %zmm12, %zmm23
or
vpternlogd      $0xFF, %zmm25, %zmm25, %zmm25
instructions for AVX512F without AVX512VL, but as has been discussed, this
won't really happen, because hard_regno_mode_ok refuses to allocate 256-bit
or 128-bit modes in ext sse registers.
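
To make the rule above concrete, here is a rough C sketch of what I mean.
It is only an illustration: the enum and the sketch_* functions are
hypothetical names, not the real get_attr_mode values or the real
hard_regno_mode_ok hook, and the register check ignores scalar modes and
everything else the real hook looks at.  It just shows the insn mode
following the VL of the emitted instruction, and the 128-bit/256-bit modes
being refused in xmm16-xmm31 unless AVX512VL is enabled.

```c
/* Sketch only: all names here are hypothetical, not GCC's real
   attribute values or target hooks.  */
#include <assert.h>
#include <stdbool.h>

enum sketch_insn_mode
{
  SKETCH_MODE_NONE,
  SKETCH_MODE_TI,	/* 128-bit integer operation.  */
  SKETCH_MODE_OI,	/* 256-bit integer operation.  */
  SKETCH_MODE_XI	/* 512-bit integer operation.  */
};

/* Map the vector length (VL, in bits) of the emitted instruction to the
   integer insn mode of the same size.  */
enum sketch_insn_mode
sketch_mode_for_vl (unsigned vl_bits)
{
  switch (vl_bits)
    {
    case 128: return SKETCH_MODE_TI;
    case 256: return SKETCH_MODE_OI;
    case 512: return SKETCH_MODE_XI;
    default:  return SKETCH_MODE_NONE;
    }
}

/* Simplified model of the restriction on the extended SSE registers
   (xmm16-xmm31): 512-bit modes need AVX512F, 128-bit and 256-bit modes
   additionally need AVX512VL.  Scalar modes are deliberately ignored.  */
bool
sketch_ext_sse_reg_ok (unsigned mode_bits, bool avx512f, bool avx512vl)
{
  if (!avx512f)
    return false;
  if (mode_bits == 512)
    return true;
  return avx512vl && (mode_bits == 128 || mode_bits == 256);
}

/* The examples from the discussion, written as checks.  */
void
sketch_check_examples (void)
{
  /* vmovdqa32 %ymm12, %ymm23 is VL = 256, so MODE_OI.  */
  assert (sketch_mode_for_vl (256) == SKETCH_MODE_OI);
  /* vpternlogd $0xFF, %zmm25, %zmm25, %zmm25 is VL = 512, so MODE_XI.  */
  assert (sketch_mode_for_vl (512) == SKETCH_MODE_XI);
  /* Without AVX512VL a 256-bit value never lands in %ymm16+ ...  */
  assert (!sketch_ext_sse_reg_ok (256, true, false));
  /* ... while with AVX512VL it can.  */
  assert (sketch_ext_sse_reg_ok (256, true, true));
}
```

Under this model the "AVX512F without AVX512VL" case never needs a zmm
instruction for an OImode or TImode move, because the allocation itself
is rejected first.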

        Jakub
