Andrew Stubbs <a...@codesourcery.com> writes:
> On 30/05/2023 07:26, Richard Biener wrote:
>> On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs <a...@codesourcery.com> wrote:
>>>
>>> Hi all,
>>>
>>> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just
>>> do it because the GCC middle-end models DIVMOD's return value as
>>> "complex int" type, and there are no vector equivalents of that type.
>>>
>>> Therefore, this patch adds minimal support for "complex vector int"
>>> modes.  I have not attempted to provide any means to use these modes
>>> from C, so they're really only useful for DIVMOD.  The actual libfunc
>>> implementation will pack the data into wider vector modes manually.
>>>
>>> A knock-on effect of this is that I needed to increase the range of
>>> "mode_unit_size" (several of the vector modes supported by amdgcn exceed
>>> the previous 255-byte limit).
>>>
>>> Since this change would add a large number of new, unused modes to many
>>> architectures, I have elected to *not* enable them, by default, in
>>> machmode.def (where the other complex modes are created).  The new modes
>>> are therefore inactive on all architectures but amdgcn, for now.
>>>
>>> OK for mainline?  (I've not done a full test yet, but I will.)
>> 
>> I think it makes more sense to map vector CSImode to vector SImode with
>> the double number of lanes.  In fact since divmod is a libgcc function
>> I wonder where your vector variant would reside and how GCC decides to
>> emit calls to it?  That is, there's no way to OMP simd declare this function?
>
> The divmod implementation lives in libgcc. It's not too difficult to 
> write using vector extensions and some asm tricks. I did try an OMP simd 
> declare implementation, but it didn't vectorize well, and that's a yak 
> I don't wish to shave right now.
>
> In any case, the OMP simd declare will not help us here, directly, 
> because the DIVMOD transformation happens too late in the pass pipeline, 
> long after ifcvt and vect. My implementation (not yet posted) uses a 
> libfunc and the TARGET_EXPAND_DIVMOD_LIBFUNC hook in the standard way. 
> It just needs the complex vector modes to exist.
>
> Using vectors twice the length is problematic also. If I create a new 
> V128SImode that spans across two 64-lane vector registers then that will 
> probably have the desired effect ("real" quotient in v8, "imaginary" 
> remainder in v9), but if I use V64SImode to represent two V32SImode 
> vectors then that's a one-register mode, and I'll have to use a 
> permutation (a memory operation) to extract lanes 32-63 into lanes 0-31, 
> and if we ever want to implement instructions that operate on these 
> modes (as opposed to the odd/even add/sub complex patterns we have now) 
> then the masking will be all broken and we'd need to constantly 
> disassemble the double length vectors to operate on them.

I don't know if this helps (probably not), but we have a similar
situation on AArch64: a 64-bit mode like V8QI can be doubled to a
128-bit vector or to a pair of 64-bit vectors.  We used V16QI for
the former and "V2x8QI" for the latter.  V2x8QI is forced to come
after V16QI in the mode list, and so it is only ever used through
explicit choice.  But both modes are functionally vectors of 16 QIs.
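
To make the distinction concrete, here is a rough sketch in plain C of the
two layouts for a lane-wise divmod result.  It only models the data layout
with arrays; it is not GCC internals code and not the amdgcn libfunc.  The
names (LANES, vec_pair, vec_wide, divmod_pair, divmod_wide,
wide_extract_rem) are all made up for illustration, and the lane count is
kept tiny.

```c
#include <assert.h>
#include <string.h>

#define LANES 4  /* hypothetical lane count, kept small for illustration */

/* "Pair of vectors" layout: two separate LANES-wide halves. */
struct vec_pair {
    int quot[LANES];
    int rem[LANES];
};

/* "Double-length vector" layout: one vector with 2 * LANES lanes. */
struct vec_wide {
    int lane[2 * LANES];
};

/* Lane-wise divmod, quotient and remainder kept as separate halves. */
static struct vec_pair divmod_pair(const int *a, const int *b)
{
    struct vec_pair r;
    for (int i = 0; i < LANES; i++) {
        assert(b[i] != 0);
        r.quot[i] = a[i] / b[i];
        r.rem[i] = a[i] % b[i];
    }
    return r;
}

/* Same values packed into one wide vector: quotients in the low lanes,
   remainders in the high lanes. */
static struct vec_wide divmod_wide(const int *a, const int *b)
{
    struct vec_wide r;
    for (int i = 0; i < LANES; i++) {
        assert(b[i] != 0);
        r.lane[i] = a[i] / b[i];
        r.lane[LANES + i] = a[i] % b[i];
    }
    return r;
}

/* Getting the remainder out of the wide layout means moving the upper
   lanes down to lanes 0..LANES-1. */
static void wide_extract_rem(const struct vec_wide *w, int *rem)
{
    memcpy(rem, &w->lane[LANES], sizeof(int) * LANES);
}
```

Both structs hold the same 2 * LANES elements, which is the sense in which
the two modes are functionally equivalent.  The difference is only in how
the halves are addressed: the pair layout names each half directly, while
the wide layout needs the extra lane shuffle in wide_extract_rem.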

Thanks,
Richard
