Hi H.J.,

Our LLVM implementation currently uses %xmm0 for both the real part and the 
imaginary part of _Complex _Float16. Is there a special reason to use two registers?
We use one register on x86-64. Considering performance, especially 
register pressure, would it be better to use one register for _Complex 
_Float16 on 32-bit targets as well?

Thanks
Pengfei

-----Original Message-----
From: H.J. Lu <hjl.to...@gmail.com> 
Sent: Tuesday, July 13, 2021 10:26 PM
To: Wang, Pengfei <pengfei.w...@intel.com>; llvm-...@lists.llvm.org
Cc: Joseph Myers <jos...@codesourcery.com>; GCC Patches 
<gcc-patches@gcc.gnu.org>; GNU C Library <libc-al...@sourceware.org>; IA32 
System V Application Binary Interface <ia32-...@googlegroups.com>
Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support

On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.w...@intel.com> wrote:
>
> > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
>
> Can you please explain the behavior here? Is there a difference between 
> _Float16 and _Complex _Float16 when returned? I.e.:
> 1. In which case will _Float16 values be returned in both %xmm0 and %xmm1?
> 2. For a single _Complex _Float16 value, are both the real part and the 
> imaginary part returned in %xmm0, or in %xmm0 and %xmm1 respectively?

Here is the v2 patch to add the missing _Float16 bits. The PDF file is at

https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI

> Thanks
> Pengfei
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-boun...@lists.llvm.org> On Behalf Of H.J. Lu 
> via llvm-dev
> Sent: Friday, July 2, 2021 6:28 AM
> To: Joseph Myers <jos...@codesourcery.com>
> Cc: llvm-...@lists.llvm.org; GCC Patches <gcc-patches@gcc.gnu.org>; 
> GNU C Library <libc-al...@sourceware.org>; IA32 System V Application 
> Binary Interface <ia32-...@googlegroups.com>
> Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
>
> On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <jos...@codesourcery.com> wrote:
> >
> > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> >
> > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >
> > That restricts use of _Float16 to processors with SSE.  Is that what 
> > we want in the ABI, or should _Float16 be available with base 32-bit
> > x86 architecture features only, much like _Float128 and the decimal 
> > FP types
>
> Yes, _Float16 requires XMM registers.
>
> > are?  (If it is restricted to SSE, we can of course ensure relevant 
> > libgcc functions are built with SSE enabled, and likewise in glibc 
> > if that gains
> > _Float16 functions, though maybe with some extra complications to 
> > get relevant testcases to run whenever possible.)
> >
>
> _Float16 functions in libgcc should be compiled with SSE enabled.
>
> BTW, _Float16 software emulation may require more than just SSE, since we need 
> to do _Float16 loads and stores with XMM registers.
> There is no 16-bit load/store for XMM registers without AVX512FP16.
>
> --
> H.J.
> _______________________________________________
> LLVM Developers mailing list
> llvm-...@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



--
H.J.
