> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: 09 July 2021 12:40
> To: Jonathan Wright <jonathan.wri...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Subject: Re: [PATCH] aarch64: Use unions for vector tables in vqtbl[234]
> intrinsics
> 
> Jonathan Wright <jonathan.wri...@arm.com> writes:
> > Hi,
> >
> > As subject, this patch uses a union instead of constructing a new opaque
> > vector structure for each of the vqtbl[234] Neon intrinsics in arm_neon.h.
> > This simplifies the header file and also improves code generation -
> > superfluous move instructions were emitted for every register
> > extraction/set in this additional structure.
> >
> > This change is safe because the C-level vector structure types e.g.
> > uint8x16x4_t already provide a tie for sequential register allocation
> > - which is required by the TBL instructions.
> >
> > Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> > issues.
> >
> > Ok for master?
> 
> Looks good, but I think we should have some tests to defend the
> RA improvements.  E.g. have things like:
> 
>   #include <arm_neon.h>
> 
>   …
> 
>   uint8x8_t
>   f2_u8 (uint8x16x2_t x, uint8x8_t y)
>   {
>     return vqtbl2_u8 (x, y);
>   }
> 
>   …
> 
> and add a scan-assembler-not for moves.
> 
> Union punning is UB for standard C++, but I think in practice we're
> not going to be able to treat it as such for GCC.  This would be
> far from the only thing to rely on union punning for correctness.

Could we use some reinterpret_cast or bit_cast (or the builtins the rely on) 
for C++?
This may involve separate definitions for C and C++, which may not be worth 
it... No objections to the patch from me to be clear.
Thanks,
Kyrill

> 
> Thanks,
> Richard

Reply via email to