> -----Original Message-----
> From: Prathamesh Kulkarni <prathamesh.kulka...@linaro.org>
> Sent: 24 August 2021 09:01
> To: gcc Patches <gcc-patches@gcc.gnu.org>; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>; Richard Earnshaw
> <richard.earns...@foss.arm.com>
> Subject: Re: [ARM] PR66791: Replace builtin in vld1_dup intrinsics
> 
> On Fri, 13 Aug 2021 at 16:40, Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org> wrote:
> >
> > On Thu, 5 Aug 2021 at 15:37, Prathamesh Kulkarni
> > <prathamesh.kulka...@linaro.org> wrote:
> > >
> > > On Thu, 29 Jul 2021 at 19:58, Prathamesh Kulkarni
> > > <prathamesh.kulka...@linaro.org> wrote:
> > > >
> > > > Hi,
> > > > The attached patch replaces builtins in vld1_dup intrinsics with call
> > > > to corresponding vdup_n intrinsic and removes entry for vld1_dup from
> > > > arm_neon_builtins.def.
> > > > Bootstrapped+tested on arm-linux-gnueabihf.
> > > > OK to commit ?
> > > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html
> > ping * 2 https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html
> ping * 3 https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html

Sorry for the slow response.
I don't think this approach improves anything. With the current setup we
guarantee that the load-and-dup instruction is generated even at low
optimisation levels, whereas with this change we'd be relying on the RTL
optimisers to merge the load and the dup into one instruction. What does
the change gain us?
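
To make the comparison concrete, here is a rough sketch of the two forms at
the user level. It is only an illustration, not the text of the patch: the
wrapper names (dup_via_vld1_dup, dup_via_vdup_n, dup_forms_agree) are made up
for this example, and the #else branch is a hypothetical stand-in so the file
also builds on a non-Arm host. Both forms produce the same vector; the
difference is only in who is responsible for picking the single load-and-dup
instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
/* Hypothetical stand-ins so the sketch compiles off-target.
   These are NOT GCC's definitions from arm_neon.h. */
typedef struct { int8_t lane[8]; } int8x8_t;

static inline int8x8_t vdup_n_s8(int8_t x)
{
    int8x8_t v;
    for (int i = 0; i < 8; i++)
        v.lane[i] = x;
    return v;
}

static inline int8x8_t vld1_dup_s8(const int8_t *p)
{
    return vdup_n_s8(*p);
}

static inline void vst1_s8(int8_t *out, int8x8_t v)
{
    memcpy(out, v.lane, sizeof v.lane);
}
#endif

/* Form A: the dedicated load-and-duplicate intrinsic (current setup,
   backed by a builtin in GCC's arm_neon.h). */
int8x8_t dup_via_vld1_dup(const int8_t *p)
{
    return vld1_dup_s8(p);
}

/* Form B: a scalar load followed by vdup_n, which is what the proposed
   change would expand vld1_dup to. Fusing the two into one instruction
   is then left to the optimisers. */
int8x8_t dup_via_vdup_n(const int8_t *p)
{
    return vdup_n_s8(*p);
}

/* Returns 1 if both forms yield identical lanes for *p. */
int dup_forms_agree(const int8_t *p)
{
    int8_t a[8], b[8];

    assert(p != NULL);
    vst1_s8(a, dup_via_vld1_dup(p));
    vst1_s8(b, dup_via_vdup_n(p));
    return memcmp(a, b, sizeof a) == 0;
}
```

Comparing the assembly for the two wrappers at -O0 and -O2 on
arm-linux-gnueabihf would be the way to check whether Form B still ends up
as a single load-and-dup; I have not included output here.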

Thanks,
Kyrill

> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh
