> -----Original Message-----
> From: Prathamesh Kulkarni <prathamesh.kulka...@linaro.org>
> Sent: 24 August 2021 09:01
> To: gcc Patches <gcc-patches@gcc.gnu.org>; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>; Richard Earnshaw
> <richard.earns...@foss.arm.com>
> Subject: Re: [ARM] PR66791: Replace builtin in vld1_dup intrinsics
>
> On Fri, 13 Aug 2021 at 16:40, Prathamesh Kulkarni
> <prathamesh.kulka...@linaro.org> wrote:
> >
> > On Thu, 5 Aug 2021 at 15:37, Prathamesh Kulkarni
> > <prathamesh.kulka...@linaro.org> wrote:
> > >
> > > On Thu, 29 Jul 2021 at 19:58, Prathamesh Kulkarni
> > > <prathamesh.kulka...@linaro.org> wrote:
> > > >
> > > > Hi,
> > > > The attached patch replaces builtins in vld1_dup intrinsics with call
> > > > to corresponding vdup_n intrinsic and removes entry for vld1_dup from
> > > > arm_neon_builtins.def.
> > > > Bootstrapped+tested on arm-linux-gnueabihf.
> > > > OK to commit ?
> > > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html
> > ping * 2 https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html
> ping * 3 https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576321.html

Sorry for the slow response.
I don't think this approach improves anything. With the current setup we'd be guaranteeing generation of the load-and-dup instruction even at low optimisation levels, but with this change we'd be relying on RTL optimisers merging the load and dup together.
I don't think it gains us anything?
Thanks,
Kyrill

>
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh
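
For reference, the two forms being compared in the reply above can be sketched roughly as below. This is only an illustration, not the patch itself: the wrapper names dup_via_vld1_dup, dup_via_vdup_n and check_same_lanes are hypothetical, and the #else branch is a plain C stand-in (an assumption, not the real arm_neon.h definitions) so the sketch also builds on a non-Arm host. Both forms produce the same lanes; the point under discussion is only whether the single load-and-dup instruction is guaranteed or left to the RTL optimisers.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Hypothetical portable stand-ins so the sketch builds off-target.
   These only model the lane semantics, not the real intrinsics.  */
typedef struct { int32_t val[2]; } int32x2_t;

static inline int32x2_t
vdup_n_s32 (int32_t x)
{
  int32x2_t r = { { x, x } };
  return r;
}

static inline int32x2_t
vld1_dup_s32 (const int32_t *p)
{
  return vdup_n_s32 (*p);
}

static inline int32_t
vget_lane_s32 (int32x2_t v, int lane)
{
  return v.val[lane];
}
#endif

/* Current form: the intrinsic expresses load-and-duplicate directly.  */
static inline int32x2_t
dup_via_vld1_dup (const int32_t *p)
{
  return vld1_dup_s32 (p);
}

/* Proposed form: a scalar load followed by a dup; fusing the two into
   one instruction is left to the optimisers.  */
static inline int32x2_t
dup_via_vdup_n (const int32_t *p)
{
  return vdup_n_s32 (*p);
}

/* Sanity check: both forms yield the same value in both lanes.  */
static inline void
check_same_lanes (const int32_t *p)
{
  int32x2_t a = dup_via_vld1_dup (p);
  int32x2_t b = dup_via_vdup_n (p);
  assert (vget_lane_s32 (a, 0) == vget_lane_s32 (b, 0));
  assert (vget_lane_s32 (a, 1) == vget_lane_s32 (b, 1));
}
```

On an Arm target, compiling the two wrappers with -S at -O0 and at -O2 and comparing the assembly would be one way to see whether the load and dup get merged in each case.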