Prathamesh Kulkarni wrote:
> This is a rebased version of patch that adds a pattern to neon.md for
> implementing division with multiplication by reciprocal using
> vrecpe/vrecps with -funsafe-math-optimizations excluding -Os.
> The newly added test-cases are not vectorized on armeb target with
> -O2. I posted the analysis for that here:
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01765.html

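For anyone not familiar with the sequence being discussed, here is a rough scalar sketch in C of division by reciprocal estimate plus Newton-Raphson refinement. It is not the neon.md pattern from the patch. The helper names (recip_estimate, recip_step, div_by_reciprocal, rel_error) are made up for illustration, the estimate is faked with a real division followed by mantissa truncation rather than the hardware lookup, and the choice of two refinement steps is an assumption. Only recip_step mirrors what VRECPS actually computes (2 - a*b).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the VRECPE estimate: an approximate 1/d with only about
   8 good mantissa bits. The real instruction uses a table lookup; this
   sketch divides and then discards the low bits (assumption, for
   illustration only). Handles finite, nonzero inputs only. */
static float recip_estimate(float d)
{
    assert(isfinite(d) && d != 0.0f);
    float r = 1.0f / d;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;   /* keep sign, exponent, top 8 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Same arithmetic as one VRECPS lane: 2 - a*b. */
static float recip_step(float a, float b)
{
    return 2.0f - a * b;
}

/* n / d computed as n * (1/d). Each refinement x = x * (2 - d*x)
   roughly doubles the number of correct bits in x. */
static float div_by_reciprocal(float n, float d)
{
    float x = recip_estimate(d);
    x = x * recip_step(d, x);   /* about 16 good bits */
    x = x * recip_step(d, x);   /* close to full float precision */
    return n * x;
}

/* Relative error of got against a nonzero reference value. */
static float rel_error(float got, float want)
{
    return fabsf(got - want) / fabsf(want);
}
```

The result is close to n / d but not guaranteed to be correctly rounded, which is why this kind of transformation sits behind -funsafe-math-optimizations. Note also that the sequence costs several multiplies per division, which is relevant to the throughput point below.
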
I don't think doing this unconditionally for any CPU is a good idea. On AArch64 we don't enable this for any core since it's not really faster (newer CPUs have significantly improved division and the reciprocal instructions reduce throughput of other FMAs).

On wrf doing reciprocal square root is far better than reciprocal division, but it's only faster on some specific CPUs, so it's not enabled by default.

Wilco