Hi Prathamesh,
This is probably not appropriate when optimising for size (-Os).
And for speed optimisation, I imagine the benefit can vary a lot depending on
the target the code is run on.
Do you have any benchmark results for this patch?
Thanks,
Kyrill
On 29/07/15 11:09, Prathamesh Kulkarni wrote:
Hi,
This patch tries to implement division as multiplication by the
reciprocal, computed using vrecpe/vrecps, when
-funsafe-math-optimizations and -freciprocal-math are enabled.
Tested on arm-none-linux-gnueabihf using qemu.
OK for trunk ?
Thank you,
Prathamesh
+ /* Perform 2 iterations of Newton-Raphson method for better accuracy */
+ for (int i = 0; i < 2; i++)
+ {
+ emit_insn (gen_neon_vrecps<mode> (vrecps_temp, rec, operands[2]));
+ emit_insn (gen_mul<mode>3 (rec, rec, vrecps_temp));
+ }
+
+ /* We now have reciprocal in rec, perform operands[0] = operands[1] * rec */
+ emit_insn (gen_mul<mode>3 (operands[0], operands[1], rec));
+ DONE;
+ }
+)
+
Comments should end with a full stop and two spaces before the closing */.
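
For readers following the thread, the refinement loop in the quoted patch can be modelled in scalar C. This is only a sketch under stated assumptions, not the code GCC emits: `recps_model` mirrors the documented VRECPS step (2 - a * b), while `recpe_model` is a hypothetical stand-in for VRECPE that perturbs the exact reciprocal by 1/256 to imitate a coarse initial estimate. The names `recps_model`, `recpe_model` and `div_by_recip` are invented for illustration.

```c
#include <math.h>

/* Scalar model of the NEON VRECPS step: returns 2 - a * b.  */
static float
recps_model (float a, float b)
{
  return 2.0f - a * b;
}

/* Hypothetical stand-in for VRECPE (an assumption, not the real
   instruction): the exact reciprocal with a relative error of 1/256,
   imitating a coarse initial estimate.  */
static float
recpe_model (float d)
{
  return (1.0f / d) * (1.0f + 1.0f / 256.0f);
}

/* Compute n / d as n * (1 / d), refining the reciprocal estimate with
   the given number of Newton-Raphson iterations.  Each iteration
   computes rec = rec * (2 - rec * d), which roughly squares the
   relative error of rec.  */
static float
div_by_recip (float n, float d, int iterations)
{
  float rec = recpe_model (d);

  for (int i = 0; i < iterations; i++)
    rec = rec * recps_model (rec, d);

  return n * rec;
}
```

With this model, calling `div_by_recip (10.0f, 3.0f, k)` for k = 0, 1, 2 shows the error shrinking quadratically per iteration. How close the real sequence gets to a true division, and whether it is faster, still depends on the hardware's actual estimate and on the target, which is why benchmark numbers matter here.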