Hi Prathamesh,

This is probably not appropriate for -Os optimisation. For speed optimisation, I imagine the benefit can vary a lot depending on the target the code is run on. Do you have any benchmark results for this patch?

Thanks,
Kyrill

On 29/07/15 11:09, Prathamesh Kulkarni wrote:
Hi,

This patch tries to implement division with multiplication by reciprocal using vrecpe/vrecps with -funsafe-math-optimizations and -freciprocal-math enabled. Tested on arm-none-linux-gnueabihf using qemu.

OK for trunk?

Thank you,
Prathamesh
+    /* Perform 2 iterations of Newton-Raphson method for better accuracy */
+    for (int i = 0; i < 2; i++)
+      {
+        emit_insn (gen_neon_vrecps<mode> (vrecps_temp, rec, operands[2]));
+        emit_insn (gen_mul<mode>3 (rec, rec, vrecps_temp));
+      }
+
+    /* We now have reciprocal in rec, perform operands[0] = operands[1] * rec */
+    emit_insn (gen_mul<mode>3 (operands[0], operands[1], rec));
+    DONE;
+  }
+)
+

Full stop and two spaces at the end of the comments.
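
For reference, here is a rough scalar sketch in plain C of what the emitted sequence computes. It is not the patch itself and makes several assumptions: recip_estimate, recip_step and div_via_reciprocal are hypothetical names, and recip_estimate only imitates an initial estimate that is good to roughly 8 bits by truncating a true reciprocal, which is not how the hardware produces it. recip_step stands in for vrecps, which returns 2 - a * b.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for vrecpe: a reciprocal estimate good to roughly
   8 bits.  The division and truncation below only imitate that precision
   (an assumption for illustration); real hardware does not divide here.  */
float recip_estimate(float d)
{
    float r = 1.0f / d;
    uint32_t bits;
    memcpy(&bits, &r, sizeof bits);
    bits &= 0xFFFF8000u;  /* keep sign, exponent and top 8 mantissa bits */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* Hypothetical stand-in for vrecps: the Newton-Raphson step factor.  */
float recip_step(float a, float b)
{
    return 2.0f - a * b;
}

/* Compute n / d as n * (1 / d), refining the estimate of 1 / d with the
   given number of Newton-Raphson iterations.  */
float div_via_reciprocal(float n, float d, int iterations)
{
    assert(iterations >= 0);
    float rec = recip_estimate(d);
    for (int i = 0; i < iterations; i++)
        rec = rec * recip_step(rec, d);  /* rec = rec * (2 - rec * d) */
    return n * rec;
}
```

Each iteration roughly squares the relative error of the estimate, so starting from about 8 good bits, two iterations should get close to single precision. The result is still not guaranteed to match a correctly rounded division, which is why the transformation sits behind -funsafe-math-optimizations and -freciprocal-math.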