Hi Prathamesh,

This is probably not appropriate for -Os optimisation.
And for speed optimisation I imagine the benefit can vary a lot depending on
the target the code is run on.
Do you have any benchmark results for this patch?

Thanks,
Kyrill

On 29/07/15 11:09, Prathamesh Kulkarni wrote:
Hi,
This patch tries to implement division with multiplication by
reciprocal using vrecpe/vrecps
with -funsafe-math-optimizations and -freciprocal-math enabled.
Tested on arm-none-linux-gnueabihf using qemu.
OK for trunk ?

Thank you,
Prathamesh
+    /* Perform 2 iterations of Newton-Raphson method for better accuracy */
+    for (int i = 0; i < 2; i++)
+      {
+    emit_insn (gen_neon_vrecps<mode> (vrecps_temp, rec, operands[2]));
+    emit_insn (gen_mul<mode>3 (rec, rec, vrecps_temp));
+      }
+
+    /* We now have reciprocal in rec, perform operands[0] = operands[1] * rec */
+    emit_insn (gen_mul<mode>3 (operands[0], operands[1], rec));
+    DONE;
+  }
+)
+

The comments should end with a full stop followed by two spaces.
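
For reference, the refinement loop in the quoted hunk can be sketched in scalar C roughly as follows. This is only an illustration, not the patch itself: recip_estimate, recip_step and approx_div are hypothetical names, recip_estimate is a crude linear stand-in for vrecpe (the real instruction uses a table-based estimate), and recip_step assumes the vrecps semantics of computing 2 - a * b. A zero divisor is not handled.

```c
#include <math.h>

/* Hypothetical stand-in for vrecpe: a coarse estimate of 1/d.
   Assumes d is finite and non-zero.  */
static float
recip_estimate (float d)
{
  int e;
  float m = frexpf (d, &e);	/* d = m * 2^e, with 0.5 <= |m| < 1.  */
  /* Linear approximation of 1/m; relative error is at most 1/17.  */
  float r = copysignf (48.0f / 17.0f, m) - (32.0f / 17.0f) * m;
  return ldexpf (r, -e);
}

/* Assumed vrecps semantics: the Newton-Raphson correction factor.  */
static float
recip_step (float a, float b)
{
  return 2.0f - a * b;
}

/* Approximate n / d as n * (1/d), refining the estimate twice.  */
static float
approx_div (float n, float d)
{
  float rec = recip_estimate (d);
  for (int i = 0; i < 2; i++)
    {
      float tmp = recip_step (rec, d);	/* 2 - rec * d  */
      rec = rec * tmp;			/* rec * (2 - rec * d)  */
    }
  return n * rec;
}
```

Each step roughly squares the relative error of the estimate, which is why a small fixed number of iterations is used. The result is still not correctly rounded, so this kind of transformation only makes sense under flags such as -freciprocal-math.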
