Sorry I am not taking the time to try out the change for you and see whether it makes the desired difference in performance happen, but I do know that the ^double type hint on return values should be on the argument vector, not on the var. So instead of your abs-diff, try this, and similarly for any other functions with double return value:
(defn abs-diff ^double [^double x ^double y] (if (p/> x y) (p/- x y) (p/- y x))) Andy On Sun, Apr 6, 2014 at 9:44 AM, Dmitry Groshev <lambdadmi...@gmail.com>wrote: > For some time I had a suspicion that in Clojure we have a fundamental > problem with efficient math and today I've tried to test it. > > Let's say we have a pretty simple function: > > (ns testapp.core > (:require > [criterium.core :as cr] > [primitive-math :as p] > [no.disassemble :as nd])) > > (set! *warn-on-reflection* true) > > (defn ^double test1 [a b] > (let [^doubles a a > ^doubles b b] > (areduce a i ret (double 0) > (p/+ ret > (let [a-x (aget a i) > b-x (aget b i)] > (if (p/> a-x b-x) > (p/- a-x b-x) > (p/- b-x a-x))))))) > > Let's test it's performance (all tests were performed in Clojure 1.5.1 on > top of latest JVM7 and Sandy Bridge i5): > > (defn bench1 [] > (let [a (double-array (range 1 100000)) > b (double-array (range 100000 1 -1))] > (cr/quick-bench (test1 a b)))) > > Result: > > Execution time mean : 223.076094 µs > Execution time std-deviation : 21.413850 µs > > It's around 1 nanosecond per double operation, so it looks adequate to me. > Disassemble looks fine to me, too. > > Now let's refactor the code a little, that compare-and-diff piece looks > generally useful: > > (defn ^double abs-diff [^double x ^double y] > (if (p/> x y) (p/- x y) (p/- y x))) > > (defn ^double test2 [a b] > (let [^doubles a a > ^doubles b b] > (areduce a i ret (double 0) > (p/+ ret (double (abs-diff (aget a i) (aget b i))))))) > > (defn bench2 [] > (let [a (double-array (range 1 100000)) > b (double-array (range 100000 1 -1))] > (cr/quick-bench (test2 a b)))) > > Suddenly: > > Execution time mean : 877.690451 µs > Execution time std-deviation : 15.281713 µs > > Why is it so? Disassemble reveals a few inefficiencies here and there. > Here is an interesting bit of test2's invoke: > > 32 lload 6 [i] > 34 lconst_1 > 35 ladd > 36 dload 8 [ret] > 38 getstatic testapp.core$test2.const__5 : clojure.lang.Var [57] > 41 invokevirtual clojure.lang.Var.getRawRoot() : java.lang.Object [76] > 44 checkcast clojure.lang.IFn$DDO [78] > 47 aload_3 [a] > 48 checkcast double[] [72] > 51 lload 6 [i] > 53 invokestatic clojure.lang.RT.intCast(long) : int [82] > 56 daload > 57 aload 4 [b] > 59 checkcast double[] [72] > 62 lload 6 [i] > 64 invokestatic clojure.lang.RT.intCast(long) : int [82] > 67 daload > 68 invokeinterface clojure.lang.IFn$DDO.invokePrim(double, double) : > java.lang.Object [86] [nargs: 5] > 73 invokestatic clojure.lang.RT.doubleCast(java.lang.Object) : double > [90] > 76 invokestatic primitive_math.Primitives.add(double, double) : > double [96] > 79 dstore 8 [ret] > 81 lstore 6 [i] > 83 goto 19 > 86 goto 95 > > If I understand this correctly, in this piece of code we have following > problems: > 1) there are some unnecessary casts from long to int. Probably this is > caused by areduce macro (it uses "0" as a counter and it's long in Clojure, > but arrays are indexed with ints). Replacing it with a loop doesn't help > much, maybe JIT optimizes this already. > 2) every call to abs-diff goes through Var dereference. > 3) abs-diff returns Object despite the ^double return annotation (this is > why (double ...) cast is needed there to avoid reflection warning). > Therefore we can't avoid boxing-unboxing here. > > Is there a way to use small, reusable, composable pieces in math-heavy hot > loops? We can use macros instead of functions and compile everything into > one big method. This is the way HipHip(array) of Prismatic works. The > problem here is that their examples are tiny and bigger stuff like > decompositions will result in very large compiled methods. This is a bad > practice in Java-land, because in general JVM doesn't like such methods, > skipping their inlining and delaying optimization. Moreover, macros have > composability problems, too. Therefore, macros are not the ultimate > solution here. > > So, what's the best way to write beautiful AND performant numerical code > in Clojure? > > The full code for this email can be found here: > https://gist.github.com/si14/10008500 > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with > your first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en > --- > You received this message because you are subscribed to the Google Groups > "Clojure" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to clojure+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.