I'm glad you think partition is the problem, because that was my guess too. But I think I have the answer. This is the fastest version I've seen so far:
(defn gaussian-matrix-final [L] (into-array ^doubles (map double-array (repeat L (repeatedly L next- gaussian))))) If I understand what's going on now, then it looks like the only way to make this any faster is if next-gaussian could return primitives. The for, and doseq macros seems like they're pretty slow. -Ranjit On Sep 20, 3:30 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > I think partition is slowing you down (but haven't profiled to > verify). Here's a functional version that's about 70% as fast as my > "5": > > (defn gaussian-matrix6 [L] > (to-array (for [i (range L)] (into-array Double/TYPE (for [j > (range L)] (next-gaussian)))))) > > and I'd guess that's about as good as you're going to get, given that > this approach is necessarily going to box and unbox the doubles, and > create intermediate sequences, rather than stuffing the primitive > doubles directly into the result array. > > -Jason > > On Sep 20, 12:00 pm, Ranjit <rjcha...@gmail.com> wrote: > > > Replacing the doseq's with dotimes speeds it up a little more: > > > (defn gaussian-matrix5 [^"[[D" arr] > > (dotimes [x (alength arr)] > > (dotimes [y (alength (first arr))] > > (aset-double ^doubles (aget arr (int x)) (int y) (next- > > gaussian))))) > > > but I'm getting reflection warnings on alength. I guess it doesn't > > cause a problem because they're only called once? > > > Also adding type hints to the more functional version of my first > > attempt speeds things up quite a bit: > > > (defn gaussian-matrix2 [L] > > (into-array ^doubles > > (map double-array (partition L (repeatedly (* L L) next- > > gaussian))))) > > > But it's still about 4x slower than gaussian-matrix5 above. There must > > be a way to improve on the inner loop here that doesn't require using > > indices, right? > > > On Sep 20, 12:32 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > Oops, I found aset-double2 with tab completion and figured it was > > > build-in. Forgot it was a utility I built some time ago, a stub for a > > > Java method that does the setting. > > > > Also, I got the type hint for the "arr" arg wrong, although it didn't > > > seem to matter. > > > > Here's a fixed version in standard Clojure that's basically as fast: > > > > user> (defn gaussian-matrix4 [^"[[D" arr ^int L] > > > (doseq [x (range L) y (range L)] (aset-double ^doubles > > > (aget arr (int x)) (int y) (.nextGaussian ^Random r)))) > > > #'user/gaussian-matrix4 > > > user> (do (microbench (gaussian-matrix3 (make-array Double/TYPE 10 > > > 10) 10)) (microbench (gaussian-matrix4 (make-array Double/TYPE 10 10) > > > 10)) ) > > > min; avg; max ms: 0.000 ; 0.033 ; 8.837 ( 56828 iterations) > > > min; avg; max ms: 0.009 ; 0.038 ; 7.132 ( 50579 iterations) > > > > It seems like you should be able to just use aset-double with multiple > > > indices (in place of aset-double2), but I can't seem to get the type > > > hints right. > > > > -Jason > > > > On Sep 20, 7:36 am, Ranjit <rjcha...@gmail.com> wrote: > > > > > Thanks Jason, this is great. > > > > > I was confused earlier because I wasn't seeing reflection warnings, > > > > but it turns out that was only because I was evaluating the function > > > > definitions in the emacs buffer, and the warnings weren't visible. > > > > > I have a question about gaussian-matrix3 though. What is "aset- > > > > double2"? Is that a macro that has a type hint for an array of > > > > doubles? > > > > > Thanks, > > > > > -Ranjit > > > > On Sep 19, 5:37 pm, Jason Wolfe <jawo...@berkeley.edu> wrote: > > > > > > Hi Ranjit, > > > > > > The big perf differences you're seeing are due to reflective calls. > > > > > Getting the Java array bits properly type-hinted is especially tricky, > > > > > since you don't always get good reflection warnings. > > > > > > Note that aset is only fast for reference types: > > > > > > user> (doc aset) > > > > > ------------------------- > > > > > clojure.core/aset > > > > > ([array idx val] [array idx idx2 & idxv]) > > > > > Sets the value at the index/indices. Works on Java arrays of > > > > > reference types. Returns val. > > > > > > So, if you want to speed things up ... here's your starting point: > > > > > > user> (set! *warn-on-reflection* true) > > > > > true > > > > > user> (import java.util.Random) > > > > > (def r (Random. )) > > > > > > (defn next-gaussian [] (.nextGaussian r)) > > > > > > (defn gaussian-matrix1 [arr L] > > > > > (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) > > > > > > (defn gaussian-matrix2 [L] > > > > > (into-array (map double-array (partition L (repeatedly (* L L) > > > > > next-gaussian))))) > > > > > > Reflection warning, NO_SOURCE_FILE:1 - reference to field nextGaussian > > > > > can't be resolved. > > > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) ) > > > > > min; avg; max ms: 2.944 ; 4.693 ; 34.643 ( 424 iterations) > > > > > min; avg; max ms: 0.346 ; 0.567 ; 11.006 ( 3491 iterations) > > > > > > ;; Now, we can get rid of the reflection in next-guassian: > > > > > > user> (defn next-gaussian [] (.nextGaussian #^Random r)) > > > > > #'user/next-gaussian > > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) ) > > > > > min; avg; max ms: 2.639 ; 4.194 ; 25.024 ( 475 iterations) > > > > > min; avg; max ms: 0.068 ; 0.130 ; 10.766 ( 15104 iterations) > > > > > nil > > > > > > ;; which has cut out the main bottleneck in gaussian-matrix2. > > > > > ;; 1 is still slow because of its array handling. > > > > > ;; here's a fixed version: > > > > > > user> (defn gaussian-matrix3 [^doubles arr ^int L] > > > > > (doseq [x (range L) y (range L)] (aset-double2 arr (int x) (int > > > > > y) (.nextGaussian ^Random r)))) > > > > > #'user/gaussian-matrix3 > > > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 > > > > > 10) 10)) (microbench (gaussian-matrix2 10)) (microbench (gaussian- > > > > > matrix3 (make-array Double/TYPE 10 10) 10)) ) > > > > > min; avg; max ms: 2.656 ; 4.164 ; 12.752 ( 479 iterations) > > > > > min; avg; max ms: 0.065 ; 0.128 ; 9.712 ( 15255 iterations) > > > > > min; avg; max ms: 0.000 ; 0.035 ; 10.180 ( 54618 iterations) > > > > > nil > > > > > > ;; which is 100x faster than where we started. > > > > > > A profiler is often a great way to figure out what's eating up time. > > > > > Personally, I've never found the need to use a disassembler. > > > > > > Cheers, Jason > > -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en