I'm glad you think partition is the problem, because that was my guess
too. But I think I have the answer. This is the fastest version I've
seen so far:

(defn gaussian-matrix-final [L]
  (into-array ^doubles
              (map double-array (repeat L (repeatedly L next-gaussian)))))
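
One thing worth checking in a version like this: repeat hands back the
same lazy sequence L times, so every row would be built from the same L
samples. A minimal sketch of the difference, assuming a next-gaussian
helper like the one defined earlier in the thread (gaussian-matrix-repeat
and gaussian-matrix-rows are hypothetical names used only for
illustration):

```clojure
(import 'java.util.Random)

;; Shared generator; the ^Random hint avoids a reflective .nextGaussian call.
(def ^Random r (Random.))

(defn next-gaussian [] (.nextGaussian r))

;; repeat returns the same (cached) lazy seq L times, so all rows
;; end up with identical contents.
(defn gaussian-matrix-repeat [L]
  (into-array (map double-array (repeat L (repeatedly L next-gaussian)))))

;; repeatedly calls the row-building function once per row, so each
;; row gets its own L samples.
(defn gaussian-matrix-rows [L]
  (into-array (repeatedly L #(double-array (repeatedly L next-gaussian)))))
```

The repeatedly variant makes L times as many calls to next-gaussian, so
it should not be expected to match the timing of the repeat version.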

If I understand what's going on now, then it looks like the only way
to make this any faster is if next-gaussian could return primitives.
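
For comparison, one way to keep the doubles primitive is to skip the
next-gaussian wrapper and call .nextGaussian inside a dotimes loop that
writes straight into each row. This is only a sketch under that
assumption; gaussian-matrix-direct is a hypothetical name, and it has not
been benchmarked against the versions in this thread:

```clojure
(import 'java.util.Random)

(defn gaussian-matrix-direct [L]
  (let [n (int L)
        ^Random rng (Random.)
        ;; L x L array of primitive doubles, hinted as double[][]
        ^"[[D" m (make-array Double/TYPE n n)]
    (dotimes [i n]
      ;; hint the row as double[] so aset resolves without reflection
      (let [^doubles row (aget m i)]
        (dotimes [j n]
          ;; the primitive double from .nextGaussian goes straight into the row
          (aset row j (.nextGaussian rng)))))
    m))
```

Hinting the row once per outer iteration keeps the hint out of the inner
loop, which is where the indexed versions in this thread spend their time.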

The for and doseq macros seem pretty slow.

-Ranjit



On Sep 20, 3:30 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
> I think partition is slowing you down (but haven't profiled to
> verify).  Here's a functional version that's about 70% as fast as my
> "5":
>
> (defn gaussian-matrix6 [L]
>      (to-array (for [i (range L)]
>                  (into-array Double/TYPE (for [j (range L)] (next-gaussian))))))
>
> and I'd guess that's about as good as you're going to get, given that
> this approach is necessarily going to box and unbox the doubles, and
> create intermediate sequences, rather than stuffing the primitive
> doubles directly into the result array.
>
> -Jason
>
> On Sep 20, 12:00 pm, Ranjit <rjcha...@gmail.com> wrote:
>
> > Replacing the doseq's with dotimes speeds it up a little more:
>
> > (defn gaussian-matrix5 [^"[[D" arr]
> >   (dotimes [x (alength arr)]
> >     (dotimes [y (alength (first arr))]
> >       (aset-double ^doubles (aget arr (int x)) (int y)
> >                    (next-gaussian)))))
>
> > but I'm getting reflection warnings on alength. I guess it doesn't
> > cause a problem because they're only called once?
>
> > Also adding type hints to the more functional version of my first
> > attempt speeds things up quite a bit:
>
> > (defn gaussian-matrix2 [L]
> >      (into-array ^doubles
> >           (map double-array
> >                (partition L (repeatedly (* L L) next-gaussian)))))
>
> > But it's still about 4x slower than gaussian-matrix5 above. There must
> > be a way to improve on the inner loop here that doesn't require using
> > indices, right?
>
> > On Sep 20, 12:32 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > Oops, I found aset-double2 with tab completion and figured it was
> > > built-in.  Forgot it was a utility I built some time ago, a stub for a
> > > Java method that does the setting.
>
> > > Also, I got the type hint for the "arr" arg wrong, although it didn't
> > > seem to matter.
>
> > > Here's a fixed version in standard Clojure that's basically as fast:
>
> > > user> (defn gaussian-matrix4 [^"[[D" arr ^int L]
> > >             (doseq [x (range L) y (range L)]
> > >               (aset-double ^doubles (aget arr (int x)) (int y)
> > >                            (.nextGaussian ^Random r))))
> > > #'user/gaussian-matrix4
> > > user> (do (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10))
> > >           (microbench (gaussian-matrix4 (make-array Double/TYPE 10 10) 10)))
> > > min; avg; max ms:  0.000 ; 0.033 ; 8.837    ( 56828  iterations)
> > > min; avg; max ms:  0.009 ; 0.038 ; 7.132    ( 50579  iterations)
>
> > > It seems like you should be able to just use aset-double with multiple
> > > indices (in place of aset-double2), but I can't seem to get the type
> > > hints right.
>
> > > -Jason
>
> > > On Sep 20, 7:36 am, Ranjit <rjcha...@gmail.com> wrote:
>
> > > > Thanks Jason, this is great.
>
> > > > I was confused earlier because I wasn't seeing reflection warnings,
> > > > but it turns out that was only because I was evaluating the function
> > > > definitions in the emacs buffer, and the warnings weren't visible.
>
> > > > I have a question about gaussian-matrix3 though. What is
> > > > "aset-double2"? Is that a macro that has a type hint for an array of
> > > > doubles?
>
> > > > Thanks,
>
> > > > -Ranjit
> > > > On Sep 19, 5:37 pm, Jason Wolfe <jawo...@berkeley.edu> wrote:
>
> > > > > Hi Ranjit,
>
> > > > > The big perf differences you're seeing are due to reflective calls.
> > > > > Getting the Java array bits properly type-hinted is especially tricky,
> > > > > since you don't always get good reflection warnings.
>
> > > > > Note that aset is only fast for reference types:
>
> > > > > user> (doc aset)
> > > > > -------------------------
> > > > > clojure.core/aset
> > > > > ([array idx val] [array idx idx2 & idxv])
> > > > >   Sets the value at the index/indices. Works on Java arrays of
> > > > >   reference types. Returns val.
>
> > > > > So, if you want to speed things up ... here's your starting point:
>
> > > > > user> (set! *warn-on-reflection* true)
> > > > > true
> > > > > user> (import java.util.Random)
> > > > > (def r (Random. ))
>
> > > > > (defn next-gaussian [] (.nextGaussian r))
>
> > > > > (defn gaussian-matrix1 [arr L]
> > > > >      (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian))))
>
> > > > > (defn gaussian-matrix2 [L]
> > > > >      (into-array (map double-array
> > > > >                       (partition L (repeatedly (* L L) next-gaussian)))))
>
> > > > > Reflection warning, NO_SOURCE_FILE:1 - reference to field nextGaussian
> > > > > can't be resolved.
>
> > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
> > > > >           (microbench (gaussian-matrix2 10)))
> > > > > min; avg; max ms:  2.944 ; 4.693 ; 34.643    ( 424  iterations)
> > > > > min; avg; max ms:  0.346 ; 0.567 ; 11.006    ( 3491  iterations)
>
> > > > > ;; Now, we can get rid of the reflection in next-gaussian:
>
> > > > > user> (defn next-gaussian [] (.nextGaussian #^Random r))
> > > > > #'user/next-gaussian
> > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
> > > > >           (microbench (gaussian-matrix2 10)))
> > > > > min; avg; max ms:  2.639 ; 4.194 ; 25.024    ( 475  iterations)
> > > > > min; avg; max ms:  0.068 ; 0.130 ; 10.766    ( 15104  iterations)
> > > > > nil
>
> > > > > ;; which has cut out the main bottleneck in gaussian-matrix2.
> > > > > ;; 1 is still slow because of its array handling.
> > > > > ;; here's a fixed version:
>
> > > > > user> (defn gaussian-matrix3 [^doubles arr ^int L]
> > > > >      (doseq [x (range L) y (range L)]
> > > > >        (aset-double2 arr (int x) (int y) (.nextGaussian ^Random r))))
> > > > > #'user/gaussian-matrix3
>
> > > > > user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
> > > > >           (microbench (gaussian-matrix2 10))
> > > > >           (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10)))
> > > > > min; avg; max ms:  2.656 ; 4.164 ; 12.752    ( 479  iterations)
> > > > > min; avg; max ms:  0.065 ; 0.128 ; 9.712    ( 15255  iterations)
> > > > > min; avg; max ms:  0.000 ; 0.035 ; 10.180    ( 54618  iterations)
> > > > > nil
>
> > > > > ;; which is 100x faster than where we started.
>
> > > > > A profiler is often a great way to figure out what's eating up time.
> > > > > Personally, I've never found the need to use a disassembler.
>
> > > > > Cheers, Jason
>
>
