Re: why the big difference in speed?
I was missing type hints on the inner calls to aget in the sum, and changing from aset-double to aset makes it even faster:

(defn sum-fields4 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
  (let [L (int (alength arr1))]
    (dotimes [i L]
      (dotimes [j L]
        (aset ^doubles (aget result i) j
              (+ (aget ^doubles (aget arr1 i) j)
                 (aget ^doubles (aget arr2 i) j)))))))

It doesn't look like the casts to double are necessary, though. On my computer this takes ~30 msec on a 1000x1000 array! But strangely, replacing the call to aset-double with aset in our fastest gaussian-matrix slows things down a lot. Everything else seems to make sense though.

On Sep 21, 11:40 pm, Jason Wolfe wrote:
> This one is more than 10x faster
> [...]
> Not sure what the problem was exactly with yours. Maybe the multi-
> index versions of aget.
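For reference, here is a self-contained sanity check of the hinted version (the definition above repeated so the snippet runs on its own at a REPL):

```clojure
(defn sum-fields4 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
  (let [L (int (alength arr1))]
    (dotimes [i L]
      (dotimes [j L]
        (aset ^doubles (aget result i) j
              (+ (aget ^doubles (aget arr1 i) j)
                 (aget ^doubles (aget arr2 i) j)))))))

;; two small 2x2 inputs whose element-wise sum is all 5.0
(let [a (into-array (map double-array [[1.0 2.0] [3.0 4.0]]))
      b (into-array (map double-array [[4.0 3.0] [2.0 1.0]]))
      r (make-array Double/TYPE 2 2)]
  (sum-fields4 a b r)
  (mapv vec r))  ; => [[5.0 5.0] [5.0 5.0]]
```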
Re: why the big difference in speed?
This one is more than 10x faster:

(defn sum-fields3 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
  (let [L (int (alength arr1))]
    (dotimes [i L]
      (let [^doubles a1 (aget arr1 i)
            ^doubles a2 (aget arr2 i)
            ^doubles r  (aget result i)]
        (dotimes [j L]
          (aset-double r j (+ (double (aget a1 j))
                              (double (aget a2 j)))))))))

Not sure what the problem was exactly with yours. Maybe the multi-index versions of aget.

On Sep 21, 7:43 pm, Ranjit wrote:
> I was thinking I needed the Java arrays for interop. [...]
> Not that I want to use this approach anymore, but what other type
> hints could I add to my Java array addition function? The only
> unhinted variables I see left are the indices.
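To reproduce the comparison yourself, a minimal harness (the definition above repeated so the snippet is self-contained; `time` is only a rough measure, so run it a few times after JIT warm-up):

```clojure
(defn sum-fields3 [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
  (let [L (int (alength arr1))]
    (dotimes [i L]
      (let [^doubles a1 (aget arr1 i)
            ^doubles a2 (aget arr2 i)
            ^doubles r  (aget result i)]
        (dotimes [j L]
          (aset-double r j (+ (double (aget a1 j))
                              (double (aget a2 j)))))))))

;; crude timing on a 1000x1000 array of zeros
(let [n 1000
      a (make-array Double/TYPE n n)
      b (make-array Double/TYPE n n)
      r (make-array Double/TYPE n n)]
  (time (sum-fields3 a b r)))
```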
Re: why the big difference in speed?
I was thinking I needed the Java arrays for interop. At one step in my simulation I need to take an FFT of a 2d array, then multiply that array with another array, and then inverse FFT. Java array operations in Clojure seem so slow, though, that it will actually be better to convert from either a built-in data structure or an Incanter matrix to an array for the FFT, then back for the multiplication, then back for the IFFT, and then back.

The Incanter matrices seem like a pretty good choice. This function using Incanter is about as fast as the imperative type-hinted function you came up with:

(defn gaussian-matrix [L mean std]
  (matrix (map #(sample-normal %1 mean std) (repeat L L))))

and adding matrices is fast. Converting from an Incanter matrix to an array like this:

(def B (into-array (map double-array (to-list A))))

takes ~100 msec, and converting back takes a similar amount of time. So unfortunately all the converting back and forth means almost a half second extra per loop.

Not that I want to use this approach anymore, but what other type hints could I add to my Java array addition function? The only unhinted variables I see left are the indices.

Thanks for all your help,

-Ranjit

On Sep 21, 5:48 pm, Jason Wolfe wrote:
> I think you're still missing some type hints. I think there are
> varying degrees of reflective code the compiler can emit, and it
> doesn't always warn for the intermediate cases. [...]
Re: why the big difference in speed?
I think you're still missing some type hints. I think there are varying degrees of reflective code the compiler can emit, and it doesn't always warn for the intermediate cases.

Do you need to do all this array business for Java interop, or just because you believe it will give you maximum performance? If the latter, I'd recommend trying it with built-in data structures first -- you might be surprised at how good the performance can be. Or, if you want to do lots of matrix-and-vector stuff, perhaps try out the Incanter matrix library or similar Java libraries.

I find writing correctly-hinted Clojure code for dealing with arrays to be clumsier and more difficult than just writing the equivalent Java (one of very few areas where this is the case), but fortunately I only very rarely find the need to do so. But if that's really what you want to do, I think you should always be able to get speed on par with raw Java.

-Jason

On Sep 21, 1:04 pm, Ranjit wrote:
> Thanks, I just tried using the random number generator in Incanter as
> a replacement and it shaves another ~100 msec off the runtime of the
> function. That's better but still noticeably slower than python. [...]
Re: why the big difference in speed?
Thanks, I just tried using the random number generator in Incanter as a replacement, and it shaves another ~100 msec off the runtime of the function. That's better, but still noticeably slower than Python.

I also tried applying what I've learned here to writing a function that adds two arrays. It's imperative, and I used type hints to get rid of the reflection warnings:

(defn sum-fields [^"[[D" arr1 ^"[[D" arr2 ^"[[D" result]
  (let [L (alength arr1)]
    (dotimes [i L]
      (dotimes [j L]
        (aset-double ^doubles (aget result i) j
                     (+ (aget arr1 i j) (aget arr2 i j)))))))

but it's surprisingly slow compared to numpy again.

On Sep 21, 2:26 pm, Jason Wolfe wrote:
> FYI I fired up a profiler and more than 2/3 of the runtime of gaussian-
> matrix5 is going to the .nextGaussian method of java.util.Random. [...]
Re: why the big difference in speed?
FYI I fired up a profiler, and more than 2/3 of the runtime of gaussian-matrix5 is going to the .nextGaussian method of java.util.Random. I think there are faster (and higher-quality) drop-in replacements for java.util.Random, if that is really important to you. I seem to recall there being a good Mersenne Twister implementation around, which might fit your bill.

-Jason

On Sep 21, 6:34 am, Ranjit wrote:
> Yeah, I spoke too soon. All the rows are identical. This is what I
> meant to do:
>
> (defn gaussian-matrix-for-the-nth-time [L]
>   (into-array (map double-array (repeatedly L #(repeatedly L next-gaussian)))))
>
> and on my computer takes ~800 msecs vs. ~400 msecs for gaussian-
> matrix5 to make an array of 1000^2 gaussians. [...]
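If pulling in a Mersenne Twister library is overkill, gaussians can also be generated directly from uniforms. This is an illustrative sketch using the Box-Muller transform, not a drop-in for java.util.Random (whose nextGaussian uses the polar variant and caches a second value per pair):

```clojure
;; Box-Muller: turn two independent uniforms into one standard normal.
;; (- 1.0 u1) lies in (0, 1], so the log stays finite.
(defn box-muller-gaussian []
  (let [u1 (Math/random)
        u2 (Math/random)]
    (* (Math/sqrt (* -2.0 (Math/log (- 1.0 u1))))
       (Math/cos (* 2.0 Math/PI u2)))))
```

Whether this beats .nextGaussian in practice would need profiling; the point is only that the generator itself is a few lines of arithmetic.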
Re: why the big difference in speed?
Yeah, I spoke too soon. All the rows are identical. This is what I meant to do:

(defn gaussian-matrix-for-the-nth-time [L]
  (into-array (map double-array (repeatedly L #(repeatedly L next-gaussian)))))

On my computer this takes ~800 msec, vs. ~400 msec for gaussian-matrix5, to make an array of 1000^2 gaussians. But numpy only takes about 100 msec to do the same thing on my machine. I'm surprised we can't beat that, or at least get close. But maybe next-gaussian is the bottleneck, as you say.

On Sep 21, 12:20 am, Jason Wolfe wrote:
> On Sep 20, 4:43 pm, Ranjit wrote:
> > I'm glad you think partition is the problem, because that was my guess
> > too. But I think I have the answer. This is the fastest version I've
> > seen so far:
> >
> > (defn gaussian-matrix-final [L]
> >   (into-array ^doubles (map double-array (repeat L (repeatedly L next-gaussian)))))
>
> The ^doubles type hint is wrong (it means array of doubles, not seq of
> arrays of doubles); the compiler is just ignoring it.
>
> And the reason it's so fast is probably that you're repeating the
> same L gaussian values L times here. It doesn't matter how fast it runs
> if it returns the wrong answer :-). Anyway, that might indicate that
> .nextGaussian is actually the bottleneck in the fastest versions.
>
> -Jason
>
> [...]
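For reference, here is roughly what the "equivalent Java" discussed in this thread looks like: a nested primitive loop filling a 1000x1000 double[][] from java.util.Random. The class and method names are illustrative only, not from the thread; it is the baseline the hinted Clojure versions are trying to match.

```java
import java.util.Random;

// Illustrative Java baseline for filling an LxL matrix with gaussians,
// the kind of loop the properly hinted Clojure versions compile down to.
public class GaussianMatrix {
    static final Random r = new Random();

    static double[][] gaussianMatrix(int L) {
        double[][] arr = new double[L][L];
        for (int i = 0; i < L; i++) {
            double[] row = arr[i];           // hoist the row reference once
            for (int j = 0; j < L; j++) {
                row[j] = r.nextGaussian();   // primitive double, no boxing
            }
        }
        return arr;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        double[][] m = gaussianMatrix(1000);
        long ms = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("1000x1000 gaussians in ~" + ms + " ms, m[0][0]=" + m[0][0]);
    }
}
```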
Re: why the big difference in speed?
On Sep 20, 4:43 pm, Ranjit wrote:
> I'm glad you think partition is the problem, because that was my guess
> too. But I think I have the answer. This is the fastest version I've
> seen so far:
>
> (defn gaussian-matrix-final [L]
>   (into-array ^doubles (map double-array (repeat L (repeatedly L next-gaussian)))))

The ^doubles type hint is wrong (it means array of doubles, not seq of arrays of doubles); the compiler is just ignoring it.

And the reason it's so fast is probably that you're repeating the same L gaussian values L times here. It doesn't matter how fast it runs if it returns the wrong answer :-). Anyway, that might indicate that .nextGaussian is actually the bottleneck in the fastest versions.

-Jason

> If I understand what's going on now, then it looks like the only way
> to make this any faster is if next-gaussian could return primitives.
>
> The for and doseq macros seem pretty slow.
>
> -Ranjit
>
> On Sep 20, 3:30 pm, Jason Wolfe wrote:
> > I think partition is slowing you down (but haven't profiled to
> > verify). [...]
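Jason's diagnosis — (repeat L row) reuses one row object L times, while (repeatedly L #(...)) builds a fresh row each time — is a classic aliasing bug. A sketch of the same distinction in Java terms (class and method names are hypothetical):

```java
import java.util.Random;

// (repeat L row) vs (repeatedly L #(...)): with repeat, every slot of the
// outer array aliases the SAME inner array, so all rows are identical.
public class RepeatVsRepeatedly {

    // Analogue of (repeat L row): one shared row, referenced L times.
    static double[][] repeatedRows(int L, Random r) {
        double[] shared = new double[L];
        for (int j = 0; j < L; j++) shared[j] = r.nextGaussian();
        double[][] m = new double[L][];
        for (int i = 0; i < L; i++) m[i] = shared;   // aliasing, not copying
        return m;
    }

    // Analogue of (repeatedly L #(...)): a fresh row per iteration.
    static double[][] freshRows(int L, Random r) {
        double[][] m = new double[L][];
        for (int i = 0; i < L; i++) {
            m[i] = new double[L];
            for (int j = 0; j < L; j++) m[i][j] = r.nextGaussian();
        }
        return m;
    }

    public static void main(String[] args) {
        Random r = new Random();
        double[][] a = repeatedRows(4, r);
        double[][] b = freshRows(4, r);
        System.out.println("repeat-style rows same object? " + (a[0] == a[1]));     // true
        System.out.println("repeatedly-style rows same object? " + (b[0] == b[1])); // false
    }
}
```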
Re: why the big difference in speed?
Actually, it turns out the type hinting in gaussian-matrix-final isn't even necessary. I just took it out and the speed doesn't seem to change.

On Sep 20, 7:43 pm, Ranjit wrote:
> I'm glad you think partition is the problem, because that was my guess
> too. But I think I have the answer. This is the fastest version I've
> seen so far:
>
> (defn gaussian-matrix-final [L]
>   (into-array ^doubles (map double-array (repeat L (repeatedly L next-gaussian)))))
>
> If I understand what's going on now, then it looks like the only way
> to make this any faster is if next-gaussian could return primitives.
>
> The for and doseq macros seem pretty slow.
>
> -Ranjit
>
> On Sep 20, 3:30 pm, Jason Wolfe wrote:
> > I think partition is slowing you down (but haven't profiled to
> > verify). [...]
Re: why the big difference in speed?
I'm glad you think partition is the problem, because that was my guess too. But I think I have the answer. This is the fastest version I've seen so far:

(defn gaussian-matrix-final [L]
  (into-array ^doubles (map double-array (repeat L (repeatedly L next-gaussian)))))

If I understand what's going on now, then it looks like the only way to make this any faster is if next-gaussian could return primitives.

The for and doseq macros seem pretty slow.

-Ranjit

On Sep 20, 3:30 pm, Jason Wolfe wrote:
> I think partition is slowing you down (but haven't profiled to
> verify). Here's a functional version that's about 70% as fast as my
> "5":
>
> (defn gaussian-matrix6 [L]
>   (to-array (for [i (range L)]
>               (into-array Double/TYPE (for [j (range L)] (next-gaussian))))))
>
> and I'd guess that's about as good as you're going to get, given that
> this approach is necessarily going to box and unbox the doubles, and
> create intermediate sequences, rather than stuffing the primitive
> doubles directly into the result array.
>
> -Jason
>
> On Sep 20, 12:00 pm, Ranjit wrote:
> > Replacing the doseq's with dotimes speeds it up a little more: [...]
Re: why the big difference in speed?
I think partition is slowing you down (but haven't profiled to verify). Here's a functional version that's about 70% as fast as my "5":

(defn gaussian-matrix6 [L]
  (to-array (for [i (range L)]
              (into-array Double/TYPE (for [j (range L)] (next-gaussian))))))

and I'd guess that's about as good as you're going to get, given that this approach is necessarily going to box and unbox the doubles, and create intermediate sequences, rather than stuffing the primitive doubles directly into the result array.

-Jason

On Sep 20, 12:00 pm, Ranjit wrote:
> Replacing the doseq's with dotimes speeds it up a little more:
>
> (defn gaussian-matrix5 [^"[[D" arr]
>   (dotimes [x (alength arr)]
>     (dotimes [y (alength (first arr))]
>       (aset-double ^doubles (aget arr (int x)) (int y) (next-gaussian)))))
>
> but I'm getting reflection warnings on alength. I guess it doesn't
> cause a problem because they're only called once?
>
> Also, adding type hints to the more functional version of my first
> attempt speeds things up quite a bit:
>
> (defn gaussian-matrix2 [L]
>   (into-array ^doubles
>     (map double-array (partition L (repeatedly (* L L) next-gaussian)))))
>
> But it's still about 4x slower than gaussian-matrix5 above. There must
> be a way to improve on the inner loop here that doesn't require using
> indices, right?
>
> On Sep 20, 12:32 pm, Jason Wolfe wrote:
> > Oops, I found aset-double2 with tab completion and figured it was
> > built-in. [...]
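The boxing cost described above can be sketched in Java: assigning into a Double[] autoboxes every element (one object allocation each), while double[] stores primitives flat. Illustrative only; the exact ratio depends on the JVM and warm-up.

```java
// Boxed vs primitive array fills: the Double[] path allocates an object
// per element via autoboxing, the double[] path writes primitives directly.
public class BoxingCost {
    static long fillPrimitive(double[] a) {
        long t0 = System.nanoTime();
        for (int i = 0; i < a.length; i++) a[i] = i * 0.5;
        return System.nanoTime() - t0;
    }

    static long fillBoxed(Double[] a) {
        long t0 = System.nanoTime();
        for (int i = 0; i < a.length; i++) a[i] = i * 0.5; // autoboxing allocates
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("primitive: " + fillPrimitive(new double[n]) / 1_000_000.0 + " ms");
        System.out.println("boxed:     " + fillBoxed(new Double[n]) / 1_000_000.0 + " ms");
    }
}
```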
Re: why the big difference in speed?
Replacing the doseq's with dotimes speeds it up a little more:

(defn gaussian-matrix5 [^"[[D" arr]
  (dotimes [x (alength arr)]
    (dotimes [y (alength (first arr))]
      (aset-double ^doubles (aget arr (int x)) (int y) (next-gaussian)))))

but I'm getting reflection warnings on alength. I guess it doesn't cause a problem because they're only called once?

Also, adding type hints to the more functional version of my first attempt speeds things up quite a bit:

(defn gaussian-matrix2 [L]
  (into-array ^doubles
    (map double-array (partition L (repeatedly (* L L) next-gaussian)))))

But it's still about 4x slower than gaussian-matrix5 above. There must be a way to improve on the inner loop here that doesn't require using indices, right?

On Sep 20, 12:32 pm, Jason Wolfe wrote:
> Oops, I found aset-double2 with tab completion and figured it was
> built-in. Forgot it was a utility I built some time ago, a stub for a
> Java method that does the setting.
>
> Also, I got the type hint for the "arr" arg wrong, although it didn't
> seem to matter.
>
> Here's a fixed version in standard Clojure that's basically as fast:
>
> user> (defn gaussian-matrix4 [^"[[D" arr ^int L]
>         (doseq [x (range L) y (range L)]
>           (aset-double ^doubles (aget arr (int x)) (int y) (.nextGaussian ^Random r))))
> #'user/gaussian-matrix4
> user> (do (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10))
>           (microbench (gaussian-matrix4 (make-array Double/TYPE 10 10) 10)))
> min; avg; max ms: 0.000 ; 0.033 ; 8.837 ( 56828 iterations)
> min; avg; max ms: 0.009 ; 0.038 ; 7.132 ( 50579 iterations)
>
> It seems like you should be able to just use aset-double with multiple
> indices (in place of aset-double2), but I can't seem to get the type
> hints right.
>
> -Jason
>
> On Sep 20, 7:36 am, Ranjit wrote:
> > Thanks Jason, this is great. [...]
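The access pattern gaussian-matrix5 is reaching for — fetch each inner row once as a primitive double[], then index it with plain ints — corresponds to this Java shape (class and method names are illustrative):

```java
import java.util.Random;

// Per-row hoisting: pull the inner double[] out of the double[][] once,
// then write to it with primitive int indices in the inner loop.
public class RowHoisting {
    static void fill(double[][] arr, Random r) {
        int L = arr.length;               // like (alength arr), computed once
        for (int x = 0; x < L; x++) {
            double[] row = arr[x];        // like (aget arr x) with a ^doubles hint
            for (int y = 0; y < row.length; y++) {
                row[y] = r.nextGaussian();
            }
        }
    }

    public static void main(String[] args) {
        double[][] m = new double[1000][1000];
        long t0 = System.nanoTime();
        fill(m, new Random());
        System.out.println((System.nanoTime() - t0) / 1_000_000 + " ms for 1000x1000");
    }
}
```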
Re: why the big difference in speed?
Oops, I found aset-double2 with tab completion and figured it was built-in. Forgot it was a utility I built some time ago, a stub for a Java method that does the setting.

Also, I got the type hint for the "arr" arg wrong, although it didn't seem to matter.

Here's a fixed version in standard Clojure that's basically as fast:

user> (defn gaussian-matrix4 [^"[[D" arr ^int L]
        (doseq [x (range L) y (range L)]
          (aset-double ^doubles (aget arr (int x)) (int y) (.nextGaussian ^Random r))))
#'user/gaussian-matrix4
user> (do (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10))
          (microbench (gaussian-matrix4 (make-array Double/TYPE 10 10) 10)))
min; avg; max ms: 0.000 ; 0.033 ; 8.837 ( 56828 iterations)
min; avg; max ms: 0.009 ; 0.038 ; 7.132 ( 50579 iterations)

It seems like you should be able to just use aset-double with multiple indices (in place of aset-double2), but I can't seem to get the type hints right.

-Jason

On Sep 20, 7:36 am, Ranjit wrote:
> Thanks Jason, this is great.
>
> I was confused earlier because I wasn't seeing reflection warnings,
> but it turns out that was only because I was evaluating the function
> definitions in the emacs buffer, and the warnings weren't visible.
>
> I have a question about gaussian-matrix3, though. What is "aset-double2"?
> Is that a macro that has a type hint for an array of doubles?
>
> Thanks,
>
> -Ranjit
>
> On Sep 19, 5:37 pm, Jason Wolfe wrote:
> > Hi Ranjit,
> >
> > The big perf differences you're seeing are due to reflective calls. [...]

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
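The microbench used in the transcripts above is Jason's own Clojure utility, not a standard function. A rough Java analogue of such a min/avg/max harness might look like this (illustrative only, not the actual tool): run a task repeatedly for a fixed time budget and report per-iteration timings.

```java
// Minimal min/avg/max microbenchmark harness, in the spirit of the
// microbench utility quoted in the thread. Results include JIT warm-up
// noise, so treat the numbers as rough.
public class Microbench {
    // Returns { min ms, avg ms, max ms, iteration count }.
    static double[] bench(Runnable task, long budgetNanos) {
        long min = Long.MAX_VALUE, max = 0, total = 0, iters = 0;
        long start = System.nanoTime();
        while (System.nanoTime() - start < budgetNanos) {
            long t0 = System.nanoTime();
            task.run();
            long dt = System.nanoTime() - t0;
            min = Math.min(min, dt);
            max = Math.max(max, dt);
            total += dt;
            iters++;
        }
        return new double[] { min / 1e6, total / 1e6 / iters, max / 1e6, iters };
    }

    public static void main(String[] args) {
        Runnable alloc = () -> { double[][] a = new double[100][100]; a[0][0] = 1.0; };
        double[] r = bench(alloc, 200_000_000L);   // 200 ms budget
        System.out.printf("min; avg; max ms: %.3f ; %.3f ; %.3f (%d iterations)%n",
                          r[0], r[1], r[2], (long) r[3]);
    }
}
```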
Re: why the big difference in speed?
Thanks Jason, this is great.

I was confused earlier because I wasn't seeing reflection warnings, but it turns out that was only because I was evaluating the function definitions in the emacs buffer, and the warnings weren't visible.

I have a question about gaussian-matrix3, though. What is "aset-double2"? Is that a macro that has a type hint for an array of doubles?

Thanks,

-Ranjit

On Sep 19, 5:37 pm, Jason Wolfe wrote:
> Hi Ranjit,
>
> The big perf differences you're seeing are due to reflective calls.
> Getting the Java array bits properly type-hinted is especially tricky,
> since you don't always get good reflection warnings.
>
> Note that aset is only fast for reference types:
>
> user> (doc aset)
> -------------------------
> clojure.core/aset
> ([array idx val] [array idx idx2 & idxv])
>   Sets the value at the index/indices. Works on Java arrays of
>   reference types. Returns val.
>
> So, if you want to speed things up ... here's your starting point:
>
> user> (set! *warn-on-reflection* true)
> true
> user> (import java.util.Random)
> (def r (Random.))
>
> (defn next-gaussian [] (.nextGaussian r))
>
> (defn gaussian-matrix1 [arr L]
>   (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian))))
>
> (defn gaussian-matrix2 [L]
>   (into-array (map double-array (partition L (repeatedly (* L L) next-gaussian)))))
>
> Reflection warning, NO_SOURCE_FILE:1 - reference to field nextGaussian
> can't be resolved.
>
> user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
>           (microbench (gaussian-matrix2 10)))
> min; avg; max ms: 2.944 ; 4.693 ; 34.643 ( 424 iterations)
> min; avg; max ms: 0.346 ; 0.567 ; 11.006 ( 3491 iterations)
>
> ;; Now, we can get rid of the reflection in next-gaussian:
>
> user> (defn next-gaussian [] (.nextGaussian #^Random r))
> #'user/next-gaussian
> user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
>           (microbench (gaussian-matrix2 10)))
> min; avg; max ms: 2.639 ; 4.194 ; 25.024 ( 475 iterations)
> min; avg; max ms: 0.068 ; 0.130 ; 10.766 ( 15104 iterations)
> nil
>
> ;; which has cut out the main bottleneck in gaussian-matrix2.
> ;; 1 is still slow because of its array handling.
> ;; Here's a fixed version:
>
> user> (defn gaussian-matrix3 [^doubles arr ^int L]
>         (doseq [x (range L) y (range L)]
>           (aset-double2 arr (int x) (int y) (.nextGaussian ^Random r))))
> #'user/gaussian-matrix3
>
> user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10))
>           (microbench (gaussian-matrix2 10))
>           (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10)))
> min; avg; max ms: 2.656 ; 4.164 ; 12.752 ( 479 iterations)
> min; avg; max ms: 0.065 ; 0.128 ; 9.712 ( 15255 iterations)
> min; avg; max ms: 0.000 ; 0.035 ; 10.180 ( 54618 iterations)
> nil
>
> ;; which is 100x faster than where we started.
>
> A profiler is often a great way to figure out what's eating up time.
> Personally, I've never found the need to use a disassembler.
>
> Cheers, Jason
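To see why the reflective calls Jason points to dominate, compare a direct call to Random.nextGaussian with one made through java.lang.reflect. This understates Clojure's reflective fallback, which also has to resolve the method at each call site, but it shows the shape of the cost (illustrative names):

```java
import java.lang.reflect.Method;
import java.util.Random;

// Direct vs reflective invocation of the same method. The reflective
// path pays for Method.invoke dispatch plus boxing the double result.
public class ReflectionCost {
    public static void main(String[] args) throws Exception {
        Random r = new Random();
        int n = 1_000_000;

        long t0 = System.nanoTime();
        double s1 = 0;
        for (int i = 0; i < n; i++) s1 += r.nextGaussian();     // direct call
        long direct = System.nanoTime() - t0;

        Method m = Random.class.getMethod("nextGaussian");
        t0 = System.nanoTime();
        double s2 = 0;
        for (int i = 0; i < n; i++) s2 += (Double) m.invoke(r); // reflective + boxed
        long reflective = System.nanoTime() - t0;

        System.out.println("direct:     " + direct / 1_000_000 + " ms");
        System.out.println("reflective: " + reflective / 1_000_000 + " ms");
    }
}
```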
Re: why the big difference in speed?
> But is the way Clojure works so opaque we need to see byte codes? > I was hoping someone on the list would have some intuition about > how the expressions get implemented. In fact I hope it's pretty > close to what I would naively guess from just reading the code > as is. One of the advantages of lisp is that there is a very clear connection between the code and the mental model of how the code is implemented. With a little experience it is easy to infer what machine language code will be generated. Once that becomes second nature it is easy to see what optimizations can be made (e.g. when to use declare). This becomes useful in numeric examples where you can get vastly improved code by adding declarations. If you look at the generated machine code you can compare it directly to, say, a fortran implementation of the same code. This is also true of Java code if you look at the generated byte code. I'm sure that Rich has a very strong sense of what the machine is doing underneath his code, otherwise why would there be a 32-trie optimization? How could he know if the transactions were correct? How would he find where the real optimizations matter? I know that Java will perform certain run-time optimizations but I consider these "part of the hardware of the JVM". The real CPU does similar optimizations, out-of-order execution, integer pipelining, cache-fetching, branch optimization, micro-op reordering, register renaming, etc. You can't depend on them from machine to machine. However, given the same javac compiler you can depend on the generated code being the same. For some optimizations see www.hpjava.org/pcrc/doc/rochester/rochester.pdf The way to develop intuition about Clojure is to see what the JVM will do. If you program for a living there should be no magic. I would guess, not having seen your code, that the most likely problem is that your numbers are boxed and you're spending time unboxing/reboxing. 
See http://www.bestinclass.dk/index.clj/2010/03/functional-fluid-dynamics-in-clojure.html He shows a couple of macros for aget! and aset! which might help. There was also a long thread about boxing based on Rich's fast numerics branch. I expect others will disagree about the magic, of course, and I'm not trying to start a flame-war. Some programmers I have worked with did not know what "byte codes" were and they were still able to generate code (curiously they tended to be "design pattern programmers" so there may be a connection). See http://dj-java-decompiler.software.informer.com/3.9 for a piece of software that can show the byte codes. There are many others. Tim Daly
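[Editor's note] Tim's boxing point can be illustrated directly: without type hints, every value pulled out of an untyped array is a java.lang.Double that must be unboxed, added, and reboxed. A minimal sketch (the function names are illustrative, not from the thread):

```clojure
;; Boxed: arr is untyped, so aget is resolved reflectively and the +
;; operates on boxed Doubles on every iteration.
(defn sum-boxed [arr n]
  (loop [i 0 s 0.0]
    (if (< i n)
      (recur (inc i) (+ s (aget arr i)))
      s)))

;; Unboxed: with the ^doubles hint the same loop compiles to primitive
;; double loads and adds, no reflection and no boxing.
(defn sum-prim [^doubles arr n]
  (loop [i (int 0) s (double 0.0)]
    (if (< i n)
      (recur (inc i) (+ s (aget arr i)))
      s)))
```

On a large double[] the hinted version is typically orders of magnitude faster, which is the kind of difference being chased throughout this thread.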
Re: why the big difference in speed?
Microbench is a little utility I wrote that executes a piece of code for a while to warm up, then records timings for as many runs as will fit in a preset time limit and reports statistics. So the numbers you see are min/avg/max milliseconds per run, and the number of runs that fit in 3 seconds of real time. Now that clojure.contrib.profile exists, you could use that to similar effect. But if you want my microbench, it's here http://gist.github.com/587199 (not the cleanest implementation, but it works). -Jason On Sep 19, 3:48 pm, Mike K wrote: > What is the definition of microbench?
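[Editor's note] The real implementation is at the gist linked above; a minimal sketch of the same idea (warm up, then time runs until a wall-clock budget expires) might look like:

```clojure
;; Minimal microbench sketch: warm the JIT, then time as many runs as
;; fit in ~3 s of real time and report min/avg/max in milliseconds.
(defmacro microbench [& body]
  `(let [_# (dotimes [_# 20] ~@body)          ; warm-up runs for HotSpot
         deadline# (+ (System/nanoTime) 3e9)] ; 3 s budget, in ns
     (loop [ts# []]
       (let [t0# (System/nanoTime)]
         ~@body
         (let [ts# (conj ts# (/ (- (System/nanoTime) t0#) 1e6))]
           (if (< (System/nanoTime) deadline#)
             (recur ts#)
             (println "min; avg; max ms:" (apply min ts#) ";"
                      (/ (apply + ts#) (count ts#)) ";"
                      (apply max ts#) "(" (count ts#) "iterations)")))))))
```

Because the body is spliced in directly rather than wrapped in a function call, the timings include no extra invocation overhead, which matters at the sub-microsecond scale seen in these results.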
Re: why the big difference in speed?
What is the definition of microbench?
Re: why the big difference in speed?
Hi Ranjit, The big perf differences you're seeing are due to reflective calls. Getting the Java array bits properly type-hinted is especially tricky, since you don't always get good reflection warnings. Note that aset is only fast for reference types: user> (doc aset) - clojure.core/aset ([array idx val] [array idx idx2 & idxv]) Sets the value at the index/indices. Works on Java arrays of reference types. Returns val. So, if you want to speed things up ... here's your starting point: user> (set! *warn-on-reflection* true) true user> (import java.util.Random) (def r (Random. )) (defn next-gaussian [] (.nextGaussian r)) (defn gaussian-matrix1 [arr L] (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) (defn gaussian-matrix2 [L] (into-array (map double-array (partition L (repeatedly (* L L) next-gaussian))))) Reflection warning, NO_SOURCE_FILE:1 - reference to field nextGaussian can't be resolved. user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10)) (microbench (gaussian-matrix2 10)) ) min; avg; max ms: 2.944 ; 4.693 ; 34.643 ( 424 iterations) min; avg; max ms: 0.346 ; 0.567 ; 11.006 ( 3491 iterations) ;; Now, we can get rid of the reflection in next-gaussian: user> (defn next-gaussian [] (.nextGaussian #^Random r)) #'user/next-gaussian user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10)) (microbench (gaussian-matrix2 10)) ) min; avg; max ms: 2.639 ; 4.194 ; 25.024 ( 475 iterations) min; avg; max ms: 0.068 ; 0.130 ; 10.766 ( 15104 iterations) nil ;; which has cut out the main bottleneck in gaussian-matrix2. ;; 1 is still slow because of its array handling.
;; here's a fixed version: user> (defn gaussian-matrix3 [^doubles arr ^int L] (doseq [x (range L) y (range L)] (aset-double2 arr (int x) (int y) (.nextGaussian ^Random r)))) #'user/gaussian-matrix3 user> (do (microbench (gaussian-matrix1 (make-array Double/TYPE 10 10) 10)) (microbench (gaussian-matrix2 10)) (microbench (gaussian-matrix3 (make-array Double/TYPE 10 10) 10)) ) min; avg; max ms: 2.656 ; 4.164 ; 12.752 ( 479 iterations) min; avg; max ms: 0.065 ; 0.128 ; 9.712 ( 15255 iterations) min; avg; max ms: 0.000 ; 0.035 ; 10.180 ( 54618 iterations) nil ;; which is 100x faster than where we started. A profiler is often a great way to figure out what's eating up time. Personally, I've never found the need to use a disassembler. Cheers, Jason
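[Editor's note] The archive clips the trailing parens in the snippets above. A completed, self-contained version of the final step might read as follows; note this sketch hints the 2-D array as ^"[[D" (since the argument is a double[][] rather than a double[]) and binds each row once, which differs slightly from the transcript but follows the same technique:

```clojure
(import 'java.util.Random)
(def r (Random.))

;; Completed sketch of gaussian-matrix3: hint the 2-D array as [[D,
;; fetch each row once as ^doubles, then use plain primitive aset.
(defn gaussian-matrix3 [^"[[D" arr L]
  (dotimes [x L]
    (let [^doubles row (aget arr x)]
      (dotimes [y L]
        (aset row y (.nextGaussian ^Random r))))))

;; usage: (gaussian-matrix3 (make-array Double/TYPE 10 10) 10)
```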
Re: why the big difference in speed?
I'm seeing a big difference in speed for each function run only once, so I guess any Hotspot optimization isn't happening, right? But is the way Clojure works so opaque we need to see byte codes? I was hoping someone on the list would have some intuition about how the expressions get implemented. In fact I hope it's pretty close to what I would naively guess from just reading the code as is. So I think gaussian-matrix1 is basically an imperative style program. There's a couple of loops and a random number is generated and then assigned to an element of the 2d array. And I think that gaussian-matrix2 is first generating a nested set of lists with all of the random numbers, then each list is copied into a 1d Java array of doubles by the map operation, and then into-array makes a 1d array of double[]'s. Is that about right, or is that too naive somehow? On Sep 19, 2:24 pm, Alessio Stalla wrote: > On 19 Set, 19:34, Tim Daly wrote: > > > In common lisp I use the (disassemble) function which generally > > gives back an assembler listing of the code that would be executed. > > Is there a Java function which will return the byte codes that get > > executed? > > In general there isn't. In the particular situation in which a) the > Lisp implementation controls class loading and b) each Lisp function > is compiled to a distinct Java class, the implementation can arrange > to store the bytecode for each function and run a Java bytecode > decompiler on it to disassemble/decompile it. > > > Could this be used to create a (disassemble) function for > > Clojure? Having such a function means that you don't have to guess > > what the program is actually doing. > > I think Clojure respects points a) and b) above when not using gen-class to compile multiple functions to a single Java class, so it > would be possible, but it requires support from the implementation, it > cannot be a library function.
> > Cheers, > Alessio
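[Editor's note] Ranjit's reading of gaussian-matrix2 above is correct, and can be checked step by step on a tiny input:

```clojure
;; gaussian-matrix2 unrolled for L = 2:
(def xs (repeatedly 4 rand))            ; lazy seq of 4 doubles
(def rows (partition 2 xs))             ; ((a b) (c d)) -- nested seqs
(def prim-rows (map double-array rows)) ; each row copied into a double[]
(def arr (into-array prim-rows))        ; a 1-d Java array of double[] rows
;; (class arr) => [[D, i.e. double[][]
```

So the only per-element work is one lazy-seq step plus one primitive copy, with no reflective aset at all, which is why it beats the doubly indexed gaussian-matrix1.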
Re: why the big difference in speed?
On 19 Set, 19:34, Tim Daly wrote: > In common lisp I use the (disassemble) function which generally > gives back an assembler listing of the code that would be executed. > Is there a Java function which will return the byte codes that get > executed? In general there isn't. In the particular situation in which a) the Lisp implementation controls class loading and b) each Lisp function is compiled to a distinct Java class, the implementation can arrange to store the bytecode for each function and run a Java bytecode decompiler on it to disassemble/decompile it. > Could this be used to create a (disassemble) function for > Clojure? Having such a function means that you don't have to guess > what the program is actually doing. I think Clojure respects points a) and b) above when not using gen-class to compile multiple functions to a single Java class, so it would be possible, but it requires support from the implementation, it cannot be a library function. Cheers, Alessio
Re: why the big difference in speed?
You can use "javap -c" out of the JDK to get the bytecodes that are handed to the VM. However, HotSpot does amazing things with the bytecodes after the code begins to run, so a disassembly can be quite misleading. I measure far more often than I view bytecode. Stu > In common lisp I use the (disassemble) function which generally > gives back an assembler listing of the code that would be executed. > Is there a Java function which will return the byte codes that get > executed? Could this be used to create a (disassemble) function for > Clojure? Having such a function means that you don't have to guess > what the program is actually doing. > > On 9/19/2010 1:19 PM, Ranjit wrote: >> Hi Nicholas, >> >> Thanks for the advice. I already tried (set! *warn-on-reflection* >> true), but it doesn't generate any warnings for any of these >> functions. >> >> I tried adding ^doubles in gaussian-matrix1 as you suggested but that >> didn't really change the speed at all. And changing next-gaussian to a >> macro didn't make much difference to gaussian-matrix1, and it doesn't >> work with gaussian-matrix2 at all and I'm not quite sure yet how to >> fix it so it does. >> >> I would have thought gaussian-matrix1 would be faster than gaussian- >> matrix2, but it turns out it's the opposite. Then I thought that the >> reason using aset was slower must have something to do with >> reflection, but that doesn't seem to be the case. >> >> So I'm a bit confused now. >> >> More generally though, the main reason I wanted to use Java arrays was >> because I have to use FFT's in my simulation and I figured it would be >> better to do everything in Java arrays rather than copy a Clojure >> vector into a 2d array and back every time I needed to do an FFT. But >> am I wrong in thinking that? >> >> Thanks, >> >> Ranjit >> >> >> >> On Sep 19, 11:20 am, Nicolas Oury wrote: >>> A first good start is to put >>> (set! 
*warn-on-relection* true) at the start of the file and removes >>> all reflective access. >>> >>> Before the 1.3 release, function cannot receive/returns primitive so >>> you might consider >>> (defmacro next-gaussian [] >>> `(.nextGaussian ^Random r)) >>> >>> (^Random is here to make sure r is seen with the right type) >>> >>> in both function, add ^doubles before arr at the binding point. >>> (defn ... [^doubles arr]) >>> >>> then set will not be rflective and be as fast as a set in java. >>> >>> Ask other questions of you need more help. >>> The best reference on all that: clojure.org/java_interop >>> >>> >>> >>> On Sun, Sep 19, 2010 at 3:29 PM, Ranjit wrote: Hi, I'm trying learn Clojure to see if I can use it in my simulations, and one thing I need to do is generate arrays of normally distributed numbers. I've been able to come up with the following two ways of doing this. gaussian-matrix2 is a lot faster than gaussian-matrix1, but I'm not sure why. And it's still slower than it should be I think. Is there anything I can do to speed this up still further? Thanks, -Ranjit (import java.util.Random) (def r (Random. )) (defn next-gaussian [] (.nextGaussian r)) (defn gaussian-matrix1 [arr L] (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian (defn gaussian-matrix2 [L] (into-array (map double-array (partition L (repeatedly (* L L) next-gaussian) -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en >>> -- >>> Sent from an IBM Model M, 15 August 1989. > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. 
Re: why the big difference in speed?
In common lisp I use the (disassemble) function which generally gives back an assembler listing of the code that would be executed. Is there a Java function which will return the byte codes that get executed? Could this be used to create a (disassemble) function for Clojure? Having such a function means that you don't have to guess what the program is actually doing. On 9/19/2010 1:19 PM, Ranjit wrote: Hi Nicholas, Thanks for the advice. I already tried (set! *warn-on-reflection* true), but it doesn't generate any warnings for any of these functions. I tried adding ^doubles in gaussian-matrix1 as you suggested but that didn't really change the speed at all. And changing next-gaussian to a macro didn't make much difference to gaussian-matrix1, and it doesn't work with gaussian-matrix2 at all and I'm not quite sure yet how to fix it so it does. I would have thought gaussian-matrix1 would be faster than gaussian- matrix2, but it turns out it's the opposite. Then I thought that the reason using aset was slower must have something to do with reflection, but that doesn't seem to be the case. So I'm a bit confused now. More generally though, the main reason I wanted to use Java arrays was because I have to use FFT's in my simulation and I figured it would be better to do everything in Java arrays rather than copy a Clojure vector into a 2d array and back every time I needed to do an FFT. But am I wrong in thinking that? Thanks, Ranjit On Sep 19, 11:20 am, Nicolas Oury wrote: A first good start is to put (set! *warn-on-relection* true) at the start of the file and removes all reflective access. Before the 1.3 release, function cannot receive/returns primitive so you might consider (defmacro next-gaussian [] `(.nextGaussian ^Random r)) (^Random is here to make sure r is seen with the right type) in both function, add ^doubles before arr at the binding point. (defn ... [^doubles arr]) then set will not be rflective and be as fast as a set in java. 
Ask other questions if you need more help. The best reference on all that: clojure.org/java_interop On Sun, Sep 19, 2010 at 3:29 PM, Ranjit wrote: Hi, I'm trying to learn Clojure to see if I can use it in my simulations, and one thing I need to do is generate arrays of normally distributed numbers. I've been able to come up with the following two ways of doing this. gaussian-matrix2 is a lot faster than gaussian-matrix1, but I'm not sure why. And it's still slower than it should be, I think. Is there anything I can do to speed this up still further? Thanks, -Ranjit (import java.util.Random) (def r (Random. )) (defn next-gaussian [] (.nextGaussian r)) (defn gaussian-matrix1 [arr L] (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) (defn gaussian-matrix2 [L] (into-array (map double-array (partition L (repeatedly (* L L) next-gaussian))))) -- Sent from an IBM Model M, 15 August 1989.
Re: why the big difference in speed?
Hi Nicholas, Thanks for the advice. I already tried (set! *warn-on-reflection* true), but it doesn't generate any warnings for any of these functions. I tried adding ^doubles in gaussian-matrix1 as you suggested but that didn't really change the speed at all. And changing next-gaussian to a macro didn't make much difference to gaussian-matrix1, and it doesn't work with gaussian-matrix2 at all and I'm not quite sure yet how to fix it so it does. I would have thought gaussian-matrix1 would be faster than gaussian-matrix2, but it turns out it's the opposite. Then I thought that the reason using aset was slower must have something to do with reflection, but that doesn't seem to be the case. So I'm a bit confused now. More generally though, the main reason I wanted to use Java arrays was because I have to use FFTs in my simulation and I figured it would be better to do everything in Java arrays rather than copy a Clojure vector into a 2d array and back every time I needed to do an FFT. But am I wrong in thinking that? Thanks, Ranjit On Sep 19, 11:20 am, Nicolas Oury wrote: > A first good start is to put > (set! *warn-on-reflection* true) at the start of the file and remove > all reflective access. > > Before the 1.3 release, functions cannot receive/return primitives so > you might consider > (defmacro next-gaussian [] > `(.nextGaussian ^Random r)) > > (^Random is here to make sure r is seen with the right type) > > In both functions, add ^doubles before arr at the binding point. > (defn ... [^doubles arr]) > > Then the aset will not be reflective and will be as fast as a set in Java. > > Ask other questions if you need more help. > The best reference on all that: clojure.org/java_interop > > > > On Sun, Sep 19, 2010 at 3:29 PM, Ranjit wrote: > > Hi, > > > I'm trying to learn Clojure to see if I can use it in my simulations, and > > one thing I need to do is generate arrays of normally distributed > > numbers. > > > I've been able to come up with the following two ways of doing this.
> > gaussian-matrix2 is a lot faster than gaussian-matrix1, but I'm not > > sure why. And it's still slower than it should be, I think. Is there > > anything I can do to speed this up still further? > > > Thanks, > > > -Ranjit > > > (import java.util.Random) > > (def r (Random. )) > > > (defn next-gaussian [] (.nextGaussian r)) > > > (defn gaussian-matrix1 [arr L] > > (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) > > > (defn gaussian-matrix2 [L] > > (into-array (map double-array (partition L (repeatedly (* L L) > > next-gaussian))))) > > -- > Sent from an IBM Model M, 15 August 1989.
Re: why the big difference in speed?
A first good start is to put (set! *warn-on-reflection* true) at the start of the file and remove all reflective access. Before the 1.3 release, functions cannot receive/return primitives, so you might consider (defmacro next-gaussian [] `(.nextGaussian ^Random r)) (^Random is here to make sure r is seen with the right type) In both functions, add ^doubles before arr at the binding point: (defn ... [^doubles arr]) Then the aset will not be reflective and will be as fast as a set in Java. Ask other questions if you need more help. The best reference on all that: clojure.org/java_interop On Sun, Sep 19, 2010 at 3:29 PM, Ranjit wrote: > Hi, > > I'm trying to learn Clojure to see if I can use it in my simulations, and > one thing I need to do is generate arrays of normally distributed > numbers. > > I've been able to come up with the following two ways of doing this. > gaussian-matrix2 is a lot faster than gaussian-matrix1, but I'm not > sure why. And it's still slower than it should be, I think. Is there > anything I can do to speed this up still further? > > Thanks, > > -Ranjit > > (import java.util.Random) > (def r (Random. )) > > (defn next-gaussian [] (.nextGaussian r)) > > (defn gaussian-matrix1 [arr L] > (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) > > (defn gaussian-matrix2 [L] > (into-array (map double-array (partition L (repeatedly (* L L) > next-gaussian))))) -- Sent from an IBM Model M, 15 August 1989.
why the big difference in speed?
Hi, I'm trying to learn Clojure to see if I can use it in my simulations, and one thing I need to do is generate arrays of normally distributed numbers. I've been able to come up with the following two ways of doing this. gaussian-matrix2 is a lot faster than gaussian-matrix1, but I'm not sure why. And it's still slower than it should be, I think. Is there anything I can do to speed this up still further? Thanks, -Ranjit (import java.util.Random) (def r (Random. )) (defn next-gaussian [] (.nextGaussian r)) (defn gaussian-matrix1 [arr L] (doseq [x (range L) y (range L)] (aset arr x y (next-gaussian)))) (defn gaussian-matrix2 [L] (into-array (map double-array (partition L (repeatedly (* L L) next-gaussian)))))