Re: Help speed up an inner loop?
On Tue, Aug 31, 2010 at 8:43 PM, tsuraan wrote: > >> In this situation, inlining (int 10) does not buy much. > > interesting; for me replacing the 10 with (int 10) brings my times > from 28.7ms to 19.6ms. I meant putting (int 10) instead of nl and a let. Anyway, it seems that we can get to java speed, but with some hard work. I have read somewhere than 1.3 will help a lot to achieve this performance with less work. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
> This one is quite good for me. > (defn countnl > [#^bytes buf] > (let [nl (int 10)] > (areduce buf idx count (int 0) > (if (== (int (aget buf idx)) nl) > (unchecked-inc count) > count > > > It appears that == is not resolved for bytes. So converting to int works fine. aha, converting to int works for me as well. With bytes it triggers the reflection warning and takes minutes to run. with the int conversion, it runs in an average of 28ms (just a little over java). Here's something really strange: I de-inlined the newline int, so my function looks like this: (defn countnl [#^bytes buf] (let [nl (int 10)] (areduce buf idx count (int 0) (if (== (int (aget buf idx)) nl) (unchecked-add count (int 1)) count On my machine, this is running in 19.6ms (after the warmup it's really constant). That's the exact speed as the java code, which is really cool. it actually works the same way if you just replace the inlined "10" with "(int 10)", and it makes it just as fast as the explicit looping code below. Somehow on my earlier tests the let binding was really slowing things down, but in this latest function it seems to have no effect. weird... > In this situation, inlining (int 10) does not buy much. interesting; for me replacing the 10 with (int 10) brings my times from 28.7ms to 19.6ms. > This one, though, is even faster and should be not far from the java > one, if you want to try it. > > (defn countnl > [#^bytes buf] > (let [nl (int 10) > n (int (alength buf))] > (loop [idx (int n) count (int 0)] > (if (zero? idx) > count > (let [idx (unchecked-dec idx)] > (if (== (int (aget buf idx)) nl) > (recur idx (unchecked-inc count)) > (recur idx count))) > > It would be interesting to know why it is nearly twice as fast as the > areduce version on my computer. That runs in 21.3ms for me, which actually makes it a tiny bit slower than the code above (and slightly slower than the java code). -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
This one is quite good for me. (defn countnl [#^bytes buf] (let [nl (int 10)] (areduce buf idx count (int 0) (if (== (int (aget buf idx)) nl) (unchecked-inc count) count It appears that == is not resolved for bytes. So converting to int works fine. In this situation, inlining (int 10) does not buy much. This one, though, is even faster and should be not far from the java one, if you want to try it. (defn countnl [#^bytes buf] (let [nl (int 10) n (int (alength buf))] (loop [idx (int n) count (int 0)] (if (zero? idx) count (let [idx (unchecked-dec idx)] (if (== (int (aget buf idx)) nl) (recur idx (unchecked-inc count)) (recur idx count))) It would be interesting to know why it is nearly twice as fast as the areduce version on my computer. Best, Nicolas. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
> Replace also (unchecked-add count 1) with (unchecked-add count (int 1)) > > (this should get easier in 1.3) That didn't change anything for my tests, but this code: (defn countnl [#^bytes buf] (areduce buf idx count (int 0) (if (= (aget buf idx) 10) (unchecked-add count 1) count))) does 34ms runs (the .java code does 20ms runs on the same machine). The biggest things that make a difference are replacing (inc count) with (unchecked-add count 1), initializing count to (int 0) rather than 0, and getting rid of the (let [nl (byte 0)]...). I was a bit disappointed that the let statement was so expensive, since I really prefer seeing a variable to a magic number. Has somebody written a macro that can be used to rewrite code replacing variable references with their constant values? It would be of limited use (constants only), but it would make the code prettier :) For posterity, here are some variations on countnl and their timings on my machine: all runs are invoked with this command: "java -server -cp clojure.jar clojure.main iterate.clj"; clojure.jar is from the clojure-1.2.zip distribution (defn countnl [#^bytes buf] (let [nl (byte 10)] (areduce buf idx count 0 (if (= (aget buf idx) nl) (inc count) count Wanted 16777216 got 16777216 bytes "Elapsed time: 184.172834 msecs" Got 65875 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 193.546018 msecs" Got 64995 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 142.546453 msecs" Got 65677 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 135.843933 msecs" Got 66117 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.882156 msecs" Got 65449 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.864881 msecs" Got 65393 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.854118 msecs" Got 65065 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.82848 msecs" Got 65346 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.803702 msecs" Got 65783 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 134.832489 msecs" Got 65566 nls (defn countnl [#^bytes buf] (let [nl (byte 10)] (areduce buf idx count (int 0) (if (= (aget buf idx) nl) (inc count) count Wanted 16777216 got 16777216 bytes "Elapsed time: 238.409697 msecs" Got 65698 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 208.872818 msecs" Got 65381 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.786986 msecs" Got 65358 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.76816 msecs" Got 65686 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.899431 msecs" Got 65683 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.783275 msecs" Got 65491 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.735416 msecs" Got 65749 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.762987 msecs" Got 65837 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.749591 msecs" Got 65759 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 88.72971 msecs" Got 65366 nls (defn countnl [#^bytes buf] (areduce buf idx count (int 0) (if (= (aget buf idx) 10) (inc count) count))) Wanted 16777216 got 16777216 bytes "Elapsed time: 185.465613 msecs" Got 65344 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 186.856418 msecs" Got 65529 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.778183 msecs" Got 65662 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.675509 msecs" Got 65034 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.674777 msecs" Got 65459 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.630553 msecs" Got 65172 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.668685 msecs" Got 64939 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.74801 msecs" Got 65462 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.699187 msecs" Got 65602 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 73.611555 msecs" Got 64937 nls (defn countnl [#^bytes buf] (areduce buf idx count (int 0) (if (= (aget buf idx) 10) (unchecked-add count 1) count))) Wanted 16777216 got 16777216 bytes "Elapsed time: 82.865761 msecs" Got 65266 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 34.228646 msecs" Got 65522 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.989741 msecs" Got 65537 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.947045 msecs" Got 65688 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.941481 msecs" Got 65395 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.952149 msecs" Got 65822 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.972258 msecs" Got 65495 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.903989 msecs" Got 65244 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 33.927276 msecs" Got 65331 nls Wanted 1
Re: Help speed up an inner loop?
Replace also (unchecked-add count 1) with (unchecked-add count (int 1)) (this should get easier in 1.3) On Tue, Aug 31, 2010 at 4:20 PM, tsuraan wrote: >> (defn countnl-lite >> [#^bytes buf] >> (areduce buf idx count (int 0) >> (if (= (clojure.lang.RT/aget buf idx) 10) >> (unchecked-add count 1) >> count))) >> >> Key points are initializing count to a primitive integer and directly >> calling clojure's aget to avoid an unnecessary integer cast. > > Are you using clojure 1.2? If I try to set count to be (int 0) rather > than 0, I get this error: > > Exception in thread "main" java.lang.RuntimeException: > java.lang.IllegalArgumentException: recur arg for primitive local: > count must be matching primitive > > Even if I replace "inc" with the unchecked-add, or "cast" the result > of the inc to be "int" (replace (inc count) with (int (inc count)) ) > it gives me that error. Sort of strange... > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with your > first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
> (defn countnl-lite > [#^bytes buf] > (areduce buf idx count (int 0) > (if (= (clojure.lang.RT/aget buf idx) 10) > (unchecked-add count 1) > count))) > > Key points are initializing count to a primitive integer and directly > calling clojure's aget to avoid an unnecessary integer cast. Are you using clojure 1.2? If I try to set count to be (int 0) rather than 0, I get this error: Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: recur arg for primitive local: count must be matching primitive Even if I replace "inc" with the unchecked-add, or "cast" the result of the inc to be "int" (replace (inc count) with (int (inc count)) ) it gives me that error. Sort of strange... -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
On Tue, Aug 31, 2010 at 1:53 PM, Nicolas Oury wrote: > I am not convince that (make-array Byte/TYPE *numbytes*) creates an > array of primitives. Actually it does, sorry for the noise. Should check before sending emails. Best, Nicolas. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
I am not convince that (make-array Byte/TYPE *numbytes*) creates an array of primitives. And I think byte [] is an array of primitives. That would make a difference. I don't know if clojure has a byte-array. It seems that there is no byte-array as there is int-array. Could you try your code with int-array, or could you just write two static methods in a Java class to read and write an int from a byte array? Best, Nicolas. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
Hi, On 31 Aug., 08:46, Robert McIntyre wrote: > Without AOT compilation countnl-lite takes around 66 msecs > With AOT compilation countnl-lite takes ~46 msecs Did you measure start-up time in your runs? AOT compilation should have no impact on the runtime speed. Sincerely Meikel -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Help speed up an inner loop?
Consider trying to use "==" in place of where you have "=", which can be faster when comparing numbers for equality. Source for this and a few other performance tips: http://gnuvince.wordpress.com/2009/05/11/clojure-performance-tips/ Andy On Mon, Aug 30, 2010 at 11:46 PM, Robert McIntyre wrote: > Ah, I see that I was mistaken about the timing. Sorry about that. > > After a lot of fiddling around, I cam up with this faster form: > > (defn countnl-lite > [#^bytes buf] > (areduce buf idx count (int 0) > (if (= (clojure.lang.RT/aget buf idx) 10) > (unchecked-add count 1) > count))) > > Key points are initializing count to a primitive integer and directly > calling clojure's aget to avoid an unnecessary integer cast. > > On my system: > > The unmodified countnl function takes ~ 180 msecs > > Without AOT compilation countnl-lite takes around 66 msecs > > With AOT compilation countnl-lite takes ~46 msecs > > The java method takes ~19 msecs. > > I've lost a factor of 2.25 somewhere and it makes me sad that I can't find > it. > I would be very interested if anyone could improve countnl-lite. > > --Robert McIntyre > > > > On Mon, Aug 30, 2010 at 8:41 PM, Alan wrote: > > I think this misses the point. Of course java, c, and clojure will all > > have roughly the same wall-clock time for this program, since it is > > dominated by the I/O. You can even see that in the output from $ time > > java Iterate: less than 0.5s was spent in user space, the rest was > > spent in system code - that is, mostly doing I/O. > > > > The java version is a second faster as counted by the wall clock, and > > this is unlikely to be a coincidence: tsuraan's timing data suggests > > that the clojure program takes 80ms longer in each loop, and loops 10 > > times. That comes out to 0.8 seconds, which is quite close to the > > differential you observed when timing from the command line. > > > > On Aug 30, 1:38 pm, Robert McIntyre wrote: > >> I don't know what the heck is going here, but ignore the time the > >> program is reporting and just > >> pay attention to how long it actually takes wall-clock style and > >> you'll see that your clojure and > >> java programs already take the same time. > >> > >> Here are my findings: > >> > >> I saved Iterate.java into my rlm package and ran: > >> time java -server rlm.Iterate > >> > >> results: > >> time java -server rlm.Iterate > >> Wanted 16777216 got 16777216 bytes > >> counted 65341 nls in 27 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65310 nls in 27 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 66026 nls in 21 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65473 nls in 19 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65679 nls in 19 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65739 nls in 19 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65310 nls in 21 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65810 nls in 18 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65531 nls in 21 msec > >> Wanted 16777216 got 16777216 bytes > >> counted 65418 nls in 21 msec > >> > >> real0m27.469s > >> user0m0.472s > >> sys 0m26.638s > >> > >> I wrapped the last bunch of commands in your clojure script into a > >> (run) function: > >> (defn run [] > >> (let [ifs (FileInputStream. "/dev/urandom") > >> buf (make-array Byte/TYPE *numbytes*)] > >> (dotimes [_ 10] > >> (let [sz (.read ifs buf)] > >> (println "Wanted" *numbytes* "got" sz "bytes") > >> (let [count (time (countnl buf))] > >> (println "Got" count "nls")) > >> > >> and ran > >> (time (run)) at the repl: > >> > >> (time (run)) > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.081975 msecs" > >> Got 65894 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.001814 msecs" > >> Got 65949 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.061934 msecs" > >> Got 65603 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.031131 msecs" > >> Got 65563 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.122567 msecs" > >> Got 65696 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 182.968066 msecs" > >> Got 65546 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.058508 msecs" > >> Got 65468 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 182.932395 msecs" > >> Got 65872 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 183.074646 msecs" > >> Got 65498 nls > >> Wanted 16777216 got 16777216 bytes > >> "Elapsed time: 187.733636 msecs" > >> Got 65434 nls > >> "Elapsed time: 28510.331507 msecs" > >> nil > >> > >> Total running time for both programs is around 28 seconds. > >> The java program seems to be incorrectly reporting it's time. > >> > >> --Robert McIntyre > >> > >> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan wrote: > >> > Just to try
Re: Help speed up an inner loop?
Ah, I see that I was mistaken about the timing. Sorry about that. After a lot of fiddling around, I cam up with this faster form: (defn countnl-lite [#^bytes buf] (areduce buf idx count (int 0) (if (= (clojure.lang.RT/aget buf idx) 10) (unchecked-add count 1) count))) Key points are initializing count to a primitive integer and directly calling clojure's aget to avoid an unnecessary integer cast. On my system: The unmodified countnl function takes ~ 180 msecs Without AOT compilation countnl-lite takes around 66 msecs With AOT compilation countnl-lite takes ~46 msecs The java method takes ~19 msecs. I've lost a factor of 2.25 somewhere and it makes me sad that I can't find it. I would be very interested if anyone could improve countnl-lite. --Robert McIntyre On Mon, Aug 30, 2010 at 8:41 PM, Alan wrote: > I think this misses the point. Of course java, c, and clojure will all > have roughly the same wall-clock time for this program, since it is > dominated by the I/O. You can even see that in the output from $ time > java Iterate: less than 0.5s was spent in user space, the rest was > spent in system code - that is, mostly doing I/O. > > The java version is a second faster as counted by the wall clock, and > this is unlikely to be a coincidence: tsuraan's timing data suggests > that the clojure program takes 80ms longer in each loop, and loops 10 > times. That comes out to 0.8 seconds, which is quite close to the > differential you observed when timing from the command line. > > On Aug 30, 1:38 pm, Robert McIntyre wrote: >> I don't know what the heck is going here, but ignore the time the >> program is reporting and just >> pay attention to how long it actually takes wall-clock style and >> you'll see that your clojure and >> java programs already take the same time. >> >> Here are my findings: >> >> I saved Iterate.java into my rlm package and ran: >> time java -server rlm.Iterate >> >> results: >> time java -server rlm.Iterate >> Wanted 16777216 got 16777216 bytes >> counted 65341 nls in 27 msec >> Wanted 16777216 got 16777216 bytes >> counted 65310 nls in 27 msec >> Wanted 16777216 got 16777216 bytes >> counted 66026 nls in 21 msec >> Wanted 16777216 got 16777216 bytes >> counted 65473 nls in 19 msec >> Wanted 16777216 got 16777216 bytes >> counted 65679 nls in 19 msec >> Wanted 16777216 got 16777216 bytes >> counted 65739 nls in 19 msec >> Wanted 16777216 got 16777216 bytes >> counted 65310 nls in 21 msec >> Wanted 16777216 got 16777216 bytes >> counted 65810 nls in 18 msec >> Wanted 16777216 got 16777216 bytes >> counted 65531 nls in 21 msec >> Wanted 16777216 got 16777216 bytes >> counted 65418 nls in 21 msec >> >> real 0m27.469s >> user 0m0.472s >> sys 0m26.638s >> >> I wrapped the last bunch of commands in your clojure script into a >> (run) function: >> (defn run [] >> (let [ifs (FileInputStream. "/dev/urandom") >> buf (make-array Byte/TYPE *numbytes*)] >> (dotimes [_ 10] >> (let [sz (.read ifs buf)] >> (println "Wanted" *numbytes* "got" sz "bytes") >> (let [count (time (countnl buf))] >> (println "Got" count "nls")) >> >> and ran >> (time (run)) at the repl: >> >> (time (run)) >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.081975 msecs" >> Got 65894 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.001814 msecs" >> Got 65949 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.061934 msecs" >> Got 65603 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.031131 msecs" >> Got 65563 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.122567 msecs" >> Got 65696 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 182.968066 msecs" >> Got 65546 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.058508 msecs" >> Got 65468 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 182.932395 msecs" >> Got 65872 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 183.074646 msecs" >> Got 65498 nls >> Wanted 16777216 got 16777216 bytes >> "Elapsed time: 187.733636 msecs" >> Got 65434 nls >> "Elapsed time: 28510.331507 msecs" >> nil >> >> Total running time for both programs is around 28 seconds. >> The java program seems to be incorrectly reporting it's time. >> >> --Robert McIntyre >> >> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan wrote: >> > Just to try to see if clojure is a practical language for doing >> > byte-level work (parsing files, network streams, etc), I wrote a >> > trivial function to iterate through a buffer of bytes and count all >> > the newlines that it sees. For my testing, I've written a C version, >> > a Java version, and a Clojure version. I'm running each routine 10 >> > times over a 16MB buffer read from /dev/urandom (the buffer is >> > refreshed between each call to the newline counting function). With >> > gcc -O0, I get about 80ms per 16MB buffer. With gcc -O3, I get ~14ms >
Re: Help speed up an inner loop?
I think this misses the point. Of course java, c, and clojure will all have roughly the same wall-clock time for this program, since it is dominated by the I/O. You can even see that in the output from $ time java Iterate: less than 0.5s was spent in user space, the rest was spent in system code - that is, mostly doing I/O. The java version is a second faster as counted by the wall clock, and this is unlikely to be a coincidence: tsuraan's timing data suggests that the clojure program takes 80ms longer in each loop, and loops 10 times. That comes out to 0.8 seconds, which is quite close to the differential you observed when timing from the command line. On Aug 30, 1:38 pm, Robert McIntyre wrote: > I don't know what the heck is going here, but ignore the time the > program is reporting and just > pay attention to how long it actually takes wall-clock style and > you'll see that your clojure and > java programs already take the same time. > > Here are my findings: > > I saved Iterate.java into my rlm package and ran: > time java -server rlm.Iterate > > results: > time java -server rlm.Iterate > Wanted 16777216 got 16777216 bytes > counted 65341 nls in 27 msec > Wanted 16777216 got 16777216 bytes > counted 65310 nls in 27 msec > Wanted 16777216 got 16777216 bytes > counted 66026 nls in 21 msec > Wanted 16777216 got 16777216 bytes > counted 65473 nls in 19 msec > Wanted 16777216 got 16777216 bytes > counted 65679 nls in 19 msec > Wanted 16777216 got 16777216 bytes > counted 65739 nls in 19 msec > Wanted 16777216 got 16777216 bytes > counted 65310 nls in 21 msec > Wanted 16777216 got 16777216 bytes > counted 65810 nls in 18 msec > Wanted 16777216 got 16777216 bytes > counted 65531 nls in 21 msec > Wanted 16777216 got 16777216 bytes > counted 65418 nls in 21 msec > > real 0m27.469s > user 0m0.472s > sys 0m26.638s > > I wrapped the last bunch of commands in your clojure script into a > (run) function: > (defn run [] > (let [ifs (FileInputStream. "/dev/urandom") > buf (make-array Byte/TYPE *numbytes*)] > (dotimes [_ 10] > (let [sz (.read ifs buf)] > (println "Wanted" *numbytes* "got" sz "bytes") > (let [count (time (countnl buf))] > (println "Got" count "nls")) > > and ran > (time (run)) at the repl: > > (time (run)) > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.081975 msecs" > Got 65894 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.001814 msecs" > Got 65949 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.061934 msecs" > Got 65603 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.031131 msecs" > Got 65563 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.122567 msecs" > Got 65696 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 182.968066 msecs" > Got 65546 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.058508 msecs" > Got 65468 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 182.932395 msecs" > Got 65872 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 183.074646 msecs" > Got 65498 nls > Wanted 16777216 got 16777216 bytes > "Elapsed time: 187.733636 msecs" > Got 65434 nls > "Elapsed time: 28510.331507 msecs" > nil > > Total running time for both programs is around 28 seconds. > The java program seems to be incorrectly reporting it's time. > > --Robert McIntyre > > On Mon, Aug 30, 2010 at 4:03 PM, tsuraan wrote: > > Just to try to see if clojure is a practical language for doing > > byte-level work (parsing files, network streams, etc), I wrote a > > trivial function to iterate through a buffer of bytes and count all > > the newlines that it sees. For my testing, I've written a C version, > > a Java version, and a Clojure version. I'm running each routine 10 > > times over a 16MB buffer read from /dev/urandom (the buffer is > > refreshed between each call to the newline counting function). With > > gcc -O0, I get about 80ms per 16MB buffer. With gcc -O3, I get ~14ms > > per buffer. With javac (and java -server) I get 20ms per 16MB buffer. > > With clojure, I get 105ms per buffer (after the jvm warms up). I'm > > guessing that the huge boost that java and gcc -O3 get is from > > converting per-byte operations to per-int ops; at least that ~4x boost > > looks like it would come from something like that. Is that an > > optimization that is unavailable to clojure? The java_interop doc > > makes it sound like java and clojure get the exact same bytecode when > > using areduce correctly, so maybe there's something I could be doing > > better. Here are my small programs; if somebody could suggest > > improvements, I'd appreciate them. > > > iterate.clj: > > > (set! *warn-on-reflection* true) > > (import java.io.FileInputStream) > > > (def *numbytes* (* 16 1024 1024)) > > > (defn countnl > > [#^bytes buf] > > (let [nl (byte 10)] > > (areduce buf idx count 0 > > (if (= (aget buf idx) nl) > > (inc count) > >
Re: Help speed up an inner loop?
I don't know what the heck is going here, but ignore the time the program is reporting and just pay attention to how long it actually takes wall-clock style and you'll see that your clojure and java programs already take the same time. Here are my findings: I saved Iterate.java into my rlm package and ran: time java -server rlm.Iterate results: time java -server rlm.Iterate Wanted 16777216 got 16777216 bytes counted 65341 nls in 27 msec Wanted 16777216 got 16777216 bytes counted 65310 nls in 27 msec Wanted 16777216 got 16777216 bytes counted 66026 nls in 21 msec Wanted 16777216 got 16777216 bytes counted 65473 nls in 19 msec Wanted 16777216 got 16777216 bytes counted 65679 nls in 19 msec Wanted 16777216 got 16777216 bytes counted 65739 nls in 19 msec Wanted 16777216 got 16777216 bytes counted 65310 nls in 21 msec Wanted 16777216 got 16777216 bytes counted 65810 nls in 18 msec Wanted 16777216 got 16777216 bytes counted 65531 nls in 21 msec Wanted 16777216 got 16777216 bytes counted 65418 nls in 21 msec real0m27.469s user0m0.472s sys 0m26.638s I wrapped the last bunch of commands in your clojure script into a (run) function: (defn run [] (let [ifs (FileInputStream. "/dev/urandom") buf (make-array Byte/TYPE *numbytes*)] (dotimes [_ 10] (let [sz (.read ifs buf)] (println "Wanted" *numbytes* "got" sz "bytes") (let [count (time (countnl buf))] (println "Got" count "nls")) and ran (time (run)) at the repl: (time (run)) Wanted 16777216 got 16777216 bytes "Elapsed time: 183.081975 msecs" Got 65894 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.001814 msecs" Got 65949 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.061934 msecs" Got 65603 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.031131 msecs" Got 65563 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.122567 msecs" Got 65696 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 182.968066 msecs" Got 65546 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.058508 msecs" Got 65468 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 182.932395 msecs" Got 65872 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 183.074646 msecs" Got 65498 nls Wanted 16777216 got 16777216 bytes "Elapsed time: 187.733636 msecs" Got 65434 nls "Elapsed time: 28510.331507 msecs" nil Total running time for both programs is around 28 seconds. The java program seems to be incorrectly reporting it's time. --Robert McIntyre On Mon, Aug 30, 2010 at 4:03 PM, tsuraan wrote: > Just to try to see if clojure is a practical language for doing > byte-level work (parsing files, network streams, etc), I wrote a > trivial function to iterate through a buffer of bytes and count all > the newlines that it sees. For my testing, I've written a C version, > a Java version, and a Clojure version. I'm running each routine 10 > times over a 16MB buffer read from /dev/urandom (the buffer is > refreshed between each call to the newline counting function). With > gcc -O0, I get about 80ms per 16MB buffer. With gcc -O3, I get ~14ms > per buffer. With javac (and java -server) I get 20ms per 16MB buffer. > With clojure, I get 105ms per buffer (after the jvm warms up). I'm > guessing that the huge boost that java and gcc -O3 get is from > converting per-byte operations to per-int ops; at least that ~4x boost > looks like it would come from something like that. Is that an > optimization that is unavailable to clojure? The java_interop doc > makes it sound like java and clojure get the exact same bytecode when > using areduce correctly, so maybe there's something I could be doing > better. Here are my small programs; if somebody could suggest > improvements, I'd appreciate them. > > iterate.clj: > > (set! *warn-on-reflection* true) > (import java.io.FileInputStream) > > (def *numbytes* (* 16 1024 1024)) > > (defn countnl > [#^bytes buf] > (let [nl (byte 10)] > (areduce buf idx count 0 > (if (= (aget buf idx) nl) > (inc count) > count > > (let [ifs (FileInputStream. "/dev/urandom") > buf (make-array Byte/TYPE *numbytes*)] > (dotimes [_ 10] > (let [sz (.read ifs buf)] > (println "Wanted" *numbytes* "got" sz "bytes") > (let [count (time (countnl buf))] > (println "Got" count "nls") > > > Iterate.java: > > import java.io.FileInputStream; > > class Iterate > { > static final int NUMBYTES = 16*1024*1024; > > static int countnl(byte[] buf) > { > int count = 0; > for(int i = 0; i < buf.length; i++) { > if(buf[i] == '\n') { > count++; > } > } > return count; > } > > public static final void main(String[] args) > throws Throwable > { > FileInputStream input = new FileInputStream("/dev/urandom"); > byte[] buf = new byte[NUMBYTES]; > int sz; > long start, end; > > for(int i = 0; i < 10; i++) { > sz = input.read(buf); > System.out.println("Wanted " +