Re: Help speed up an inner loop?

2010-08-31 Thread Nicolas Oury
On Tue, Aug 31, 2010 at 8:43 PM, tsuraan  wrote:
>
>> In this situation, inlining (int 10) does not buy much.
>
> interesting; for me replacing the 10 with (int 10) brings my times
> from 28.7ms to 19.6ms.

I meant putting (int 10) instead of nl and a let.

Anyway, it seems that we can get to java speed, but with some hard work.

I have read somewhere than 1.3 will help a lot to achieve this
performance with less work.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread tsuraan
> This one is quite good for me.
> (defn countnl
>  [#^bytes buf]
>  (let [nl (int 10)]
>   (areduce buf idx count (int 0)
>            (if (== (int (aget buf idx)) nl)
>              (unchecked-inc count)
>              count
>
>
> It appears that == is not resolved for bytes. So converting to int works fine.

aha, converting to int works for me as well.  With bytes it triggers
the reflection warning and takes minutes to run.  with the int
conversion, it runs in an average of 28ms (just a little over java).
Here's something really strange: I de-inlined the newline int, so my
function looks like this:

(defn countnl
  [#^bytes buf]
  (let [nl (int 10)]
(areduce buf idx count (int 0)
 (if (== (int (aget buf idx)) nl)
   (unchecked-add count (int 1))
   count

On my machine, this is running in 19.6ms (after the warmup it's really
constant).  That's the exact speed as the java code, which is really
cool.  it actually works the same way if you just replace the inlined
"10" with "(int 10)", and it makes it just as fast as the explicit
looping code below.  Somehow on my earlier tests the let binding was
really slowing things down, but in this latest function it seems to
have no effect.  weird...

> In this situation, inlining (int 10) does not buy much.

interesting; for me replacing the 10 with (int 10) brings my times
from 28.7ms to 19.6ms.

> This one, though, is even faster and should be not far from the java
> one, if you want to try it.
>
> (defn countnl
>  [#^bytes buf]
>  (let [nl (int 10)
>        n (int (alength buf))]
>   (loop [idx (int n) count (int 0)]
>         (if (zero? idx)
>             count
>             (let [idx (unchecked-dec idx)]
>               (if (== (int (aget buf idx)) nl)
>                 (recur  idx (unchecked-inc count))
>                 (recur  idx count)))
>
> It would be interesting to know why it is nearly twice as fast as the
> areduce version on my computer.

That runs in 21.3ms for me, which actually makes it a tiny bit slower
than the code above (and slightly slower than the java code).

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread Nicolas Oury
This one is quite good for me.
(defn countnl
 [#^bytes buf]
 (let [nl (int 10)]
   (areduce buf idx count (int 0)
(if (== (int (aget buf idx)) nl)
  (unchecked-inc count)
  count


It appears that == is not resolved for bytes. So converting to int works fine.

In this situation, inlining (int 10) does not buy much.

This one, though, is even faster and should be not far from the java
one, if you want to try it.

(defn countnl
 [#^bytes buf]
 (let [nl (int 10)
n (int (alength buf))]
   (loop [idx (int n) count (int 0)]
 (if (zero? idx)
 count
 (let [idx (unchecked-dec idx)]
   (if (== (int (aget buf idx)) nl)
 (recur  idx (unchecked-inc count))
 (recur  idx count)))

It would be interesting to know why it is nearly twice as fast as the
areduce version on my computer.

Best,

Nicolas.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread tsuraan
> Replace also  (unchecked-add count 1) with  (unchecked-add count (int 1))
>
> (this should get easier in 1.3)

That didn't change anything for my tests, but this code:

(defn countnl
  [#^bytes buf]
  (areduce buf idx count (int 0)
   (if (= (aget buf idx) 10)
 (unchecked-add count 1)
 count)))

does 34ms runs (the .java code does 20ms runs on the same machine).
The biggest things that make a difference are replacing (inc count)
with (unchecked-add count 1), initializing count to (int 0) rather
than 0, and getting rid of the (let [nl (byte 0)]...).  I was a bit
disappointed that the let statement was so expensive, since I really
prefer seeing a variable to a magic number.  Has somebody written a
macro that can be used to rewrite code replacing variable references
with their constant values?  It would be of limited use (constants
only), but it would make the code prettier :)  For posterity, here are
some variations on countnl and their timings on my machine:

all runs are invoked with this command: "java -server -cp clojure.jar
clojure.main iterate.clj"; clojure.jar is from the clojure-1.2.zip
distribution


(defn countnl
  [#^bytes buf]
  (let [nl (byte 10)]
(areduce buf idx count 0
 (if (= (aget buf idx) nl)
   (inc count)
   count

Wanted 16777216 got 16777216 bytes
"Elapsed time: 184.172834 msecs"
Got 65875 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 193.546018 msecs"
Got 64995 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 142.546453 msecs"
Got 65677 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 135.843933 msecs"
Got 66117 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.882156 msecs"
Got 65449 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.864881 msecs"
Got 65393 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.854118 msecs"
Got 65065 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.82848 msecs"
Got 65346 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.803702 msecs"
Got 65783 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 134.832489 msecs"
Got 65566 nls

(defn countnl
  [#^bytes buf]
  (let [nl (byte 10)]
(areduce buf idx count (int 0)
 (if (= (aget buf idx) nl)
   (inc count)
   count

Wanted 16777216 got 16777216 bytes
"Elapsed time: 238.409697 msecs"
Got 65698 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 208.872818 msecs"
Got 65381 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.786986 msecs"
Got 65358 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.76816 msecs"
Got 65686 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.899431 msecs"
Got 65683 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.783275 msecs"
Got 65491 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.735416 msecs"
Got 65749 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.762987 msecs"
Got 65837 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.749591 msecs"
Got 65759 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 88.72971 msecs"
Got 65366 nls

(defn countnl
  [#^bytes buf]
  (areduce buf idx count (int 0)
   (if (= (aget buf idx) 10)
 (inc count)
 count)))

Wanted 16777216 got 16777216 bytes
"Elapsed time: 185.465613 msecs"
Got 65344 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 186.856418 msecs"
Got 65529 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.778183 msecs"
Got 65662 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.675509 msecs"
Got 65034 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.674777 msecs"
Got 65459 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.630553 msecs"
Got 65172 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.668685 msecs"
Got 64939 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.74801 msecs"
Got 65462 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.699187 msecs"
Got 65602 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 73.611555 msecs"
Got 64937 nls

(defn countnl
  [#^bytes buf]
  (areduce buf idx count (int 0)
   (if (= (aget buf idx) 10)
 (unchecked-add count 1)
 count)))

Wanted 16777216 got 16777216 bytes
"Elapsed time: 82.865761 msecs"
Got 65266 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 34.228646 msecs"
Got 65522 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.989741 msecs"
Got 65537 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.947045 msecs"
Got 65688 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.941481 msecs"
Got 65395 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.952149 msecs"
Got 65822 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.972258 msecs"
Got 65495 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.903989 msecs"
Got 65244 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 33.927276 msecs"
Got 65331 nls
Wanted 1

Re: Help speed up an inner loop?

2010-08-31 Thread Nicolas Oury
Replace also  (unchecked-add count 1) with  (unchecked-add count (int 1))

(this should get easier in 1.3)

On Tue, Aug 31, 2010 at 4:20 PM, tsuraan  wrote:
>> (defn countnl-lite
>>  [#^bytes buf]
>>  (areduce buf idx count (int 0)
>>           (if (= (clojure.lang.RT/aget buf idx) 10)
>>             (unchecked-add count 1)
>>             count)))
>>
>> Key points are initializing count to a primitive integer and directly
>> calling clojure's aget to avoid an unnecessary integer cast.
>
> Are you using clojure 1.2?  If I try to set count to be (int 0) rather
> than 0, I get this error:
>
> Exception in thread "main" java.lang.RuntimeException:
> java.lang.IllegalArgumentException: recur arg for primitive local:
> count must be matching primitive
>
> Even if I replace "inc" with the unchecked-add, or "cast" the result
> of the inc to be "int" (replace (inc count) with (int (inc count)) )
> it gives me that error.  Sort of strange...
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread tsuraan
> (defn countnl-lite
>  [#^bytes buf]
>  (areduce buf idx count (int 0)
>           (if (= (clojure.lang.RT/aget buf idx) 10)
>             (unchecked-add count 1)
>             count)))
>
> Key points are initializing count to a primitive integer and directly
> calling clojure's aget to avoid an unnecessary integer cast.

Are you using clojure 1.2?  If I try to set count to be (int 0) rather
than 0, I get this error:

Exception in thread "main" java.lang.RuntimeException:
java.lang.IllegalArgumentException: recur arg for primitive local:
count must be matching primitive

Even if I replace "inc" with the unchecked-add, or "cast" the result
of the inc to be "int" (replace (inc count) with (int (inc count)) )
it gives me that error.  Sort of strange...

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread Nicolas Oury
On Tue, Aug 31, 2010 at 1:53 PM, Nicolas Oury  wrote:
> I am not convince that (make-array Byte/TYPE *numbytes*) creates an
> array of primitives.
Actually it does, sorry for the noise.

Should check before sending emails.

Best,

Nicolas.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread Nicolas Oury
I am not convince that (make-array Byte/TYPE *numbytes*) creates an
array of primitives.
And I think byte [] is an array of primitives.

That would make a difference.

I don't know if clojure has a byte-array. It seems that there is no
byte-array as there is int-array.

Could you try your code with int-array, or could you just write two
static methods in a Java class to read and write an int from a byte
array?

Best,

Nicolas.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread Meikel Brandmeyer
Hi,

On 31 Aug., 08:46, Robert McIntyre  wrote:

> Without AOT compilation countnl-lite takes around 66 msecs
> With AOT compilation countnl-lite takes ~46 msecs

Did you measure start-up time in your runs? AOT compilation should
have no impact on the runtime speed.

Sincerely
Meikel

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Help speed up an inner loop?

2010-08-31 Thread John Fingerhut
Consider trying to use "==" in place of where you have "=", which can be
faster when comparing numbers for equality.  Source for this and a few other
performance tips:

http://gnuvince.wordpress.com/2009/05/11/clojure-performance-tips/

Andy

On Mon, Aug 30, 2010 at 11:46 PM, Robert McIntyre  wrote:

> Ah, I see that I was mistaken about the timing. Sorry about that.
>
> After a lot of fiddling around, I cam up with this faster form:
>
> (defn countnl-lite
>  [#^bytes buf]
>  (areduce buf idx count (int 0)
>   (if (= (clojure.lang.RT/aget buf idx) 10)
> (unchecked-add count 1)
> count)))
>
> Key points are initializing count to a primitive integer and directly
> calling clojure's aget to avoid an unnecessary integer cast.
>
> On my system:
>
> The unmodified countnl function takes ~ 180 msecs
>
> Without AOT compilation countnl-lite takes around 66 msecs
>
> With AOT compilation countnl-lite takes ~46 msecs
>
> The java method takes ~19 msecs.
>
> I've lost a factor of 2.25 somewhere and it makes me sad that I can't find
> it.
> I would be very interested if anyone could improve countnl-lite.
>
> --Robert McIntyre
>
>
>
> On Mon, Aug 30, 2010 at 8:41 PM, Alan  wrote:
> > I think this misses the point. Of course java, c, and clojure will all
> > have roughly the same wall-clock time for this program, since it is
> > dominated by the I/O. You can even see that in the output from $ time
> > java Iterate: less than 0.5s was spent in user space, the rest was
> > spent in system code - that is, mostly doing I/O.
> >
> > The java version is a second faster as counted by the wall clock, and
> > this is unlikely to be a coincidence: tsuraan's timing data suggests
> > that the clojure program takes 80ms longer in each loop, and loops 10
> > times. That comes out to 0.8 seconds, which is quite close to the
> > differential you observed when timing from the command line.
> >
> > On Aug 30, 1:38 pm, Robert McIntyre  wrote:
> >> I don't know what the heck is going here, but ignore the time the
> >> program is reporting and just
> >> pay attention to how long it actually takes wall-clock style and
> >> you'll see that your clojure and
> >> java programs already take the same time.
> >>
> >> Here are my findings:
> >>
> >> I saved Iterate.java into my rlm package and ran:
> >> time java -server rlm.Iterate
> >>
> >> results:
> >> time java -server rlm.Iterate
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65341 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 27 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 66026 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65473 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65679 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65739 nls in 19 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65310 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65810 nls in 18 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65531 nls in 21 msec
> >> Wanted 16777216 got 16777216 bytes
> >> counted 65418 nls in 21 msec
> >>
> >> real0m27.469s
> >> user0m0.472s
> >> sys 0m26.638s
> >>
> >> I wrapped the last bunch of commands in your clojure script into a
> >> (run) function:
> >> (defn run []
> >>   (let [ifs (FileInputStream. "/dev/urandom")
> >> buf (make-array Byte/TYPE *numbytes*)]
> >> (dotimes [_ 10]
> >>   (let [sz (.read ifs buf)]
> >> (println "Wanted" *numbytes* "got" sz "bytes")
> >> (let [count (time (countnl buf))]
> >>   (println "Got" count "nls"))
> >>
> >> and ran
> >> (time (run)) at the repl:
> >>
> >> (time (run))
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.081975 msecs"
> >> Got 65894 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.001814 msecs"
> >> Got 65949 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.061934 msecs"
> >> Got 65603 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.031131 msecs"
> >> Got 65563 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.122567 msecs"
> >> Got 65696 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.968066 msecs"
> >> Got 65546 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.058508 msecs"
> >> Got 65468 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 182.932395 msecs"
> >> Got 65872 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 183.074646 msecs"
> >> Got 65498 nls
> >> Wanted 16777216 got 16777216 bytes
> >> "Elapsed time: 187.733636 msecs"
> >> Got 65434 nls
> >> "Elapsed time: 28510.331507 msecs"
> >> nil
> >>
> >> Total running time for both programs is around 28 seconds.
> >> The java program seems to be incorrectly reporting it's time.
> >>
> >> --Robert McIntyre
> >>
> >> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan  wrote:
> >> > Just to try

Re: Help speed up an inner loop?

2010-08-30 Thread Robert McIntyre
Ah, I see that I was mistaken about the timing. Sorry about that.

After a lot of fiddling around, I cam up with this faster form:

(defn countnl-lite
  [#^bytes buf]
  (areduce buf idx count (int 0)
   (if (= (clojure.lang.RT/aget buf idx) 10)
 (unchecked-add count 1)
 count)))

Key points are initializing count to a primitive integer and directly
calling clojure's aget to avoid an unnecessary integer cast.

On my system:

The unmodified countnl function takes ~ 180 msecs

Without AOT compilation countnl-lite takes around 66 msecs

With AOT compilation countnl-lite takes ~46 msecs

The java method takes ~19 msecs.

I've lost a factor of 2.25 somewhere and it makes me sad that I can't find it.
I would be very interested if anyone could improve countnl-lite.

--Robert McIntyre



On Mon, Aug 30, 2010 at 8:41 PM, Alan  wrote:
> I think this misses the point. Of course java, c, and clojure will all
> have roughly the same wall-clock time for this program, since it is
> dominated by the I/O. You can even see that in the output from $ time
> java Iterate: less than 0.5s was spent in user space, the rest was
> spent in system code - that is, mostly doing I/O.
>
> The java version is a second faster as counted by the wall clock, and
> this is unlikely to be a coincidence: tsuraan's timing data suggests
> that the clojure program takes 80ms longer in each loop, and loops 10
> times. That comes out to 0.8 seconds, which is quite close to the
> differential you observed when timing from the command line.
>
> On Aug 30, 1:38 pm, Robert McIntyre  wrote:
>> I don't know what the heck is going here, but ignore the time the
>> program is reporting and just
>> pay attention to how long it actually takes wall-clock style and
>> you'll see that your clojure and
>> java programs already take the same time.
>>
>> Here are my findings:
>>
>> I saved Iterate.java into my rlm package and ran:
>> time java -server rlm.Iterate
>>
>> results:
>> time java -server rlm.Iterate
>> Wanted 16777216 got 16777216 bytes
>> counted 65341 nls in 27 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65310 nls in 27 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 66026 nls in 21 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65473 nls in 19 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65679 nls in 19 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65739 nls in 19 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65310 nls in 21 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65810 nls in 18 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65531 nls in 21 msec
>> Wanted 16777216 got 16777216 bytes
>> counted 65418 nls in 21 msec
>>
>> real    0m27.469s
>> user    0m0.472s
>> sys     0m26.638s
>>
>> I wrapped the last bunch of commands in your clojure script into a
>> (run) function:
>> (defn run []
>>   (let [ifs (FileInputStream. "/dev/urandom")
>>         buf (make-array Byte/TYPE *numbytes*)]
>>     (dotimes [_ 10]
>>       (let [sz (.read ifs buf)]
>>         (println "Wanted" *numbytes* "got" sz "bytes")
>>         (let [count (time (countnl buf))]
>>           (println "Got" count "nls"))
>>
>> and ran
>> (time (run)) at the repl:
>>
>> (time (run))
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.081975 msecs"
>> Got 65894 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.001814 msecs"
>> Got 65949 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.061934 msecs"
>> Got 65603 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.031131 msecs"
>> Got 65563 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.122567 msecs"
>> Got 65696 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 182.968066 msecs"
>> Got 65546 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.058508 msecs"
>> Got 65468 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 182.932395 msecs"
>> Got 65872 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 183.074646 msecs"
>> Got 65498 nls
>> Wanted 16777216 got 16777216 bytes
>> "Elapsed time: 187.733636 msecs"
>> Got 65434 nls
>> "Elapsed time: 28510.331507 msecs"
>> nil
>>
>> Total running time for both programs is around 28 seconds.
>> The java program seems to be incorrectly reporting it's time.
>>
>> --Robert McIntyre
>>
>> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan  wrote:
>> > Just to try to see if clojure is a practical language for doing
>> > byte-level work (parsing files, network streams, etc), I wrote a
>> > trivial function to iterate through a buffer of bytes and count all
>> > the newlines that it sees.  For my testing, I've written a C version,
>> > a Java version, and a Clojure version.  I'm running each routine 10
>> > times over a 16MB buffer read from /dev/urandom (the buffer is
>> > refreshed between each call to the newline counting function).  With
>> > gcc -O0, I get about 80ms per 16MB buffer.  With gcc -O3, I get ~14ms
>

Re: Help speed up an inner loop?

2010-08-30 Thread Alan
I think this misses the point. Of course java, c, and clojure will all
have roughly the same wall-clock time for this program, since it is
dominated by the I/O. You can even see that in the output from $ time
java Iterate: less than 0.5s was spent in user space, the rest was
spent in system code - that is, mostly doing I/O.

The java version is a second faster as counted by the wall clock, and
this is unlikely to be a coincidence: tsuraan's timing data suggests
that the clojure program takes 80ms longer in each loop, and loops 10
times. That comes out to 0.8 seconds, which is quite close to the
differential you observed when timing from the command line.

On Aug 30, 1:38 pm, Robert McIntyre  wrote:
> I don't know what the heck is going here, but ignore the time the
> program is reporting and just
> pay attention to how long it actually takes wall-clock style and
> you'll see that your clojure and
> java programs already take the same time.
>
> Here are my findings:
>
> I saved Iterate.java into my rlm package and ran:
> time java -server rlm.Iterate
>
> results:
> time java -server rlm.Iterate
> Wanted 16777216 got 16777216 bytes
> counted 65341 nls in 27 msec
> Wanted 16777216 got 16777216 bytes
> counted 65310 nls in 27 msec
> Wanted 16777216 got 16777216 bytes
> counted 66026 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65473 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65679 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65739 nls in 19 msec
> Wanted 16777216 got 16777216 bytes
> counted 65310 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65810 nls in 18 msec
> Wanted 16777216 got 16777216 bytes
> counted 65531 nls in 21 msec
> Wanted 16777216 got 16777216 bytes
> counted 65418 nls in 21 msec
>
> real    0m27.469s
> user    0m0.472s
> sys     0m26.638s
>
> I wrapped the last bunch of commands in your clojure script into a
> (run) function:
> (defn run []
>   (let [ifs (FileInputStream. "/dev/urandom")
>         buf (make-array Byte/TYPE *numbytes*)]
>     (dotimes [_ 10]
>       (let [sz (.read ifs buf)]
>         (println "Wanted" *numbytes* "got" sz "bytes")
>         (let [count (time (countnl buf))]
>           (println "Got" count "nls"))
>
> and ran
> (time (run)) at the repl:
>
> (time (run))
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.081975 msecs"
> Got 65894 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.001814 msecs"
> Got 65949 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.061934 msecs"
> Got 65603 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.031131 msecs"
> Got 65563 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.122567 msecs"
> Got 65696 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 182.968066 msecs"
> Got 65546 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.058508 msecs"
> Got 65468 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 182.932395 msecs"
> Got 65872 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 183.074646 msecs"
> Got 65498 nls
> Wanted 16777216 got 16777216 bytes
> "Elapsed time: 187.733636 msecs"
> Got 65434 nls
> "Elapsed time: 28510.331507 msecs"
> nil
>
> Total running time for both programs is around 28 seconds.
> The java program seems to be incorrectly reporting it's time.
>
> --Robert McIntyre
>
> On Mon, Aug 30, 2010 at 4:03 PM, tsuraan  wrote:
> > Just to try to see if clojure is a practical language for doing
> > byte-level work (parsing files, network streams, etc), I wrote a
> > trivial function to iterate through a buffer of bytes and count all
> > the newlines that it sees.  For my testing, I've written a C version,
> > a Java version, and a Clojure version.  I'm running each routine 10
> > times over a 16MB buffer read from /dev/urandom (the buffer is
> > refreshed between each call to the newline counting function).  With
> > gcc -O0, I get about 80ms per 16MB buffer.  With gcc -O3, I get ~14ms
> > per buffer.  With javac (and java -server) I get 20ms per 16MB buffer.
> >  With clojure, I get 105ms per buffer (after the jvm warms up).  I'm
> > guessing that the huge boost that java and gcc -O3 get is from
> > converting per-byte operations to per-int ops; at least that ~4x boost
> > looks like it would come from something like that.  Is that an
> > optimization that is unavailable to clojure?  The java_interop doc
> > makes it sound like java and clojure get the exact same bytecode when
> > using areduce correctly, so maybe there's something I could be doing
> > better.  Here are my small programs; if somebody could suggest
> > improvements, I'd appreciate them.
>
> > iterate.clj:
>
> > (set! *warn-on-reflection* true)
> > (import java.io.FileInputStream)
>
> > (def *numbytes* (* 16 1024 1024))
>
> > (defn countnl
> >  [#^bytes buf]
> >  (let [nl (byte 10)]
> >    (areduce buf idx count 0
> >             (if (= (aget buf idx) nl)
> >               (inc count)
> >      

Re: Help speed up an inner loop?

2010-08-30 Thread Robert McIntyre
I don't know what the heck is going here, but ignore the time the
program is reporting and just
pay attention to how long it actually takes wall-clock style and
you'll see that your clojure and
java programs already take the same time.

Here are my findings:

I saved Iterate.java into my rlm package and ran:
time java -server rlm.Iterate

results:
time java -server rlm.Iterate
Wanted 16777216 got 16777216 bytes
counted 65341 nls in 27 msec
Wanted 16777216 got 16777216 bytes
counted 65310 nls in 27 msec
Wanted 16777216 got 16777216 bytes
counted 66026 nls in 21 msec
Wanted 16777216 got 16777216 bytes
counted 65473 nls in 19 msec
Wanted 16777216 got 16777216 bytes
counted 65679 nls in 19 msec
Wanted 16777216 got 16777216 bytes
counted 65739 nls in 19 msec
Wanted 16777216 got 16777216 bytes
counted 65310 nls in 21 msec
Wanted 16777216 got 16777216 bytes
counted 65810 nls in 18 msec
Wanted 16777216 got 16777216 bytes
counted 65531 nls in 21 msec
Wanted 16777216 got 16777216 bytes
counted 65418 nls in 21 msec

real0m27.469s
user0m0.472s
sys 0m26.638s


I wrapped the last bunch of commands in your clojure script into a
(run) function:
(defn run []
  (let [ifs (FileInputStream. "/dev/urandom")
buf (make-array Byte/TYPE *numbytes*)]
(dotimes [_ 10]
  (let [sz (.read ifs buf)]
(println "Wanted" *numbytes* "got" sz "bytes")
(let [count (time (countnl buf))]
  (println "Got" count "nls"))

and ran
(time (run)) at the repl:

(time (run))
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.081975 msecs"
Got 65894 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.001814 msecs"
Got 65949 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.061934 msecs"
Got 65603 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.031131 msecs"
Got 65563 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.122567 msecs"
Got 65696 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 182.968066 msecs"
Got 65546 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.058508 msecs"
Got 65468 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 182.932395 msecs"
Got 65872 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 183.074646 msecs"
Got 65498 nls
Wanted 16777216 got 16777216 bytes
"Elapsed time: 187.733636 msecs"
Got 65434 nls
"Elapsed time: 28510.331507 msecs"
nil

Total running time for both programs is around 28 seconds.
The java program seems to be incorrectly reporting it's time.

--Robert McIntyre









On Mon, Aug 30, 2010 at 4:03 PM, tsuraan  wrote:
> Just to try to see if clojure is a practical language for doing
> byte-level work (parsing files, network streams, etc), I wrote a
> trivial function to iterate through a buffer of bytes and count all
> the newlines that it sees.  For my testing, I've written a C version,
> a Java version, and a Clojure version.  I'm running each routine 10
> times over a 16MB buffer read from /dev/urandom (the buffer is
> refreshed between each call to the newline counting function).  With
> gcc -O0, I get about 80ms per 16MB buffer.  With gcc -O3, I get ~14ms
> per buffer.  With javac (and java -server) I get 20ms per 16MB buffer.
>  With clojure, I get 105ms per buffer (after the jvm warms up).  I'm
> guessing that the huge boost that java and gcc -O3 get is from
> converting per-byte operations to per-int ops; at least that ~4x boost
> looks like it would come from something like that.  Is that an
> optimization that is unavailable to clojure?  The java_interop doc
> makes it sound like java and clojure get the exact same bytecode when
> using areduce correctly, so maybe there's something I could be doing
> better.  Here are my small programs; if somebody could suggest
> improvements, I'd appreciate them.
>
> iterate.clj:
>
> (set! *warn-on-reflection* true)
> (import java.io.FileInputStream)
>
> (def *numbytes* (* 16 1024 1024))
>
> (defn countnl
>  [#^bytes buf]
>  (let [nl (byte 10)]
>    (areduce buf idx count 0
>             (if (= (aget buf idx) nl)
>               (inc count)
>               count
>
> (let [ifs (FileInputStream. "/dev/urandom")
>      buf (make-array Byte/TYPE *numbytes*)]
>  (dotimes [_ 10]
>    (let [sz (.read ifs buf)]
>      (println "Wanted" *numbytes* "got" sz "bytes")
>      (let [count (time (countnl buf))]
>        (println "Got" count "nls")
>
>
> Iterate.java:
>
> import java.io.FileInputStream;
>
> class Iterate
> {
>  static final int NUMBYTES = 16*1024*1024;
>
>  static int countnl(byte[] buf)
>  {
>    int count = 0;
>    for(int i = 0; i < buf.length; i++) {
>      if(buf[i] == '\n') {
>        count++;
>      }
>    }
>    return count;
>  }
>
>  public static final void main(String[] args)
>    throws Throwable
>  {
>    FileInputStream input = new FileInputStream("/dev/urandom");
>    byte[] buf = new byte[NUMBYTES];
>    int sz;
>    long start, end;
>
>    for(int i = 0; i < 10; i++) {
>      sz = input.read(buf);
>      System.out.println("Wanted " +