Newbie question about vector performance

2010-05-28 Thread Rubén Béjar

Hi all,

I am new to the list and to Clojure. I have been working in
implementing some 2D cellular automata (CA) just to have a project
to teach Clojure to myself. After some work I have something that
works, but it is pretty slow. The function that takes a CA
of 500x500 cells (integers) and returns an updated (*) copy
takes 4 s. (using Clojure vectors), while doing more or less 
the same in Java (using arrays and primitive types) takes more or

less 16 *ms.*. I expected some difference, but not that big. Before
trying to use Java arrays and primitive types
in Clojure, I would like to know how my Clojure approach can
be improved  (I am willing to sacrifice some performance to keep
it "more Clojure", but not that much).

As I do not want to post a bunch of horrible code full of comments
and not properly indented, I have extracted what i hope are the
main pieces, written some comments and posted it here:
http://snipt.org/Okpk

Pasting that code in a new file in Eclipse (I am using
counterclockwise) and running it in the REPL prints this:

Clojure 1.1.0-alpha-SNAPSHOT
"Elapsed time: 4355.363706 msecs"
"Elapsed time: 4416.98562 msecs"
1:1 user=> #
1:2 cellular-automata-basic=>

I would thank a lot any hint, suggestion, comment, or
whatever... :-)

Best regards,

   Rubén

(*) The update consists on adding the values of the 8 neighbours
of every cell and changing it if that sum is between two fixed
numbers.

--
Rubén BÉJAR HERNÁNDEZ

Dpto. de Informática e Ingeniería de Sistemas - Universidad de Zaragoza
(Computing and Systems Engineering Department - Universidad de Zaragoza)
c/ María de Luna 1, 50018 Zaragoza, Spain

Tel: (+34) 976 76 2332 (Fax: 1914)  
e-mail: rbe...@unizar.es


Grupo IA3 (IA3 Laboratory) - http://iaaa.cps.unizar.es



--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Newbie question about vector performance

2010-05-28 Thread Rubén Béjar




Hi again,

I have tried a few more things:

I have done the same in Java but using an array 
of Integers, instead of an array of ints. It just takes 
78 ms. (more than 16, still far less than 4 secs).

I have also tried with an update function which just returns a 
fixed 1. It still takes some 400 ms. to update de
data vector of the CA. This is similar to the time 
it is giving me to create a vector of 1's with this function:
(time (vec (repeat 25 1))).

I was not expecting Clojure vectors so much slower than
Java arrays. Is that comparison so "unfair"? Next thing
I am trying is using Java Vectors in my Java 
implementation...

   Rubén

Rubén Béjar escribió:
Hi all,
  
  
I am new to the list and to Clojure. I have been working in
  
implementing some 2D cellular automata (CA) just to have a project
  
to teach Clojure to myself. After some work I have something that
  
works, but it is pretty slow. The function that takes a CA
  
of 500x500 cells (integers) and returns an updated (*) copy
  
takes 4 s. (using Clojure vectors), while doing more or less the same
in Java (using arrays and primitive types) takes more or
  
less 16 *ms.*. I expected some difference, but not that big. Before
  
trying to use Java arrays and primitive types
  
in Clojure, I would like to know how my Clojure approach can
  
be improved  (I am willing to sacrifice some performance to keep
  
it "more Clojure", but not that much).
  
  
As I do not want to post a bunch of horrible code full of comments
  
and not properly indented, I have extracted what i hope are the
  
main pieces, written some comments and posted it here:
  
http://snipt.org/Okpk
  
  
Pasting that code in a new file in Eclipse (I am using
  
counterclockwise) and running it in the REPL prints this:
  
  
Clojure 1.1.0-alpha-SNAPSHOT
  
"Elapsed time: 4355.363706 msecs"
  
"Elapsed time: 4416.98562 msecs"
  
1:1 user=> #
  
1:2 cellular-automata-basic=>
  
  
I would thank a lot any hint, suggestion, comment, or
  
whatever... :-)
  
  
Best regards,
  
  
   Rubén
  
  
(*) The update consists on adding the values of the 8 neighbours
  
of every cell and changing it if that sum is between two fixed
  
numbers.
  
  



-- 
Rubén BÉJAR HERNÁNDEZ

Dpto. de Informática e Ingeniería de Sistemas - Universidad de Zaragoza
(Computing and Systems Engineering Department - Universidad de Zaragoza)
c/ María de Luna 1, 50018 Zaragoza, Spain

Tel: (+34) 976 76 2332 (Fax: 1914)  
e-mail: rbe...@unizar.es

Grupo IA3 (IA3 Laboratory) - http://iaaa.cps.unizar.es








-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Newbie question about vector performance

2010-05-28 Thread Michael Gardner
On May 28, 2010, at 8:47 AM, Rubén Béjar wrote:

> I would thank a lot any hint, suggestion, comment, or
> whatever... :-)

As a style issue I'd suggest using inc, dec, neg?, pos?, and zero? instead of 
the various (+ x 1), (< x 0), etc. in your code. This actually seems to improve 
performance a bit on my laptop, but it's nothing amazing. To get good 
performance you're likely going to need to do some type hinting.

http://clojure.org/java_interop#Java%20Interop-Type%20Hints

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Newbie question about vector performance

2010-05-28 Thread Luke VanderHart
I can't see your code due to the IT policies here, but I can make some
generalizations - these are assuming your code is correct and you're
not accidentally using an exponential algorithm (which I wouldn't
preclude, 4 minutes does sound truly excessively slow, even for
vectors).

Vectors are significantly slower than Java arrays due to their copy-on-
write semantics.

You have a few options, both of which are considered perfectly
acceptable Clojure for high-performance numerical code:

1. Use transients (http://clojure.org/transients) when you update your
vectors. This should give you a pretty significant speed increase.
2. Alternatively, use native java arrays. Clojure provides a complete
set of functions for creating and mutating raw Java arrays
3. Use type hints to eliminate reflection at choke points in your
processing.

If you do 2 and 3, you should get pretty close to native Java speeds.


On May 28, 12:20 pm, Rubén Béjar  wrote:
> Hi again,
> I have tried a few more things:
> I have done the same in Java but using an array
> of Integers, instead of an array of ints. It just takes
> 78 ms. (more than 16, still far less than 4 secs).
> I have also tried with an update function which just returns a
> fixed 1. It still takes some 400 ms. to update de
> data vector of the CA. This is similar to the time
> it is giving me to create a vector of 1's with this function:
> (time (vec (repeat 25 1))).
> I was not expecting Clojure vectors so much slower than
> Java arrays. Is that comparison so "unfair"? Next thing
> I am trying is using Java Vectors in my Java
> implementation...
>Rubén
> Rubén Béjar escribió:Hi all,
> I am new to the list and to Clojure. I have been working in
> implementing some 2D cellular automata (CA) just to have a project
> to teach Clojure to myself. After some work I have something that
> works, but it is pretty slow. The function that takes a CA
> of 500x500 cells (integers) and returns an updated (*) copy
> takes 4 s. (using Clojure vectors), while doing more or less the same in Java 
> (using arrays and primitive types) takes more or
> less 16 *ms.*. I expected some difference, but not that big. Before
> trying to use Java arrays and primitive types
> in Clojure, I would like to know how my Clojure approach can
> be improved  (I am willing to sacrifice some performance to keep
> it "more Clojure", but not that much).
> As I do not want to post a bunch of horrible code full of comments
> and not properly indented, I have extracted what i hope are the
> main pieces, written some comments and posted it here:http://snipt.org/Okpk
> Pasting that code in a new file in Eclipse (I am using
> counterclockwise) and running it in the REPL prints this:
> Clojure 1.1.0-alpha-SNAPSHOT
> "Elapsed time: 4355.363706 msecs"
> "Elapsed time: 4416.98562 msecs"
> 1:1 user=> #
> 1:2 cellular-automata-basic=>
> I would thank a lot any hint, suggestion, comment, or
> whatever... :-)
> Best regards,
>Rubén
> (*) The update consists on adding the values of the 8 neighbours
> of every cell and changing it if that sum is between two fixed
> numbers.-- Rubén BÉJAR HERNÁNDEZ Dpto. de Informática e Ingeniería de 
> Sistemas - Universidad de Zaragoza (Computing and Systems Engineering 
> Department - Universidad de Zaragoza) c/ María de Luna 1, 50018 Zaragoza, 
> Spain Tel: (+34) 976 76 2332 (Fax: 1914) e-mail:rbe...@unizar.esgrupo IA3 
> (IA3 Laboratory) -http://iaaa.cps.unizar.es

On May 28, 12:20 pm, Rubén Béjar  wrote:
> Hi again,
> I have tried a few more things:
> I have done the same in Java but using an array
> of Integers, instead of an array of ints. It just takes
> 78 ms. (more than 16, still far less than 4 secs).
> I have also tried with an update function which just returns a
> fixed 1. It still takes some 400 ms. to update de
> data vector of the CA. This is similar to the time
> it is giving me to create a vector of 1's with this function:
> (time (vec (repeat 25 1))).
> I was not expecting Clojure vectors so much slower than
> Java arrays. Is that comparison so "unfair"? Next thing
> I am trying is using Java Vectors in my Java
> implementation...
>    Rubén
> Rubén Béjar escribió:Hi all,
> I am new to the list and to Clojure. I have been working in
> implementing some 2D cellular automata (CA) just to have a project
> to teach Clojure to myself. After some work I have something that
> works, but it is pretty slow. The function that takes a CA
> of 500x500 cells (integers) and returns an updated (*) copy
> takes 4 s. (using Clojure vectors), while doing more or less the same in Java 
> (using arrays and primitive types) takes more or
> less 16 *ms.*. I expected some difference, but not that big. Before
> trying to use Java arrays and primitive types
> in Clojure, I would like to know how my Clojure approach can
> be improved  (I am willing to sacrifice some performance to keep
> it "more Clojure", but not that much).
> As I do not want to post a bunch of horrible code full of com

Re: Newbie question about vector performance

2010-05-29 Thread igorrumiha
On May 28, 3:47 pm, Rubén Béjar  wrote:
>
> I would thank a lot any hint, suggestion, comment, or
> whatever... :-)
>

Here is a version that is approx 10 times faster: http://snipt.org/Olml
The code structure is basically the same but it uses integer arrays
for storage, some manual inlining and lots of type casts. I'm not
certain it _works_ correctly, but... :)
My observations:
- use the profiler. jvisualvm that comes with the sun JDK has received
some much needed love recently (especially in java 1.6.0_20) and has
helped me alot in profiling the code. Install an additional plugin:
Sampler. It is has a much lower overhead compared to the Profiler
plugin that is installed by default. Both are useful, though.
- native clojure data structures (seqs, vectors, maps, etc.) are slow
(compared to native Java arrays for instance). There is no going
around that. Immutability and other very nice features come with a
price. Dynamic languages are also, by their nature, slower than their
static companions.
- jvisualvm shows that in my version 40% - 50% of the time is spent in
the ur-8neigh-minmax function. That time is further split into array
fetches, casts to int, arithmetic (inc, dec, add, multiply,
comparisons). I simply don't know how to optimise that further without
changing the algorithm. On that note, the Clojure compiler and core
libraries are getting better: http://snipt.org/Olmn Current Clojure
1.2.0 gives approx 10% better results.

I assume (am certain, actually :)) my "improvements" can be further
improved, but the code gets quite unreadable bery fast.

A side note: I have been happily programing in Perl for many years
now. Perl is typically 50x to 100x slower than C or Java. When people
identify a performance bottleneck someone writes a version of the same
thing in C. I don't see a problem with using the same approach in
Clojure. You can write some quite ugly clojure (usually using
mutability) that approaches Java speed or you can abstract the slow
parts out and write them in Java.

Hope this helps... :)

--
Igor Rumiha

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: Newbie question about vector performance

2010-05-29 Thread Laurent PETIT
If you change the size of your CA, is the ( java time / clojure time ratio )
roughly constant ?

2010/5/28 Rubén Béjar 

> Hi all,
>
> I am new to the list and to Clojure. I have been working in
> implementing some 2D cellular automata (CA) just to have a project
> to teach Clojure to myself. After some work I have something that
> works, but it is pretty slow. The function that takes a CA
> of 500x500 cells (integers) and returns an updated (*) copy
> takes 4 s. (using Clojure vectors), while doing more or less the same in
> Java (using arrays and primitive types) takes more or
> less 16 *ms.*. I expected some difference, but not that big. Before
> trying to use Java arrays and primitive types
> in Clojure, I would like to know how my Clojure approach can
> be improved  (I am willing to sacrifice some performance to keep
> it "more Clojure", but not that much).
>
> As I do not want to post a bunch of horrible code full of comments
> and not properly indented, I have extracted what i hope are the
> main pieces, written some comments and posted it here:
> http://snipt.org/Okpk
>
> Pasting that code in a new file in Eclipse (I am using
> counterclockwise) and running it in the REPL prints this:
>
> Clojure 1.1.0-alpha-SNAPSHOT
> "Elapsed time: 4355.363706 msecs"
> "Elapsed time: 4416.98562 msecs"
> 1:1 user=> #
> 1:2 cellular-automata-basic=>
>
> I would thank a lot any hint, suggestion, comment, or
> whatever... :-)
>
> Best regards,
>
>   Rubén
>
> (*) The update consists on adding the values of the 8 neighbours
> of every cell and changing it if that sum is between two fixed
> numbers.
>
> --
> Rubén BÉJAR HERNÁNDEZ
>
> Dpto. de Informática e Ingeniería de Sistemas - Universidad de Zaragoza
> (Computing and Systems Engineering Department - Universidad de Zaragoza)
> c/ María de Luna 1, 50018 Zaragoza, Spain
>
> Tel: (+34) 976 76 2332 (Fax: 1914)  e-mail: rbe...@unizar.es
>
> Grupo IA3 (IA3 Laboratory) - http://iaaa.cps.unizar.es
>
>
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Newbie question about vector performance

2010-05-30 Thread Rubén Béjar

Michael Gardner wrote:

As a style issue I'd suggest using inc, dec, neg?, pos?, and zero? instead of the 
various (+ x 1), (< x 0), etc. in your code. This actually seems to improve 
performance a bit on my laptop, but it's nothing amazing. To get good performance 
you're likely going to need to do some type hinting.
  
Thanks for the style suggestion, this is something I will use. I have 
been playing a little with type hinting without much improvement. It is 
perfectly possible I have done it wrong, but my algorithm is not that 
intensive in integer arithmetic. My Java version with Integers instead 
of ints (still using arrays) is some 4 times slower than the Java 
version with ints and arrays, but still some 40 times faster than my 
original Clojure attempt.


Luke VanderHart wrote:


I can't see your code due to the IT policies here, but I can make some
generalizations - these are assuming your code is correct and you're
not accidentally using an exponential algorithm (which I wouldn't
preclude, 4 minutes does sound truly excessively slow, even for
vectors).
  
Well, actually it is 4 seconds, if it was 4 minutes I would have been 
too embarrassed even to

ask for help ;-)

Vectors are significantly slower than Java arrays due to their copy-on-
write semantics.

You have a few options, both of which are considered perfectly
acceptable Clojure for high-performance numerical code:

1. Use transients (http://clojure.org/transients) when you update your
vectors. This should give you a pretty significant speed increase.
2. Alternatively, use native java arrays. Clojure provides a complete
set of functions for creating and mutating raw Java arrays
3. Use type hints to eliminate reflection at choke points in your
processing.
It seems 2 and 3 must be in my to-learn list inmediately. Transients 
look like a little bit

"advanced Clojure" for me now... :-)
Thank you.

Igor Rumiha wrote:

Here is a version that is approx 10 times faster: http://snipt.org/Olml
The code structure is basically the same but it uses integer arrays
for storage, some manual inlining and lots of type casts. I'm not
certain it _works_ correctly, but...  :) 
  
You did not have to do that, a simple "try with native Java arrays" 
would have been
enough for me to do my homework. Any way, I truly appreciate your 
effort... :-)

My observations:
- use the profiler. jvisualvm that comes with the sun JDK has received
some much needed love recently (especially in java 1.6.0_20) and has
helped me alot in profiling the code. Install an additional plugin:
Sampler. It is has a much lower overhead compared to the Profiler
plugin that is installed by default. Both are useful, though.
- native clojure data structures (seqs, vectors, maps, etc.) are slow
(compared to native Java arrays for instance). There is no going
around that. Immutability and other very nice features come with a
price. Dynamic languages are also, by their nature, slower than their
static companions.
- jvisualvm shows that in my version 40% - 50% of the time is spent in
the ur-8neigh-minmax function. That time is further split into array
fetches, casts to int, arithmetic (inc, dec, add, multiply,
comparisons). I simply don't know how to optimise that further without
changing the algorithm. On that note, the Clojure compiler and core
libraries are getting better: http://snipt.org/Olmn Current Clojure
1.2.0 gives approx 10% better results.
  
I have never been a "man of profilers", so I do not have experience with 
them. Maybe another
thing for my to-learn list. Yes, that function is called once per every 
cell in the CA, and reads
the value of every neighoubr of that cell, so it is quite natural it 
takes most of the time. The
truth is that I am a bit surprised it is just 40-50 percent of the time, 
I would have guessed it

was more. I'll have to think about it...

Hope this helps...  :) 
  

It helps, thank you very much :-)

Laurent Petit asked:

If you change the size of your CA, is the ( java time / clojure time 
ratio ) roughly constant ?
Not really sure. For instance, in Java (array of ints) a 2000x2000 CA is 
updated
in some 180 ms. (the 500x500 was 16ms, but I measured it with 
System.currentTimeMillis(), which
is not very precise with that short periods of time) and  in Clojure the 
2000x2000 CA
is updated in 98 secs some times and up to 150 s. other times (the 
500x500 was 4 secs),
but I have just measured it in my home laptop, so it is really not fair 
to compare with my
original measurement at my work desktop. I will make some more tests 
tomorrow if I can,
bur for now I would say that in the worst case my Clojure algorithm may 
be slightlly worse than
the one I have implemented in Java (that it is the basic one you would 
apply after

taking "Programming 101").

Thanks to all of you for your interest, I will inform of any further 
advance...

:-)

   Rubén


--
Rubén BÉJAR HERNÁNDEZ

Dpto. de Informática e Ingeniería de Sistemas - Universidad de Zaragoza
(Computing and S

Re: Newbie question about vector performance

2010-05-31 Thread David Nolen
On Fri, May 28, 2010 at 12:20 PM, Rubén Béjar  wrote:

>  Hi again,
>
> I have tried a few more things:
>
> I have done the same in Java but using an array
> of Integers, instead of an array of ints. It just takes
> 78 ms. (more than 16, still far less than 4 secs).
>
> I have also tried with an update function which just returns a
> fixed 1. It still takes some 400 ms. to update de
> data vector of the CA. This is similar to the time
> it is giving me to create a vector of 1's with this function:
> (time (vec (repeat 25 1))).
>
> I was not expecting Clojure vectors so much slower than
> Java arrays. Is that comparison so "unfair"? Next thing
> I am trying is using Java Vectors in my Java
> implementation...
>
>Rubén
>

I looked into your code and came up with this: http://gist.github.com/420036

On my machine 500x500 takes about 60-70ms. 2000x2000 takes about ~940ms.

Also, Clojure vectors aren't slow:

(defn vec-repeat [n x]
(loop [n n v (transient [])]
  (if (zero? n)
(persistent! v)
(recur (dec n) (conj! v x)

(vec-repeat 25 1)

Takes 4-7ms

(vec-repeat 250 1)

Takes 50-60ms

Transients aren't an advanced topic, they solve the problem of building up
large vectors in a memory/cpu efficient manner.

In very tight loops over primitive data:

1. Spending time accessing Clojure data structures can hurt you
2. Spending time creating temporary data structures can hurt you
3. Inline the most called bits.

Fast Clojure code that works with arrays need not look ugly (checkout amap
and areduce). Also macros can alleviate the tedious typing of type hints.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Newbie question about vector performance

2010-05-31 Thread David Nolen
On Sun, May 30, 2010 at 3:49 PM, Rubén Béjar  wrote:
>
> is not very precise with that short periods of time) and  in Clojure the
> 2000x2000 CA
> is updated in 98 secs some times and up to 150 s. other times (the 500x500
> was 4 secs),


When numbers fluctuate like that I'm pretty sure you're hitting GC. Another
reason to use transients if you're going to use vectors to store a lot of
data.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Newbie question about vector performance

2010-06-11 Thread David Nolen
On Sun, May 30, 2010 at 3:49 PM, Rubén Béjar  wrote:

> Thanks to all of you for your interest, I will inform of any further
> advance...
> :-)
>
>   Rubén


Because I like sitting around optimizing Clojure code I looked into this
some more:

http://gist.github.com/420036

With the really great changes in the 1.2, you can see there's a lovely lack
of manual type hinting here. No macros required either. On my machine this
averages around ~30ms for 500x500. Not too shabby. 2000x2000 takes ~480ms or
so.

David

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en