Re: Clojure performance question

2014-03-05 Thread Jarrod Swart
Thanks for the elaboration, I just wanted to make sure I understood.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Clojure performance question

2014-03-04 Thread Mikera


On Wednesday, 5 March 2014 08:19:26 UTC+8, Jarrod Swart wrote:
>
>
> On Monday, March 3, 2014 11:35:58 AM UTC-5, Mikera wrote:
>>
>>
>> Obviously, there are cases where allocating a sequence will be slower 
>> than iterative techniques. But that's *easy enough to fix by just using 
>> iterations in those cases*: use the right tool for the job and all 
>> that. 
>>
>>
> I follow most of this, but could you elaborate on what you mean here? I 
> want to be sure I understand what you mean by "iteration" and how you would 
> implement such a thing for a performance increase.  Do you mean loop/recur? 
>

Yes - loop/recur is pretty much the fundamental iterative construct in 
Clojure. From the clojure.org website "recur is the only 
non-stack-consuming looping construct in Clojure."

That doesn't mean you necessarily have to use it directly - various other 
iterative / looping constructs use it under the hood. Lots of macros 
actually expand to some sort of loop/recur construct.
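A minimal sketch of the difference, assuming a simple sum-of-squares task (the names sum-squares-seq and sum-squares-loop are made up for illustration). The first version builds an intermediate lazy seq with map; the second uses loop/recur over primitive longs and allocates no sequence at all. Whether the loop is actually faster for a given workload is something to measure rather than assume.

```clojure
;; Hypothetical helpers, for illustration only.

;; Seq-based: map allocates a lazy seq that reduce then walks.
(defn sum-squares-seq [n]
  (reduce + (map #(* % %) (range n))))

;; Iterative: loop/recur with primitive longs, no intermediate seq.
(defn sum-squares-loop [^long n]
  (loop [i 0, acc 0]
    (if (< i n)
      (recur (inc i) (+ acc (* i i)))
      acc)))

(sum-squares-seq 10)   ;=> 285
(sum-squares-loop 10)  ;=> 285
```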



Re: Clojure performance question

2014-03-04 Thread Jarrod Swart

On Monday, March 3, 2014 11:35:58 AM UTC-5, Mikera wrote:
>
>
> Obviously, there are cases where allocating a sequence will be slower than 
> iterative techniques. But that's *easy enough to fix by just using 
> iterations in those cases*: use the right tool for the job and all 
> that. 
>
>
I follow most of this, but could you elaborate on what you mean here? I want 
to be sure I understand what you mean by "iteration" and how you would 
implement such a thing for a performance increase.  Do you mean loop/recur? 



Re: Clojure performance question

2014-03-03 Thread Jozef Wagner
I was not trying to generalize. I think that ISeq is not the right tool in
cases where processing a large collection is a performance bottleneck. It
is perfectly OK for other purposes though.

A Cons cell has its next object 'recorded'; however, this next object does
not have to be a Cons, in which case calling next on it will create a new
object.

LazySeq (a type) is just a delay with a fancy interface, and it just
emphasizes ISeq's perf issues. Moreover, the fact that you usually do not
want to hold onto the head in a lazy seq (a virtual collection [1])
indicates that the caching of 'next' is not what you want when iterating
over large data.

The performance issues in ISeq led to the design of chunked seqs [2] and
reducers, both of which try to eliminate this "per-step allocation
overhead" [3]. Expensive collection crunching should not be based just
on the ISeq abstraction.
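
A small sketch of the two styles, assuming a vector as input (the var names v, seq-total and red-total are made up for illustration). It only shows that both pipelines produce the same result; any speed difference would need to be measured.

```clojure
(require '[clojure.core.reducers :as r])

(def v (vec (range 10)))

;; Seq pipeline: map builds an intermediate lazy seq that reduce walks.
(def seq-total (reduce + (map inc v)))

;; Reducer pipeline: r/map only wraps the reducing function, so no
;; intermediate seq is built; r/fold then reduces the vector directly
;; (and may split large vectors across threads).
(def red-total (r/fold + (r/map inc v)))

seq-total  ;=> 55
red-total  ;=> 55
```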

Best,
Jozef

[1] http://clojure.org/lazy
[2] https://www.assembla.com/spaces/clojure/wiki/Chunked_Seqs
[3]
http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html


On Mon, Mar 3, 2014 at 5:35 PM, Mikera  wrote:

> On Monday, 3 March 2014 18:24:48 UTC+8, Jozef Wagner wrote:
>>
>>
>> On Mon, Mar 3, 2014 at 3:06 AM, Mikera  wrote:
>>
>>> ISeq itself isn't too bad (it's just an interface, as above), but some
>>> of the implementations are a bit expensive.
>>>
>>
>> ISeq is inherently not suited for performance critical code, as next()
>> requires creation of a new object. Even if the JVM handles such ephemeral
>> instances quite well, it still cannot compete with simple iterations or
>> mutable iterators.
>>
>
> That isn't true in general: cons cells for example already have the next
> object recorded so don't cause an allocation on next(). And lazy seqs only
> cause an allocation on the first invocation of next(). Given this, ISeq is a
> perfectly decent way to traverse singly linked lists, which is in turn a
> good data structure for many use cases. It's probably even optimal in many
> tree-like cases where a lot of structural sharing is possible.
>
> Obviously, there are cases where allocating a sequence will be slower than
> iterative techniques. But that's easy enough to fix by just using
> iterations in those cases: use the right tool for the job and all that.
>
> Overall - I think ISeq is perfectly decent for what it does.
>

Even for linked lists, as they are swiftly transformed into something more
sinister after a few map/filter/reduce operations.



Re: Clojure performance question

2014-03-03 Thread Ben Mabey

On 3/2/14, 7:06 PM, Mikera wrote:
> Some perspectives (as someone who has been tuning this stuff a lot,
> from a core.matrix standpoint in particular)
>
> On Saturday, 1 March 2014 13:02:26 UTC+8, bob wrote:
> >
> > Hi,
> >
> > Can I ask a newbie question about Clojure performance?
> >
> > What makes Clojure slower than Java? It seems Clojure has about 1/4 of
> > the performance of Java in general; according to tests, in some cases it
> > might be 1/10. The reasons I can think of are:
> >
> > - the byte code is not efficient sometimes
> >
> > - the byte code might not benefit from JVM optimization
>
> Sometimes a problem, though Clojure is not too bad at bytecode
> generation and the JIT will do most of the obvious optimisations for you.
>
> > - the reflection
>
> This is extremely bad for performance, but luckily it is easy to avoid:
> - Always use *warn-on-reflection*
> - Eliminate every single reflection warning with type hints
>
> > - the immutable data structure
>
> This is often a performance *advantage*, especially when you start
> dealing with concurrency and data-driven snapshot.
>
> In the few cases where it is a problem, you can always drop back to
> using mutable Java data structures or arrays - so this isn't ever
> really an issue.
>
> > - the abstract interface design
>
> This doesn't actually cost that much. Interfaces on the JVM are
> extremely fast and very well optimised. In many cases, JIT
> optimisations make them just as fast as a static method call.
>
> > An abstract interface like seq is powerful, but it is easy to fall
> > into a performance trap.
>
> ISeq itself isn't too bad (it's just an interface, as above), but some
> of the implementations are a bit expensive.
>
> Lazy seqs for example are not so fast... and often you don't need the
> laziness. However most clojure.core functions produce lazy seqs by
> default.
>
> I wrote an "eager-map" replacement for "map" in my clojure-utils
> library to get around this problem

Is this any different than core's mapv fn?  mapv uses transient vectors 
and reduce to eagerly map into a vector.
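
To make the comparison concrete, here is a small sketch. The function my-mapv is a made-up, simplified stand-in for the single-collection case (transient vector plus reduce), not the actual clojure.core source.

```clojure
;; map returns a lazy seq; mapv returns a fully built vector.
(def lazy-result  (map inc [1 2 3]))
(def eager-result (mapv inc [1 2 3]))

;; Hypothetical simplified version of the single-collection case:
;; accumulate into a transient vector with reduce, then freeze it.
(defn my-mapv [f coll]
  (persistent!
    (reduce (fn [v x] (conj! v (f x)))
            (transient [])
            coll)))

(my-mapv inc [1 2 3])  ;=> [2 3 4]
```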


-Ben



Re: Clojure performance question

2014-03-03 Thread Mikera
On Monday, 3 March 2014 18:24:48 UTC+8, Jozef Wagner wrote:
>
>
> On Mon, Mar 3, 2014 at 3:06 AM, Mikera 
> > wrote:
>
>> ISeq itself isn't too bad (it's just an interface, as above), but some of 
>> the implementations are a bit expensive.
>>
>
> ISeq is inherently not suited for performance critical code, as next() 
> requires creation of a new object. Even if the JVM handles such ephemeral 
> instances quite well, it still cannot compete with simple iterations or 
> mutable iterators.
>

That isn't true in general: cons cells for example already have the next 
object recorded so don't cause an allocation on next(). And lazy seqs only 
cause an allocation on the first invocation of next(). Given this, ISeq is a 
perfectly decent way to traverse singly linked lists, which is in turn a 
good data structure for many use cases. It's probably even optimal in many 
tree-like cases where a lot of structural sharing is possible.

Obviously, there are cases where allocating a sequence will be slower than 
iterative techniques. But that's easy enough to fix by just using 
iterations in those cases: use the right tool for the job and all that. 

Overall - I think ISeq is perfectly decent for what it does.



Re: Clojure performance question

2014-03-03 Thread Jozef Wagner
On Mon, Mar 3, 2014 at 3:06 AM, Mikera  wrote:

> ISeq itself isn't too bad (it's just an interface, as above), but some of
> the implementations are a bit expensive.
>

ISeq is inherently not suited for performance critical code, as next()
requires creation of a new object. Even if the JVM handles such ephemeral
instances quite well, it still cannot compete with simple iterations or
mutable iterators.



Re: Clojure performance question

2014-03-02 Thread bob
Cool, Kiss.

My 2 cents: stability is important to Clojure, and the core team makes 
every change carefully, but it looks to me like `radical` ideas are 
important to the community and to Clojure as well. Maybe we need a branch 
of Clojure for `radical` attempts and experiments, as Clojure's incubator. 


On Monday, March 3, 2014 10:06:33 AM UTC+8, Mikera wrote:
>
>
>
> 2) Dynamic dispatch: If you look into the details, a lot of Clojure 
> functions have an "Object" argument and end up doing a series of instance? 
> checks or other methods to achieve dynamic dispatch. This is expensive and 
> unnecessary in many cases, since you can often prove that the argument must 
> be of a specific type (e.g. java.lang.String). This is fixable, but would 
> require smarter type inference in the Clojure compiler itself. Again this 
> is something I'm experimenting with in Kiss, it might also be fixed in a 
> future Clojure-in-Clojure compiler.
>
>
>



Re: Clojure performance question

2014-03-02 Thread Mikera
Some perspectives (as someone who has been tuning this stuff a lot, from a 
core.matrix standpoint in particular)

On Saturday, 1 March 2014 13:02:26 UTC+8, bob wrote:
>
> Hi,
>
> Can I ask a newbie question about Clojure performance?
>
> What makes Clojure slower than Java? It seems Clojure has about 1/4 of 
> the performance of Java in general; according to tests, in some cases it 
> might be 1/10. The reasons I can think of are:
>
> - the byte code is not efficient sometimes
>
> - the byte code might not benefit from JVM optimization
>
Sometimes a problem, though Clojure is not too bad at bytecode generation 
and the JIT will do most of the obvious optimisations for you.
 

> - the reflection 
>
 
This is extremely bad for performance, but luckily it is easy to avoid:
- Always use *warn-on-reflection*
- Eliminate every single reflection warning with type hints
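
A minimal sketch of the two steps above (the function names len-reflective and len-hinted are made up for illustration):

```clojure
;; Ask the compiler to report every reflective call it has to emit.
(set! *warn-on-reflection* true)

;; No hint: the compiler cannot resolve .length at compile time, so it
;; emits a reflective call and prints a reflection warning.
(defn len-reflective [s]
  (.length s))

;; ^String hint: the call compiles to a direct method invocation.
(defn len-hinted [^String s]
  (.length s))

(len-reflective "hello")  ;=> 5
(len-hinted "hello")      ;=> 5
```

Both versions return the same value; only the way the method call is compiled differs.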
 

> - the immutable data structure
>
 
This is often a performance *advantage*, especially when you start dealing 
with concurrency and data-driven snapshot.

In the few cases where it is a problem, you can always drop back to using 
mutable Java data structures or arrays - so this isn't ever really an issue.
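
As a sketch of what dropping back to arrays can look like (the function name sum-doubles is made up for illustration), areduce iterates over a primitive double array without boxing or sequence allocation:

```clojure
;; Hypothetical helper: sums a primitive double[] in place.
;; The ^doubles hint avoids reflection on aget/alength.
(defn sum-doubles ^double [^doubles xs]
  (areduce xs i acc 0.0 (+ acc (aget xs i))))

(sum-doubles (double-array [1.0 2.0 3.5]))  ;=> 6.5
```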
 

> - the abstract interface design
>
 
This doesn't actually cost that much. Interfaces on the JVM are extremely 
fast and very well optimised. In many cases, JIT optimisations make them 
just as fast as a static method call.
 

>
> An abstract interface like seq is powerful, but it is easy to fall into 
> a performance trap.
>
 
ISeq itself isn't too bad (it's just an interface, as above), but some of 
the implementations are a bit expensive.

Lazy seqs for example are not so fast... and often you don't need the 
laziness. However most clojure.core functions produce lazy seqs by default. 

I wrote an "eager-map" replacement for "map" in my clojure-utils library to 
get around this problem

 

>
> And it seems to me that it is easy to write a slow Clojure program. I know 
> the efficiency of code depends on the coder, and you can sometimes write 
> code faster than Java, but you need to know a lot of deep and tricky 
> things, and then Clojure is not the fun Clojure any more.
>
>
> Thanks
>
>
There are also a couple of other general performance issues that can make 
Clojure noticeably slower than Java for a lot of typical code:

1) Dynamic var lookup - this is expensive because all var accesses need to 
go via a var dereference. This prevents many JVM optimisations, and it 
affects pretty much every non-inlined function call in regular Clojure 
code.  Fixing this would require eliminating the var-based namespace model 
- something I've been experimenting with in my experimental language Kiss, 
which eliminates vars and uses immutable namespaces. This approach looks 
promising, but is a pretty radical change so not sure if it will ever get 
into Clojure itself.
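
The indirection can be observed directly. In this sketch (greet, caller, before and after are made-up names) the caller picks up a new root value for the var without being recompiled, which shows that each call goes through the var. It assumes default compiler settings; the direct-linking option added in later Clojure releases changes this behaviour for code compiled with it.

```clojure
(defn greet [] "hello")

;; The call to greet is compiled as: deref the var #'greet, then invoke.
(defn caller [] (greet))

(def before (caller))

;; Swap the var's root value; caller itself is not recompiled.
(alter-var-root #'greet (constantly (fn [] "hi")))

(def after (caller))

before  ;=> "hello"
after   ;=> "hi"
```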

2) Dynamic dispatch: If you look into the details, a lot of Clojure 
functions have an "Object" argument and end up doing a series of instance? 
checks or other methods to achieve dynamic dispatch. This is expensive and 
unnecessary in many cases, since you can often prove that the argument must 
be of a specific type (e.g. java.lang.String). This is fixable, but would 
require smarter type inference in the Clojure compiler itself. Again this 
is something I'm experimenting with in Kiss, it might also be fixed in a 
future Clojure-in-Clojure compiler.




Re: Clojure performance question

2014-03-02 Thread Shantanu Kumar


On Monday, 3 March 2014 02:18:39 UTC+5:30, tbc++ wrote:
>
> How are you running these tests? The "correct" way to benchmark such 
> things is via a real benchmark framework (such as criterium) then compile 
> your clojure app to a jar (perhaps via lein uberjar) and finally run it via 
> a bare java invocation: java -jar my.jar. 
>
> Lein for example sometimes uses sub-par JVM settings, trading runtime 
> performance for startup speed. 
>

Relevant bits from my project.clj are below:

  :dependencies [[org.clojure/clojure "1.5.1"]
                 [criterium "0.4.3"]]
  :global-vars {*warn-on-reflection* true
                *assert* false
                *unchecked-math* true}
  :jvm-opts ^:replace ["-server" "-Xmx1g"]

I believe this overrides Lein's default tiered compilation setting. I 
bench'ed both Java and Clojure code using Criterium.

Shantanu

>



Re: Clojure performance question

2014-03-02 Thread Timothy Baldridge
How are you running these tests? The "correct" way to benchmark such things
is via a real benchmark framework (such as criterium) then compile your
clojure app to a jar (perhaps via lein uberjar) and finally run it via a
bare java invocation: java -jar my.jar.

Lein for example sometimes uses sub-par JVM settings, trading runtime
performance for startup speed.

Timothy


On Sun, Mar 2, 2014 at 4:59 AM, Luc Prefontaine  wrote:

> I cannot agree with this...
> Not at 100% at least.
>
> String manipulations are frequent
> enough to mandate some tuning
> if the need is obvious.
>
> Looks to me that this is the case here.
>
> Other core fns went through rewrites
> to improve performance.
>
> Simplicity has nothing to do with
> internal implementations.
>
> If someone comes up with a better
> implementation while providing
> the same behaviours as the current
> str fn, then it should make its way
> maybe in clojure.string.
>
> "fast-strings" ? Whatever it may be
> named.
>
> Luc P.
>
> > Core fns should be simple, unsurprising, and general.
> >
> > 'Improving' str may hurt simplicity, make behavior more surprising and
> > unexpected, and less general unless proven otherwise.
> >
> > On Sat, Mar 1, 2014 at 7:02 PM, bob  wrote:
> >
> > >
> > > Good point, Thanks a lot.
> > >
> > > Shall we improve the str fn in the core lib? From my point of view, the
> > > core fns should be performance sensitive.
> > >
> > >
> > >
> > > On Sunday, March 2, 2014 12:03:21 AM UTC+8, Shantanu Kumar wrote:
> > >>
> > >>
> > >>
> > >> On Saturday, 1 March 2014 15:32:41 UTC+5:30, bob wrote:
> > >>>
> > >>> Case :
> > >>>
> > >>> clojure verison:
> > >>>
> > >>> (time (dotimes [n 1000] (str n "another word"))) ;; take about
> > >>> 5000msec
> > >>>
> > >>> java version
> > >>>
> > >>> long time = System.nanoTime();
> > >>>
> > >>> for(int i=0 ; i<1000 ;i++){
> > >>> String a=i+"another word";
> > >>> }
> > >>>   System.out.println(System.nanoTime()-time);
> > >>>
> > >>>
> > >>> The java version take about 500 msecs, I thought it might be caused by
> > >>> the str implementation which is using string builder, and it might not be
> > >>> the best choice in the case of no much string to concat, and then I replace
> > >>> "another word" with 5 long strings as the parameter, however no surprise.
> > >>>
> > >>> I just wonder what make the difference, or how to find the difference.
> > >>>
> > >>
> > >> Others have added useful points to this thread. Java string concatenation
> > >> internally uses StringBuilder, so if you replace (str n "another word")
> > >> with the following:
> > >>
> > >> (let [sb (StringBuilder.)]
> > >>  (.append sb n)
> > >>  (.append sb "another word")
> > >>  (.toString sb))
> > >>
> > >> ..then the perf improves 1/4 to 1/3. Further, with the following tweak:
> > >>
> > >> (let [sb (StringBuilder. 20)]  ; because StringBuilder allocates only 16 chars by default on Oracle JRE
> > >>  (.append sb n)
> > >>  (.append sb "another word")
> > >>  (.toString sb))
> > >>
> > >> ..the perf improves from 1/3 to less than 1/2. Here we simply avoid
> > >> double allocation in StringBuilder.
> > >>
> > >> Other things I made sure were:
> > >>
> > >> 1. I used Criterium to measure
> > >> 2. I used `-server` option
> > >> 3. Made sure reflection warning was on
> > >>
> > >> Shantanu
> > >>

Re: Clojure performance question

2014-03-02 Thread Luc Prefontaine
I cannot agree with this...
Not at 100% at least.

String manipulations are frequent
enough to mandate some tuning
if the need is obvious.

Looks to me that this is the case here.

Other core fns went through rewrites
to improve performance.

Simplicity has nothing to do with
internal implementations.

If someone comes up with a better
implementation while providing
the same behaviours as the current
str fn, then it should make its way
maybe in clojure.string.

"fast-strings" ? Whatever it may be
named.

Luc P.

> Core fns should be simple, unsurprising, and general.
> 
> 'Improving' str may hurt simplicity, make behavior more surprising and
> unexpected, and less general unless proven otherwise.
> 
> On Sat, Mar 1, 2014 at 7:02 PM, bob  wrote:
> 
> >
> > Good point, Thanks a lot.
> >
> > Shall we improve the str fn in the core lib? From my point of view, the
> > core fns should be performance sensitive.
> >
> >
> >
> > On Sunday, March 2, 2014 12:03:21 AM UTC+8, Shantanu Kumar wrote:
> >>
> >>
> >>
> >> On Saturday, 1 March 2014 15:32:41 UTC+5:30, bob wrote:
> >>>
> >>> Case :
> >>>
> >>> clojure verison:
> >>>
> >>> (time (dotimes [n 1000] (str n "another word"))) ;; take about
> >>> 5000msec
> >>>
> >>> java version
> >>>
> >>> long time = System.nanoTime();
> >>>
> >>> for(int i=0 ; i<1000 ;i++){
> >>> String a=i+"another word";
> >>> }
> >>>   System.out.println(System.nanoTime()-time);
> >>>
> >>>
> >>> The java version take about 500 msecs, I thought it might be caused by
> >>> the str implementation which is using string builder, and it might not be
> >>> the best choice in the case of no much string to concat, and then I 
> >>> replace
> >>> "another word" with 5 long strings as the parameter, however no surprise.
> >>>
> >>> I just wonder what make the difference, or how to find the difference.
> >>>
> >>
> >> Others have added useful points to this thread. Java string concatenation
> >> internally uses StringBuilder, so if you replace (str n "another word")
> >> with the following:
> >>
> >> (let [sb (StringBuilder.)]
> >>  (.append sb n)
> >>  (.append sb "another word")
> >>  (.toString sb))
> >>
> >> ..then the perf improves 1/4 to 1/3. Further, with the following tweak:
> >>
> >> (let [sb (StringBuilder. 20)]  ; because StringBuilder allocates only 16 chars by default on Oracle JRE
> >>  (.append sb n)
> >>  (.append sb "another word")
> >>  (.toString sb))
> >>
> >> ..the perf improves from 1/3 to less than 1/2. Here we simply avoid
> >> double allocation in StringBuilder.
> >>
> >> Other things I made sure were:
> >>
> >> 1. I used Criterium to measure
> >> 2. I used `-server` option
> >> 3. Made sure reflection warning was on
> >>
> >> Shantanu
> >>
--
Luc Prefontaine sent by ibisMail!


Re: Clojure performance question

2014-03-02 Thread Shantanu Kumar


On Sunday, 2 March 2014 12:49:15 UTC+5:30, Shantanu Kumar wrote:
>
>
>
> On Sunday, 2 March 2014 05:32:00 UTC+5:30, bob wrote:
>>
>>
>> Good point, Thanks a lot. 
>>
>> Shall we improve the str fn in the core lib? From my point of view, the 
>> core fns should be performance sensitive.
>>
>
> If string formation is the bottleneck in your app and if you can come up 
> with a version of `str` function that works in all use-cases, then you can 
> probably `alter-var-root` the str fn with yours as long as you own the 
> responsibility.
>
> I noticed the following macro (ignore the reflection warnings) can help 
> shave some nanoseconds in a large tight loop, but I leave it to you to 
> decide how much it is really worth:
>

Just to clarify: I meant `some nanoseconds` per invocation, and for small 
strings only. The overall saving would be proportional to the number of 
occurrences and the number of args.

Shantanu



Re: Clojure performance question

2014-03-01 Thread Shantanu Kumar


On Sunday, 2 March 2014 05:32:00 UTC+5:30, bob wrote:
>
>
> Good point, Thanks a lot. 
>
> Shall we improve the str fn in the core lib? From my point of view, the 
> core fns should be performance sensitive.
>

If string formation is the bottleneck in your app and if you can come up 
with a version of `str` function that works in all use-cases, then you can 
probably `alter-var-root` the str fn with yours as long as you own the 
responsibility.
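
For illustration, here is a minimal sketch of what `alter-var-root` does. It 
deliberately works on a made-up var, `label`, instead of `clojure.core/str`, 
and `fast-label` is a hypothetical replacement, not anything from the core 
lib. Whether code that was compiled earlier picks up a replaced core function 
is something to check for your own Clojure version.

```clojure
;; Hypothetical function standing in for the one you want to replace.
(defn label [x]
  (str "item-" x))

;; Hypothetical replacement built directly on StringBuilder.
(defn fast-label [x]
  (let [sb (StringBuilder. "item-")]
    (.append sb (.toString ^Object x))
    (.toString sb)))

;; alter-var-root calls the given fn with the current root value and
;; installs whatever it returns as the new root value.
(alter-var-root #'label (fn [_old] fast-label))

(label 42) ; this call now goes through fast-label
```

The replacement has to honour every use-case of the function it replaces, 
which is the "own the responsibility" part.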

I noticed that the following macro (ignore the reflection warnings) can help 
shave some nanoseconds in a large tight loop, but I leave it to you to decide 
how much it is really worth:

(defmacro sb-str
  [& args]
  (cond (empty? args)      ""
        (= 1 (count args)) (let [x (first args)]
                             `(let [y# ~x]
                                (cond (nil? y#)                ""
                                      (instance? Boolean   y#) (.toString (Boolean.   y#))
                                      (instance? Byte      y#) (.toString (Byte.      y#))
                                      (instance? Character y#) (.toString (Character. y#))
                                      (instance? Double    y#) (.toString (Double.    y#))
                                      (instance? Float     y#) (.toString (Float.     y#))
                                      (instance? Integer   y#) (.toString (Integer.   y#))
                                      (instance? Long      y#) (.toString (Long.      y#))
                                      (instance? Short     y#) (.toString (Short.     y#))
                                      :otherwise               (.toString y#))))
        :otherwise         (let [sb          (gensym)
                                 each-append #(list '.append sb %)
                                 all-appends (map each-append args)]
                             `(let [~sb (StringBuilder.)]
                                ~@all-appends
                                (.toString ~sb)))))

Note that it is a macro, not a function, so you cannot use it with 
higher-order functions. You could possibly use `definline` instead of a 
macro, but then you lose varargs.
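
If you do need something you can pass around, a plain function is one 
option. The sketch below is only an illustration: `sb-str-fn` is a made-up 
name, and because it takes varargs it allocates an argument seq on every 
call, so it probably gives back some of what the macro saves.

```clojure
(defn sb-str-fn
  "Concatenates the string forms of args with one StringBuilder; nil adds nothing."
  ^String [& args]
  (let [sb (StringBuilder.)]
    (doseq [a args]
      (when-not (nil? a)
        ;; The ^Object hint lets the compiler resolve .toString, and
        ;; therefore .append(String), without reflection.
        (.append sb (.toString ^Object a))))
    (.toString sb)))

;; Being a function, it can be handed to higher-order functions:
(mapv sb-str-fn [1 2 3] (repeat "another word"))
```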

Shantanu



Re: Clojure performance question

2014-03-01 Thread Gary Trakhman
Core fns should be simple, unsurprising, and general.

'Improving' str may hurt simplicity, make its behavior more surprising, and
make it less general, unless proven otherwise.




Re: Clojure performance question

2014-03-01 Thread bob

Good point, Thanks a lot. 

Shall we improve the str fn in the core lib? From my point of view, the 
core fns should be performance sensitive.






Re: Clojure performance question

2014-03-01 Thread Shantanu Kumar


On Saturday, 1 March 2014 15:32:41 UTC+5:30, bob wrote:
>
> Case :
>
> clojure verison:
>
> (time (dotimes [n 1000] (str n "another word"))) ;; take about 5000msec
>
> java version
>
> long time = System.nanoTime();
>
> for(int i=0 ; i<1000 ;i++){
> String a=i+"another word";
> }
>   System.out.println(System.nanoTime()-time); 
>  
>
> The java version take about 500 msecs, I thought it might be caused by the 
> str implementation which is using string builder, and it might not be the 
> best choice in the case of no much string to concat, and then I replace 
> "another word" with 5 long strings as the parameter, however no surprise.
>
> I just wonder what make the difference, or how to find the difference.
>

Others have added useful points to this thread. Java string concatenation 
internally uses StringBuilder, so if you replace (str n "another word") 
with the following:

(let [sb (StringBuilder.)]
 (.append sb n)
 (.append sb "another word")
 (.toString sb))

..then the perf improves 1/4 to 1/3. Further, with the following tweak:

;; because StringBuilder allocates only 16 chars by default on Oracle JRE
(let [sb (StringBuilder. 20)]
 (.append sb n)
 (.append sb "another word")
 (.toString sb))

..the perf improves from 1/3 to less than 1/2. Here we simply avoid double 
allocation in StringBuilder.
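
To make the capacity point concrete, here is a small sketch. The helper name 
`concat-presized` and the sample number are made up for illustration; the 
capacity of 20 is the one used above.

```clojure
;; Default capacity is 16 chars, per the java.lang.StringBuilder docs.
(.capacity (StringBuilder.))    ; 16

(defn concat-presized
  "Joins a number and a suffix using a builder sized up front (20 chars here)."
  ^String [n ^String suffix]
  (let [sb (StringBuilder. 20)]
    (.append sb (long n))        ; primitive long, so no boxing on append
    (.append sb suffix)
    (.toString sb)))

;; 7 digits + 12 chars = 19 chars: too big for the default 16, fits in 20.
(concat-presized 1234567 "another word")
```

With short numbers the result fits in the default 16 chars anyway, so the 
pre-sizing only matters once the combined length goes past that.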

Other things I made sure were:

1. I used Criterium to measure
2. I used `-server` option
3. Made sure reflection warning was on

Shantanu



Re: Clojure performance question

2014-03-01 Thread Jozef Wagner
From my experience, you can get 1/1.2 performance (20% slower) with Clojure 
by following these steps:

* Choose the right Clojure idioms

That means never using lazy seqs if you care about Java-like performance. 
For fast looping, use reducers; for fast inserts, use transients. 
Postpone the data creation until the last step. Compose functions instead 
of piping collections containing intermediate results.
Know the tradeoffs between similar functions (e.g. rseq and reverse)
Use custom datatype with ^:unsynchronized-mutable for performance critical 
mutations, instead of atoms or vars.
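
A rough sketch of two of these idioms, with made-up function names: building 
a vector through a transient instead of a lazy seq, and composing functions 
so that only one intermediate seq is created instead of two. How much either 
one gains depends on the workload, so measure before and after.

```clojure
;; Lazy pipeline: builds a lazy seq first, then copies it into a vector.
(defn squares-lazy [n]
  (vec (map #(* % %) (range n))))

;; Transient: mutates a temporary vector in place, then freezes it.
(defn squares-transient [n]
  (loop [i 0, acc (transient [])]
    (if (< i n)
      (recur (inc i) (conj! acc (* i i)))
      (persistent! acc))))

;; Piping collections: two intermediate seqs.
(defn total-piped [xs]
  (reduce + (map inc (map #(* 2 %) xs))))

;; Composing functions: one intermediate seq.
(defn total-composed [xs]
  (reduce + (map (comp inc #(* 2 %)) xs)))
```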

* Use domain knowledge to your advantage

Collection has a random access? Use it to your advantage. 
Is it backed by array? Do a bulk copy instead of one element per iteration.
Is memory killing you? Use subcollections which share underlying data with 
original ones.
If shooting for performance, you do not want the most generic solution. 
Search for more performant solutions which have trade-offs you can live 
with.
Do not use polymorphic dispatch if you have a closed set of options. If you 
do, prefer protocols to multimethods.

* Trade between memory and CPU (and precision)

Have a CPU intensive pure function? Memoize.
Using too much memory? Use lazy seqs, delays, dropping caches.
(Do not care for precision? Round, truncate and approximate. Use decayed 
collections, frugal streaming, ...)
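
As a small illustration of the memoize trade-off (the names `slow-square`, 
`fast-square` and `call-count` are made up), the counter below shows that 
the wrapped function runs only once per distinct argument. The cache keeps 
every result it has seen, which is the memory side of the trade.

```clojure
(def call-count (atom 0))

;; Stand-in for a CPU-intensive pure function.
(defn slow-square [n]
  (swap! call-count inc)
  (* n n))

(def fast-square (memoize slow-square))

(fast-square 12)   ; computed
(fast-square 12)   ; served from the cache
@call-count        ; 1
```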

* Parallelize

Clojure's approach to concurrency allows you to focus on the problem 
and not fight much with the synchronization details. 
Using and customizing the fold idiom is a good start.
(A good addition to Clojure would be a simple abstraction on 
top of executors and the fork/join pool for even more fine-tuning.)
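
For readers who have not used fold yet, a minimal sketch with 
clojure.core.reducers follows. The data (`numbers`) is made up, and on a 
collection this small the parallel version will not be faster; it only shows 
the shape of the idiom.

```clojure
(require '[clojure.core.reducers :as r])

(def numbers (vec (range 1 1001)))

;; Sequential baseline.
(reduce + (map inc numbers))

;; r/map builds no intermediate collection; r/fold may split the vector
;; into chunks, reduce them on a fork/join pool, and combine with +.
(r/fold + (r/map inc numbers))
```

Both expressions give the same sum; fold needs the combining function to be 
associative, which + is.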

* Know your host

Use type hints to get rid of reflection and boxing. Most of the time you can 
eliminate all reflection, but it is very hard to eliminate every automatic 
boxing/unboxing without dropping down to Java.
Heavy IO use? Build your abstractions around java.nio.Buffer.
Looping with ints is faster than with longs.
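
A minimal sketch of the reflection point, using two made-up functions. With 
`*warn-on-reflection*` switched on, compiling the first one prints a 
reflection warning and the second one does not.

```clojure
(set! *warn-on-reflection* true)

;; No hint: the compiler cannot resolve .length at compile time,
;; so the call is made through reflection at run time.
(defn len-reflective [s]
  (.length s))

;; Hinted: compiles to a direct call on java.lang.String.
(defn len-hinted [^String s]
  (.length s))
```

Both return the same value; only the way the method gets called differs.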


While every part of your code should be idiomatic and well designed, 
apply heavy optimization 'tricks' only to those parts of the code 
that yield the most benefit. Your time can be spent on more useful things 
than optimizing some auxiliary functionality.

Best,
Jozef


On Saturday, March 1, 2014 6:02:26 AM UTC+1, bob wrote:
>
> Hi,
>
> Can I ask a newbie question about clojure performance?
>
> What make clojure performance slow than java?, it seems clojure has the 
> 1/4 performance compared to java in general, according to  tests, some 
> cases it might be 1/10. the reasons I can think out are 
>
> - the byte code is not efficient sometimes
> - the byte code might not enjoy the jvm optimization
> - the reflection 
> - the immutable data structure
> - the abstract interface design
>
> The abstract interface like seq offers its power, but it is easy to drop 
> in the performance trap.
>
> And it seems to me that it is easy to write a slow clojure program, I know 
> the efficiency of code depends on coder, you can write the code faster than 
> java sometimes,but  need to know a lot of deep thing and tricky, and 
> clojure is not the funny clojure any more.
>
>
> Thanks
>
>



Re: Clojure performance question

2014-03-01 Thread dennis zhuang
Yep, you are right. I inspected the bytecode generated from this Clojure code:

 (loop [n 0]
   (when (< n 1000)
     (-> (StringBuilder.) (.append n) (.append "another word") (.toString))
     (recur (unchecked-inc n))))

It is:


   L4
    LINENUMBER 4 L4
    LLOAD 1
    LDC 1000
    LCMP
    IFGE L5
   L6
    LINENUMBER 5 L6
   L7
    LINENUMBER 5 L7
   L8
    LINENUMBER 5 L8
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    CHECKCAST java/lang/StringBuilder
    LLOAD 1
    INVOKEVIRTUAL java/lang/StringBuilder.append (J)Ljava/lang/StringBuilder;
    CHECKCAST java/lang/StringBuilder
    LDC "another word"
    CHECKCAST java/lang/String
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    CHECKCAST java/lang/StringBuilder
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    POP
   L9
    LINENUMBER 6 L9
    LLOAD 1
    LCONST_1
    LADD
    LSTORE 1
    GOTO L2
    GOTO L10
   L11
    POP
   L5
    ACONST_NULL


It's almost the same as the bytecode compiled from Java, except that it uses
long-typed instructions (LCMP, LADD, etc.) and some CHECKCAST (type cast)
instructions.






Re: Clojure performance question

2014-03-01 Thread Jozef Wagner
Clojure math functions compile down to the same JVM instructions as the Java
operators do. See http://galdolber.tumblr.com/post/77153377251/clojure-intrinsics
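
As a rough sketch of code where this applies (the function `sum-below` is 
made up for illustration): when the loop locals are primitive longs, 
comparison and addition can be compiled to plain long instructions, like the 
LCMP and LADD in the bytecode listing posted in this thread, rather than to 
function calls on boxed numbers.

```clojure
(defn sum-below
  "Sums the longs from 0 (inclusive) to n (exclusive) with primitive arithmetic."
  ^long [^long n]
  (loop [i 0, acc 0]            ; both locals are primitive longs
    (if (< i n)
      (recur (unchecked-inc i) (unchecked-add acc i))
      acc)))

(sum-below 100)
```

The unchecked variants skip the overflow check, so they are only appropriate 
when you know the values stay within the long range.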


Re: Clojure performance question

2014-03-01 Thread dennis zhuang
I think the remaining overhead of the Clojure sample code is that operators in
Java such as '++' and '<' are just JVM instructions (iinc and if_icmpge),
but they are both functions in Clojure, and they will be called by the
invokevirtual instruction. That costs much more performance.





Re: Clojure performance question

2014-03-01 Thread dennis zhuang
I forgot to note that I tested the Java sample and the Clojure sample code with
the same JVM option, '-server'.




Re: Clojure performance question

2014-03-01 Thread dennis zhuang
The Java statement 'String a=i+"another word";' is also compiled to use
StringBuilder; see the bytecode printed by javap -v:

   Code:
      stack=5, locals=5, args_size=1
         0: invokestatic  #2   // Method java/lang/System.nanoTime:()J
         3: lstore_1
         4: iconst_0
         5: istore_3
         6: iload_3
         7: ldc           #3   // int 1000
         9: if_icmpge     39
        12: new           #4   // class java/lang/StringBuilder
        15: dup
        16: invokespecial #5   // Method java/lang/StringBuilder."<init>":()V
        19: iload_3
        20: invokevirtual #6   // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        23: ldc           #7   // String another word
        25: invokevirtual #8   // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        28: invokevirtual #9   // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        31: astore        4
        33: iinc          3, 1
        36: goto          6
        39: getstatic     #10  // Field java/lang/System.out:Ljava/io/PrintStream;
        42: invokestatic  #2   // Method java/lang/System.nanoTime:()J
        45: lload_1
        46: lsub
        47: l2d
        48: ldc2_w        #11  // double 1.0E9d
        51: ddiv
        52: invokevirtual #13  // Method java/io/PrintStream.println:(D)V
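
For readers who do not read bytecode, instructions 12 through 28 of the
listing correspond roughly to the hand-written StringBuilder chain below.
This is only a sketch: the class and method names (ConcatSketch, viaPlus,
viaBuilder) are made up for illustration, and compilers from Java 9 onward
may compile the + form differently.

```java
public class ConcatSketch {
    // What the programmer writes. The javac of that era expands this
    // into the StringBuilder calls seen in the bytecode listing.
    static String viaPlus(int i) {
        return i + "another word";
    }

    // The same steps written out by hand:
    // new StringBuilder, append(int), append(String), toString.
    static String viaBuilder(int i) {
        return new StringBuilder().append(i).append("another word").toString();
    }

    public static void main(String[] args) {
        // Both forms build the same string.
        System.out.println(viaPlus(7).equals(viaBuilder(7)));
    }
}
```

Either way, each iteration allocates one StringBuilder and one String, which
is the baseline the Clojure version is being compared against.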


I think the performance hotspot in this simple example is object
allocation/GC and function-call overhead. The str function creates
an anonymous function on every call to concatenate its argument strings:

(^String [x & ys]
 ((fn [^StringBuilder sb more]
    (if more
      (recur (. sb  (append (str (first more)))) (next more))
      (str sb)))
  (new StringBuilder (str x)) ys)))

And we all know that a function in Clojure is a Java object allocated on
the heap. Another overhead is calling the function, which is a virtual
method call.

By watching the GC statistics using 'jstat -gcutil  2000', I found
that the Clojure sample ran about 670 minor GCs, but the Java sample only
120 minor GCs.
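
A similar count can be taken from inside the process instead of with jstat.
The sketch below uses the standard java.lang.management API; the class name
GcCountSketch and the workload method are hypothetical stand-ins. It counts
all collections the JVM reports, young and old together, so the numbers are
not directly comparable to jstat's minor-GC column.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCountSketch {
    // Sum of collection counts over every collector the JVM exposes.
    // getCollectionCount() returns -1 when the count is undefined,
    // so only positive values are added.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Hypothetical workload: the same kind of short-lived string garbage
    // as the loop under discussion. Returns a checksum of the lengths.
    static int workload(int iterations) {
        int totalLength = 0;
        for (int i = 0; i < iterations; i++) {
            totalLength += (i + "another word").length();
        }
        return totalLength;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        int checksum = workload(5_000_000); // assumed size, not the poster's
        long after = totalCollections();
        System.out.println("collections during workload: " + (after - before)
                + " (checksum " + checksum + ")");
    }
}
```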

An improved Clojure version, whose performance is close to the Java sample:

user=> (time (dotimes [n 1000] (-> (StringBuilder.) (.append n)
(.append "another word") (.toString))))
"Elapsed time: 1009.942 msecs"







-- 
庄晓丹
Email:killme2...@gmail.com xzhu...@avos.com
Site:   http://fnil.net
Twitter:  @killme2008

Re: Clojure performance question

2014-03-01 Thread bob
Case:

Clojure version:

(time (dotimes [n 1000] (str n "another word"))) ;; takes about 5000 msec

Java version:

long time = System.nanoTime();

for (int i = 0; i < 1000; i++) {
    String a = i + "another word";
}
System.out.println(System.nanoTime() - time);


The Java version takes about 500 msecs. I thought it might be caused by the
str implementation, which uses a StringBuilder and might not be the best
choice when there is not much string to concatenate, so I replaced
"another word" with 5 long strings as the parameters, but the result was no
different.

I just wonder what makes the difference, or how to find it.
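
One way to start looking for the difference is to time several rounds of the
same loop, so that JIT warm-up does not dominate a single measurement. The
sketch below is an assumption-laden illustration, not a rigorous benchmark:
the class name WarmupTimingSketch, the round count, and the iteration count
are all made up, and a dedicated benchmarking tool would be more reliable.

```java
public class WarmupTimingSketch {
    // Written to on every round so the result of the loop is used,
    // which makes it harder for the JIT to discard the work.
    static volatile long sink;

    // The loop from the Java version above, returning a checksum.
    static long concatLoop(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            String a = i + "another word";
            checksum += a.length();
        }
        return checksum;
    }

    // Times one round and returns the elapsed nanoseconds.
    static long timeRound(int iterations) {
        long start = System.nanoTime();
        sink = concatLoop(iterations);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // assumed size, not the poster's
        // Early rounds include interpretation and compilation time;
        // later rounds are closer to steady-state speed.
        for (int round = 1; round <= 5; round++) {
            long ms = timeRound(iterations) / 1_000_000;
            System.out.println("round " + round + ": " + ms + " ms");
        }
    }
}
```

The same idea applies on the Clojure side: wrapping the dotimes form in an
outer loop and comparing the later rounds gives a fairer picture than one
cold run.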

Thanks



On Saturday, March 1, 2014 1:26:38 PM UTC+8, Shantanu Kumar wrote:
>
> I have seen (and I keep seeing) a ton of Java code that performs poorly. 
> Empirically, it's equally easy to write a slow Java app. You always need a 
> discerning programmer to get good performance from any language/tool.
>
> Numbers like 1/4 or 1/10 can be better discussed in presence of the 
> use-cases and perf test cases. Most of the problems you listed can be 
> mitigated by `-server` JIT, avoiding reflection, transients, loop-recur, 
> arrays, perf libraries and some Java code.
>
> Shantanu
>


Re: Clojure performance question

2014-02-28 Thread Shantanu Kumar
I have seen (and I keep seeing) a ton of Java code that performs poorly. 
Empirically, it's equally easy to write a slow Java app. You always need a 
discerning programmer to get good performance from any language/tool.

Numbers like 1/4 or 1/10 can be better discussed in presence of the 
use-cases and perf test cases. Most of the problems you listed can be 
mitigated by `-server` JIT, avoiding reflection, transients, loop-recur, 
arrays, perf libraries and some Java code.

Shantanu



Clojure performance question

2014-02-28 Thread bob
Hi,

Can I ask a newbie question about Clojure performance?

What makes Clojure slower than Java? It seems Clojure has about 1/4 of the
performance of Java in general, according to tests; in some cases it might
be 1/10. The reasons I can think of are:

- the bytecode is sometimes not efficient
- the bytecode might not benefit from JVM optimizations
- reflection
- the immutable data structures
- the abstract interface design

Abstract interfaces like seq are powerful, but they make it easy to fall
into a performance trap.

It also seems to me that it is easy to write a slow Clojure program. I know
the efficiency of code depends on the coder, and you can sometimes write
code that is faster than Java, but you need to know a lot of deep and tricky
details, and then Clojure is not the fun Clojure any more.


Thanks


Re: Help on a Clojure performance question

2011-07-08 Thread Kenny Stone
The JVM is very slow to start.  Try measuring around your method calls instead.

Also try running it for a long enough time to see the JVM GC kick the butt
of Python's GC...
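
A minimal sketch of what measuring around the call can look like on the JVM,
with the time already spent on startup printed next to it. The class name
StartupVsWorkSketch and the countWords stand-in are hypothetical; the uptime
call is the standard RuntimeMXBean API.

```java
import java.lang.management.ManagementFactory;

public class StartupVsWorkSketch {
    // Hypothetical stand-in for the real work:
    // count whitespace-separated words in a string.
    static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // Milliseconds since the JVM started, i.e. time spent before
        // this code got to run at all.
        long uptimeBeforeMs = ManagementFactory.getRuntimeMXBean().getUptime();

        // Time only the call of interest.
        long start = System.nanoTime();
        int words = countWords("the hound of the baskervilles");
        long workNs = System.nanoTime() - start;

        System.out.println("JVM uptime before work: " + uptimeBeforeMs + " ms");
        System.out.println("work itself: " + workNs / 1_000 + " us for " + words + " words");
    }
}
```

A shell-level `time` sees the sum of both numbers, which is why it overstates
the cost of short programs.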

On Fri, Jul 8, 2011 at 6:19 PM, Michael Klishin  wrote:

> 2011/7/9 Christopher 
>
>> % time cake run mapper.clj < input.txt
>> real    0m3.573s
>> user    0m2.031s
>> sys     0m1.528s
>>
>
> These numbers include JVM startup overhead (which is significant compared
> to Python startup overhead).
>
> --
> MK
>
> http://github.com/michaelklishin
> http://twitter.com/michaelklishin
>
>

Re: Help on a Clojure performance question

2011-07-08 Thread Christopher
Hi David,

Thanks for the comments and the code rewrite. This is excellent
information. I just tried it out on my own system and got the same
results. This is a really great example of how to optimize Clojure
code. I'm considering using Clojure for some more research-oriented
work where I will need to analyze large chunks of data and getting
insight like this into how to properly optimize the code is
invaluable.

Thanks a bunch you guys for all the help, I really appreciate it and I
learned quite a bit.

Christopher


Re: Help on a Clojure performance question

2011-07-08 Thread Christopher
Thanks Benny. I tried again without using cake and just compiling the
code into a jar and it does execute much better. I guess using the
cake run command as a way to avoid the JVM startup overhead isn't the
best option for writing highly performant code. I was kind of hoping
that after the first run, cake was compiling the code and loading the
classes into the running JVM to avoid most of the overhead of a fresh
startup, but I guess it is instead executing it as a script or
something. Good to know!

Thanks.



Re: Help on a Clojure performance question

2011-07-08 Thread David Nolen
Here's a very ugly low-level version just to show that it can be done:

(ns clj-play.mapper
  (:use [clojure.java.io :only [reader]])
  (:use [clojure.string :only [split]])
  (:gen-class))

(set! *warn-on-reflection* true)

(defn mapper [^java.io.BufferedReader r ^java.io.OutputStreamWriter out]
  (loop [^String line (.readLine r)]
    (when line
      (doseq [^String word (.split line "\\s+")]
        (.append out (.concat word "\t1\n"))
        (.flush out))
      (recur (.readLine r)))))

(defn -main
  []
  (mapper (reader *in*) *out*))

I see that the Python version and the Clojure version now take about the
same time (~14.7-14.8s) for 20 copies of the text, so this looks like it's
pretty much IO bound at this point.
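
For comparison, the same mapper written directly against the Java classes
that the type hints name might look like the sketch below. It is not a
translation of the Clojure above: the class name MapperSketch and the
mapString helper are made up, empty tokens are skipped, and the output is
flushed once at the end rather than after every word.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class MapperSketch {
    // Reads lines from 'in' and writes "<word>\t1" per word to 'out'.
    static void mapper(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            for (String word : line.split("\\s+")) {
                // split yields an empty first token for leading whitespace
                if (!word.isEmpty()) {
                    out.write(word);
                    out.write("\t1\n");
                }
            }
        }
        out.flush(); // one flush at the end instead of one per word
    }

    // Hypothetical helper for trying the mapper on an in-memory string.
    static String mapString(String text) {
        StringWriter out = new StringWriter();
        try {
            mapper(new StringReader(text), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        mapper(new InputStreamReader(System.in),
               new BufferedWriter(new OutputStreamWriter(System.out)));
    }
}
```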

David


Re: Help on a Clojure performance question

2011-07-08 Thread David Nolen
Running a program like that with cake run is awful, use AOT:

(ns clj-play.mapper
  (:use [clojure.java.io :only [reader]])
  (:use [clojure.string :only [split]])
  (:gen-class))

(defn mapper [lines]
  (doseq [line lines]
    (doseq [word (split line #"\s+")]
      (println (str word "\t1")))))

(defn -main
  []
  (mapper (line-seq (reader *in*))))

Run with something like:

time java -server -cp ./classes:lib/clojure-1.3.0-beta1.jar foo.mapper < input.txt

I see that this takes around 16s with 20 copies of the text; Python takes
13s. Use some lower-level Java facilities and you'll likely trounce the
Python.

David

On Fri, Jul 8, 2011 at 7:05 PM, Christopher  wrote:

> Hi all,
>
> I have recently been watching a set of videos from O'Reilly on
> MapReduce. The author of the series is using Python for all of the
> examples, but, in an effort to use Clojure more, I've been following
> along and writing my code in Clojure. When I implemented the mapper
> function that he described in both languages, I noticed that the
> Python version was running quite a bit faster and I was wondering if
> you all could help me understand why that is the case. I've pasted the
> code for each solution below. Also, I am using cake to run the Clojure
> code so my thoughts are, since it keeps a JVM up and running at all
> times, that should remove the JVM startup time from the equation. The
> input file that I am using is the Hound of the Baskervilles from
> Project Gutenberg (http://www.gutenberg.org/cache/epub/2852/pg2852.txt).
> I've also noticed that with an even longer text as input
> (for example, I copied the text of the input.txt 10 times into a file)
> the Clojure code slows significantly more. In some cases I had to just
> stop the code with a Ctrl-c. Any ideas you all have on what could be
> causing this would be great. I'm not trying to start any battles
> between Python and Clojure, as I love them both, I'm strictly trying
> to learn how to be a better programmer in Clojure.
>
> Thanks ahead of time for any help you all can give.
>
> Christopher
>
> ;; mapper.clj
>
> (use ['clojure.java.io :only '(reader)])
> (use ['clojure.string :only '(split)])
>
> (defn mapper [lines]
>   (doseq [line lines]
>     (doseq [word (split line #"\s+")]
>       (println (str word "\t1")))))
>
> (mapper (line-seq (reader *in*)))
>
>
> I am running the code above with the following command and I get the
> output below
>
> % time cake run mapper.clj < input.txt
> real    0m3.573s
> user    0m2.031s
> sys     0m1.528s
>
>
> # mapper.py
>
> #!/usr/bin/env python
>
> import sys
>
> def mapper(lines):
>     for line in lines:
>         words = line.split()
>         for word in words:
>             print "{0}\t1".format(word)
>
> def main():
>     mapper(sys.stdin)
>
> if __name__ == '__main__':
>     main()
>
> % time mapper.py < input.txt
> real    0m0.661s
> user    0m0.105s
> sys     0m0.083s
>

Re: Help on a Clojure performance question

2011-07-08 Thread Benny Tsai
Hi Christopher,

I ran your code with only one modification, using the "time" macro to 
measure the execution time of the mapper function itself:

(use ['clojure.java.io :only '(reader)])
(use ['clojure.string :only '(split)])

(defn mapper [lines]
  (doseq [line lines]
    (doseq [word (split line #"\s+")]
      (println (str word "\t1")))))

(time (mapper (line-seq (reader *in*))))

Processing a file that contained 1 copy of the "Hound of the Baskervilles"
text took 1.9 seconds.
Processing a file that contained 2 copies of the text took 2.8 seconds.
Processing a file that contained 4 copies of the text took 3.8 seconds.

I did not use cake, but ran mapper.clj via a direct call to the java
executable.  So I think the times you're seeing are due to either cake or
the way the timing is done.

Re: Help on a Clojure performance question

2011-07-08 Thread Christopher
Hi Ken,

Thanks for the comment. I tried what you suggested, but I am not
getting any reflection warnings. That said, comments like this are
exactly what I am looking for; I had no idea that you could turn on
checking for reflection issues. I'd love it if I could find a way to
speed this piece of code up, but, at the end of the day, what I am
really interested in is learning all the different ways that good
Clojure programmers go about analyzing their code for performance
issues. So, thanks a bunch for the tip and, please, keep them coming.

Christopher

On Jul 8, 4:17 pm, Ken Wesson  wrote:
> On Fri, Jul 8, 2011 at 7:05 PM, Christopher  wrote:
> > ;; mapper.clj
>
> > (use ['clojure.java.io :only '(reader)])
> > (use ['clojure.string :only '(split)])
>
> > (defn mapper [lines]
> >   (doseq [line lines]
> >     (doseq [word (split line #"\s+")]
> >       (println (str word "\t1")))))
>
> > (mapper (line-seq (reader *in*)))
>
> Try (set! *warn-on-reflection* true) at your REPL and then evaluating
> the above. One of the commonest causes of slow Clojure performance is
> reflection, which can generally be avoided with judicious application
> of type hints.
>
> --
> Protege: What is this seething mass of parentheses?!
> Master: Your father's Lisp REPL. This is the language of a true
> hacker. Not as clumsy or random as C++; a language for a more
> civilized age.

Re: Help on a Clojure performance question

2011-07-08 Thread Alan Malloy
On Jul 8, 4:17 pm, Ken Wesson  wrote:
> On Fri, Jul 8, 2011 at 7:05 PM, Christopher  wrote:
> > ;; mapper.clj
>
> > (use ['clojure.java.io :only '(reader)])
> > (use ['clojure.string :only '(split)])
>
> > (defn mapper [lines]
> >   (doseq [line lines]
> >     (doseq [word (split line #"\s+")]
> >       (println (str word "\t1")))))
>
> > (mapper (line-seq (reader *in*)))
>
> Try (set! *warn-on-reflection* true) at your REPL and then evaluating
> the above. One of the commonest causes of slow Clojure performance is
> reflection, which can generally be avoided with judicious application
> of type hints.
>

There's no interop here, so I 100% guarantee he doesn't have any
reflection warnings.
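
For context, the kind of call the warning is about can be sketched in Java
as the difference between the two methods below. The names ReflectionSketch,
direct, and reflective are made up, and this is only an approximation of
what an un-hinted interop call costs. The quoted mapper calls only Clojure
functions, so nothing like the second method is involved there.

```java
import java.lang.reflect.Method;

public class ReflectionSketch {
    // Direct call: the compiler knows the receiver is a String,
    // so the method is resolved at compile time.
    static String direct(String word) {
        return word.concat("\t1");
    }

    // Reflective call: the receiver's type is unknown, so the method
    // is looked up by name at run time before it can be invoked.
    static String reflective(Object word) {
        try {
            Method concat = word.getClass().getMethod("concat", String.class);
            return (String) concat.invoke(word, "\t1");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same result either way; only the lookup cost differs.
        System.out.println(direct("hound").equals(reflective("hound")));
    }
}
```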

Re: Help on a Clojure performance question

2011-07-08 Thread Christopher
Hi Michael,

Thanks for the comments, though, I want to point out that I'm using
cake to run the program which keeps an instance of the JVM spun up at
all times. That should remove the startup time, unless I am
misunderstanding how cake works. Also, the startup time should be
constant (say a few seconds or so) so that wouldn't account for why
the program becomes "exponentially" slower when I increase the amount
of text that is processed.

Christopher
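
One way to check the point above is to time the same work at several input sizes: a constant startup cost appears as a fixed offset, while a superlinear slowdown appears as a rising cost per line. The following is only a rough sketch in Python 3 (not the Python 2 used elsewhere in this thread); `time_mapper`, `scaling_table` and the sample text are hypothetical names and data chosen for illustration, and output goes to memory rather than stdout, so the numbers will not match a real run.

```python
import io
import time


def mapper(lines, out):
    """Write one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            out.write("%s\t1\n" % word)


def time_mapper(lines):
    """Return wall-clock seconds to run mapper over lines (output kept in memory)."""
    out = io.StringIO()
    start = time.perf_counter()
    mapper(lines, out)
    return time.perf_counter() - start


def scaling_table(base_lines, factors=(1, 2, 4, 8)):
    """Return (factor, seconds, seconds per line) for each input size."""
    rows = []
    for factor in factors:
        lines = base_lines * factor
        seconds = time_mapper(lines)
        rows.append((factor, seconds, seconds / len(lines)))
    return rows


if __name__ == "__main__":
    # Hypothetical stand-in for input.txt; any list of text lines works.
    sample = ["the quick brown fox jumps over the lazy dog"] * 10000
    for factor, seconds, per_line in scaling_table(sample):
        print("x%d\t%.4fs\t%.2e s/line" % (factor, seconds, per_line))
```

If the seconds-per-line column stays roughly flat as the factor grows, the work scales linearly and any remaining gap is fixed overhead; if it climbs, something in the processing itself is getting more expensive with input size.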

On Jul 8, 4:19 pm, Michael Klishin wrote:
> 2011/7/9 Christopher 
>
> > % time cake run mapper.clj < input.txt
> > real    0m3.573s
> > user    0m2.031s
> > sys     0m1.528s
>
> These numbers include JVM startup overhead (which is significant compared to
> Python startup overhead).
>
> --
> MK
>
> http://github.com/michaelklishin
> http://twitter.com/michaelklishin



Re: Help on a Clojure performance question

2011-07-08 Thread Michael Klishin
2011/7/9 Christopher 

> % time cake run mapper.clj < input.txt
> real    0m3.573s
> user    0m2.031s
> sys     0m1.528s
>

These numbers include JVM startup overhead (which is significant compared to
Python startup overhead).
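
A rough way to see how much of a measurement is startup is to time a program that does nothing and compare it with the full run. The sketch below uses Python 3 and its own interpreter as the no-op command; `time_command` and `startup_overhead` are hypothetical helper names, and the equivalent no-op command for a JVM or cake setup is not shown because it depends on the local installation.

```python
import subprocess
import sys
import time


def time_command(cmd, stdin_text=""):
    """Return wall-clock seconds taken to run cmd with stdin_text on its stdin."""
    start = time.perf_counter()
    subprocess.run(cmd, input=stdin_text, text=True,
                   stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def startup_overhead(noop_cmd, runs=5):
    """Estimate the fixed startup cost as the fastest of several no-op runs."""
    return min(time_command(noop_cmd) for _ in range(runs))


if __name__ == "__main__":
    # A program that does nothing, so nearly all measured time is startup.
    noop = [sys.executable, "-c", "pass"]
    print("startup overhead: %.3fs" % startup_overhead(noop))
```

Subtracting this figure from the `real` time of the full run gives an estimate of the time spent on actual processing. Taking the minimum of several runs reduces noise from disk caches and other processes.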

-- 
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin


Re: Help on a Clojure performance question

2011-07-08 Thread Ken Wesson
On Fri, Jul 8, 2011 at 7:05 PM, Christopher  wrote:
> ;; mapper.clj
>
> (use ['clojure.java.io :only '(reader)])
> (use ['clojure.string :only '(split)])
>
> (defn mapper [lines]
>   (doseq [line lines]
>     (doseq [word (split line #"\s+")]
>       (println (str word "\t1")))))
>
> (mapper (line-seq (reader *in*)))

Try (set! *warn-on-reflection* true) at your REPL and then evaluating
the above. One of the commonest causes of slow Clojure performance is
reflection, which can generally be avoided with judicious application
of type hints.

-- 
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.



Help on a Clojure performance question

2011-07-08 Thread Christopher
Hi all,

I have recently been watching a set of videos from O'Reilly on
MapReduce. The author of the series is using Python for all of the
examples, but, in an effort to use Clojure more, I've been following
along and writing my code in Clojure. When I implemented the mapper
function that he described in both languages, I noticed that the
Python version was running quite a bit faster and I was wondering if
you all could help me understand why that is the case. I've pasted the
code for each solution below. Also, I am using cake to run the Clojure
code so my thoughts are, since it keeps a JVM up and running at all
times, that should remove the JVM startup time from the equation. The
input file that I am using is the Hound of the Baskervilles from
Project Gutenberg (http://www.gutenberg.org/cache/epub/2852/pg2852.txt).
I've also noticed that with an even longer text as input
(for example, I copied the text of the input.txt 10 times into a file)
the Clojure code slows significantly more. In some cases I had to just
stop the code with a Ctrl-c. Any ideas you all have on what could be
causing this would be great. I'm not trying to start any battles
between Python and Clojure, as I love them both, I'm strictly trying
to learn how to be a better programmer in Clojure.

Thanks ahead of time for any help you all can give.

Christopher

;; mapper.clj

(use ['clojure.java.io :only '(reader)])
(use ['clojure.string :only '(split)])

(defn mapper [lines]
  (doseq [line lines]
    (doseq [word (split line #"\s+")]
      (println (str word "\t1")))))

(mapper (line-seq (reader *in*)))


I am running the code above with the following command and I get the
output below

% time cake run mapper.clj < input.txt
real    0m3.573s
user    0m2.031s
sys     0m1.528s


# mapper.py

#!/usr/bin/env python

import sys

def mapper(lines):
    for line in lines:
        words = line.split()
        for word in words:
            print "{0}\t1".format(word)

def main():
    mapper(sys.stdin)

if __name__ == '__main__':
    main()

% time mapper.py < input.txt
real    0m0.661s
user    0m0.105s
sys     0m0.083s
