I did a bit more torture testing of Clojure 1.2 with those JVM settings, this time for speed.

user=> (time (domany 100000 (repeatedly gensym)))
"Elapsed time: 1362.33344 msecs"
{:foo clojure.lang.Symbol}
user=> (time (pmap #(assoc % :bar 1) (take 10 (repeatedly #(domany 10000 (repeatedly gensym))))))
"Elapsed time: 210.43172 msecs"
({:bar 1, :foo clojure.lang.Symbol} {:bar 1, ... yadda yadda yadda

(This is after a couple of prior runs to JIT everything.)

This is interesting. The machine's only dual core; the roughly 6x speedup therefore tells me that repeated gensymming is not CPU-bound. On the other hand, I would expect no speedup at all if gensymming wasn't CPU-bound because it spent lots of time waiting on a lock on some counter used to generate the next gensym's numerical part while avoiding collisions. Ten threads wouldn't be able to generate gensyms any faster than one thread, in that case, unless there was a long wait for something other than a global lock in gensym.

I tried testing this because I suspected that runtime use of gensym, besides leaking permgen, might create a bottleneck at a global lock if done in a concurrent app; seems that's not the case at least up to a parallelism factor of 10, for whatever reason.

Changing (take 10 (repeatedly #(...))) to (repeat 10 (...)) results in a 2x further speedup as well as the gensym operation being done only 10,000 times instead of 100,000. So 90,000 gensyms done in parallel takes ~100ms, 10,000 would take ~110, and the other ~100ms is consumed by pmap overhead. Compared with

user=> (time (domany 90000 (repeatedly gensym)))
"Elapsed time: 1164.5704 msecs"

that's a nearly 12x speedup from parallelism after adjusting for pmap's overhead!
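
The reason the repeat version does the work only once can be seen in a small sketch. The names work and calls are hypothetical, introduced only to count invocations; they are not part of the test above.

```clojure
;; Hypothetical helper: counts how many times the "expensive" call runs.
(def calls (atom 0))

(defn work []
  (swap! calls inc)
  (gensym))

;; repeatedly calls the function once for every element that is taken.
(reset! calls 0)
(dorun (take 10 (repeatedly work)))
(def calls-with-repeatedly @calls)   ; 10 calls

;; repeat receives the already-evaluated value of (work), so the call
;; happens once and the same value is reused for all ten elements.
(reset! calls 0)
(dorun (repeat 10 (work)))
(def calls-with-repeat @calls)       ; 1 call
```

So (repeat 10 (...)) hands pmap ten references to one precomputed result, while (take 10 (repeatedly #(...))) produces ten separately computed ones.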

The amount of CPU spent on gensym creation per ms can have doubled at most, so again most of the time is spent waiting on something. This is certainly not I/O bound (no printing occurs until after the part that's timed has been timed), so it's got to be locks and synchronization of some sort, unless there's a gratuitous Thread/sleep buried in Clojure somewhere, which seems highly unlikely. But it can't be a global lock in gensym creation, or parallelization wouldn't produce any speedup to speak of. (If there's a global lock, most of the time must be spent elsewhere than waiting on that particular lock, or all 10 threads would spend most of their time queued up on that one lock and wouldn't get things done any faster than one thread would.)

user=> (time (pmap #(assoc % :bar 1) (take 5 (repeatedly #(domany 20000 (repeatedly gensym))))))
"Elapsed time: 555.41208 msecs"

The speedup is only 2x instead of 6x with half as many parallel threads. Subtracting the estimated 100ms pmap overhead gives 455ms, which makes the speedup closer to 3x (vs. 12x with 10 threads). Quintupling the thread count from 1 yields a 3x speedup, but a further doubling yields another 4x on top of that? That seems strange.
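
One thing worth ruling out before reading too much into these numbers: pmap returns a lazy sequence, and time only measures how long the expression takes to return, so some of the work may happen later, while the REPL prints the result. In the calls above the gensym work also sits in the sequence passed to pmap rather than in the function pmap applies. The following is a minimal sketch of a stricter comparison, not the test that was run: gensym-batch is a hypothetical stand-in for the domany call (whose definition is not shown in this message), and the batch sizes are just the ones used above.

```clojure
;; Hypothetical stand-in for domany: create n gensyms, return n.
(defn gensym-batch [n]
  (dotimes [_ n] (gensym))
  n)

;; Serial baseline. doall forces all ten batches inside the timed region.
(time (doall (map gensym-batch (repeat 10 10000))))

;; Parallel version. The gensym work runs inside the function given to
;; pmap, and doall makes time wait until every result has been realized.
(time (doall (pmap gensym-batch (repeat 10 10000))))
```

If the parallel figure measured this way still beats the serial one by much more than the core count, lock contention elsewhere becomes the more plausible explanation; if not, laziness may account for part of the apparent speedup.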