Hm. Interesting. For the record, the exact code I'm running right now that I'm seeing great parallelism with is this:

(defn reverse-recursively [coll]
  (loop [[r & more :as all] (seq coll)
         acc '()]
    (if all
      (recur more (cons r acc))
      acc)))

(defn burn
  ([] (loop [i 0
             value '()]
        (if (>= i 10000)
          (count (last (take 10000 (iterate reverse-recursively value))))
          (recur (inc i)
                 (cons (* (int i)
                          (+ (float i)
                             (- (int i)
                                (/ (float i)
                                   (inc (int i))))))
                       value)))))
  ([_] (burn)))

(defn pmapall
  "Like pmap but: 1) coll should be finite, 2) the returned sequence
   will not be lazy, 3) calls to f may occur in any order, to maximize
   multicore processor utilization, and 4) takes only one coll so far."
  [f coll]
  (let [agents (map agent coll)]
    (dorun (map #(send % f) agents))
    (apply await agents)
    (doall (map deref agents))))

(defn -main [& args]
  (time (doall (pmapall burn (range 100))))
  (System/exit 0))

On Tue, Dec 11, 2012 at 3:00 PM, Andy Fingerhut <andy.finger...@gmail.com> wrote:

> Marshall:
>
> I'm not practiced in recognizing megamorphic call sites, so I could be
> missing some in the example code below, modified from Lee's original code.
> It doesn't use reverse or conj, and as far as I can tell doesn't use
> PersistentList, either, only Cons.
>
> (defn burn-cons [size]
>   (let [size (long size)]
>     (loop [i (long 0)
>            value nil]
>       (if (>= i size)
>         (last value)
>         (recur (inc i) (clojure.lang.Cons.
>                          (* (int i)
>                             (+ (float i)
>                                (- (int i)
>                                   (/ (float i)
>                                      (inc (int i))))))
>                          value))))))
>
> (a) invoke (burn-cons 2000000) sequentially 64 times in a single JVM
>
> (b) invoke (burn-cons 2000000) 64 times using a modified version of pmap
> that limits the number of active threads to 2 (see below), in a single JVM.
> I would hope that this would take about half the elapsed time of (a),
> but the elapsed time is longer than (a)
>
> (c) start up two JVMs simultaneously and invoke (burn-cons 2000000)
> sequentially 32 times in each.  The elapsed time here is less than (a), as
> I would expect.
>
> (Clojure 1.4, Oracle/Apple JDK 1.6.0_37, Mac OS X 10.6.8, running on a
> machine with Intel core i7 with 4 physical cores but OS X reports it as 8 I
> think because of 2 hyperthreads per core -- more details available on
> request).
>
> Can you try to reproduce to see if you get similar results?  If so, do you
> know why we get bad parallelism in a single JVM for this code?  If there
> are no megamorphic call sites, then it is examples like this that lead me
> to wonder about locking in memory allocation and/or GC.
>
>
> With the functions below, my part (b) was measured by doing:
>
> (time (doall (nthreads-pmap 2 (fn [_] (burn-cons 2000000)) (unchunk (range 64)))))
>
> Andy
>
>
>
> (defn unchunk [s]
>   (when (seq s)
>     (lazy-seq
>       (cons (first s)
>             (unchunk (next s))))))
>
> (defn nthreads-pmap
>   "Like pmap, except can take an argument nthreads to control the
>   maximum number of parallel threads used."
>   ([f coll]
>    (let [n (+ 2 (.. Runtime getRuntime availableProcessors))]
>      (nthreads-pmap n f coll)))
>   ([nthreads f coll]
>    (if (= nthreads 1)
>      (map f coll)
>      (let [n (dec nthreads)
>            rets (map #(future (f %)) coll)
>            step (fn step [[x & xs :as vs] fs]
>                   (lazy-seq
>                     (if-let [s (seq fs)]
>                       (cons (deref x) (step xs (rest s)))
>                       (map deref vs))))]
>        (step rets (drop n rets)))))
>   ([nthreads f coll & colls]
>    (let [step (fn step [cs]
>                 (lazy-seq
>                   (let [ss (map seq cs)]
>                     (when (every? identity ss)
>                       (cons (map first ss) (step (map rest ss)))))))]
>      (nthreads-pmap nthreads #(apply f %) (step (cons coll colls))))))
>
>
> On Dec 11, 2012, at 10:06 AM, Marshall Bockrath-Vandegrift wrote:
>
> > Lee Spector <lspec...@hampshire.edu> writes:
> >
> >> If the application does lots of "list processing" but does so with a
> >> mix of Clojure list and sequence manipulation functions, then one
> >> would have to write private, list/cons-only versions of all of these
> >> things?
> >> That is -- overstating it a bit, to be sure, but perhaps not
> >> entirely unfairly -- re-implement Clojure's Lisp?
> >
> > I just did a quick look over clojure/core.clj, and `reverse` is the only
> > function which stood out to me as hitting the most pathological case.
> > Every other `conj` loop over a user-provided datastructure is `conj`ing
> > into an explicit non-list/`Cons` type.
> >
> > So I think if you replace your calls to `reverse` and any `conj` loops
> > you have in your own code, you should see a perfectly reasonable
> > speedup.
> >
> > -Marshall
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to clojure@googlegroups.com
> > Note that posts from new members are moderated - please be patient with
> > your first post.
> > To unsubscribe from this group, send email to
> > clojure+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/clojure?hl=en
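
For anyone wanting to try Marshall's suggestion of replacing calls to `reverse`, here is a minimal sketch of one way it might be done. The name `reverse-via-vector` is hypothetical and does not come from the thread. It assumes that copying into a vector and walking it backwards is an acceptable stand-in for `reverse`. It only avoids the `conj`-onto-a-list loop; it does not by itself show that the call sites stop being megamorphic or that parallelism improves, so it would need to be measured the same way as the code above.

```clojure
;; Hypothetical helper, not from the thread: reverse a collection without
;; conj-ing onto a list, by copying into a vector and reading it backwards.
(defn reverse-via-vector
  [coll]
  (let [v (vec coll)]      ; vec copies any seqable (including nil) into a vector
    (or (rseq v) ())))     ; rseq returns nil for an empty vector, so fall back to ()

(reverse-via-vector [1 2 3])
;; => (3 2 1)
```

The result is a reverse view over a vector rather than a list, so code that later relies on list-specific behaviour (for example `conj` adding at the front of a persistent list) may need adjusting.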