Hm. Interesting. For the record, here is the exact code I'm running right
now, with which I'm seeing great parallelism:

(defn reverse-recursively [coll]
  (loop [[r & more :as all] (seq coll)
         acc '()]
    (if all
      (recur more (cons r acc))
      acc)))

(defn burn
  ([] (loop [i 0
             value '()]
        (if (>= i 10000)
          (count (last (take 10000 (iterate reverse-recursively value))))
          (recur (inc i)
                 (cons
                   (* (int i)
                      (+ (float i)
                         (- (int i)
                            (/ (float i)
                               (inc (int i))))))
                   value)))))
  ([_] (burn)))

(defn pmapall
  "Like pmap but: 1) coll should be finite, 2) the returned sequence
   will not be lazy, 3) calls to f may occur in any order, to maximize
   multicore processor utilization, and 4) takes only one coll so far."
  [f coll]
  (let [agents (map agent coll)]
    (dorun (map #(send % f) agents))
    (apply await agents)
    (doall (map deref agents))))

(defn -main
  [& args]
  (time (doall (pmapall burn (range 100))))
  (System/exit 0))
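
A quick sanity check along the following lines can confirm that the hand-written reverse agrees with clojure.core/reverse and that pmapall hands results back in input order. This is only a sketch, not part of the timed run: the definitions are repeated so the snippet runs on its own, and the names `sample` and `squares` and the input sizes are arbitrary choices made for illustration.

```clojure
;; Sketch only: definitions repeated so this runs standalone.
(defn reverse-recursively [coll]
  (loop [[r & more :as all] (seq coll)
         acc '()]
    (if all
      (recur more (cons r acc))
      acc)))

(defn pmapall [f coll]
  (let [agents (map agent coll)]
    (dorun (map #(send % f) agents))
    (apply await agents)
    (doall (map deref agents))))

;; Arbitrary test input (an assumption for illustration).
(def sample (range 10))

;; The cons-based reverse should match the built-in one element for element.
(assert (= (reverse sample) (reverse-recursively sample)))
(assert (= '() (reverse-recursively nil)))

;; Calls to f may run in any order, but each agent keeps its position,
;; so the results should come back in input order.
(def squares (pmapall #(* % %) (range 8)))
(assert (= squares (map #(* % %) (range 8))))

;; Agent threads are non-daemon; shut the pool down so a script can exit.
(shutdown-agents)
```

Because `send` uses the fixed-size agent thread pool, a standalone script needs either `(shutdown-agents)` or an explicit `(System/exit 0)` as in `-main` above, otherwise the JVM lingers after the work is done.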

On Tue, Dec 11, 2012 at 3:00 PM, Andy Fingerhut wrote:

> Marshall:
>
> I'm not practiced in recognizing megamorphic call sites, so I could be
> missing some in the example code below, modified from Lee's original code.
> It doesn't use reverse or conj, and as far as I can tell doesn't use
> PersistentList, either, only Cons.
>
> (defn burn-cons [size]
>   (let [size (long size)]
>     (loop [i (long 0)
>            value nil]
>       (if (>= i size)
>         (last value)
>         (recur (inc i) (clojure.lang.Cons.
>                         (* (int i)
>                            (+ (float i)
>                               (- (int i)
>                                  (/ (float i)
>                                     (inc (int i))))))
>                         value))))))
> (a) invoke (burn-cons 2000000) sequentially 64 times in a single JVM
> (b) invoke (burn-cons 2000000) 64 times using a modified version of pmap
> that limits the number of active threads to 2 (see below), in a single JVM.
> I would hope that this would take about half the elapsed time of (a),
> but the elapsed time is actually longer than in (a)
> (c) start up two JVMs simultaneously and invoke (burn-cons 2000000)
> sequentially 32 times in each. The elapsed time here is less than in (a), as
> I would expect.
> (Clojure 1.4, Oracle/Apple JDK 1.6.0_37, Mac OS X 10.6.8, running on a
> machine with an Intel Core i7 with 4 physical cores, which OS X reports as
> 8, I think because of 2 hyperthreads per core -- more details available on
> request).
> Can you try to reproduce to see if you get similar results?  If so, do you
> know why we get bad parallelism in a single JVM for this code?  If there
> are no megamorphic call sites, then it is examples like this that lead me
> to wonder about locking in memory allocation and/or GC.
> With the functions below, my part (b) was measured by doing:
> (time (doall (nthreads-pmap 2 (fn [_] (burn-cons 2000000))
>                             (unchunk (range 64)))))
> Andy
> (defn unchunk [s]
>   (when (seq s)
>     (lazy-seq
>      (cons (first s)
>            (unchunk (next s))))))
> (defn nthreads-pmap
>   "Like pmap, except can take an argument nthreads to control the
>   maximum number of parallel threads used."
>   ([f coll]
>      (let [n (+ 2 (.. Runtime getRuntime availableProcessors))]
>        (nthreads-pmap n f coll)))
>   ([nthreads f coll]
>      (if (= nthreads 1)
>        (map f coll)
>        (let [n (dec nthreads)
>              rets (map #(future (f %)) coll)
>              step (fn step [[x & xs :as vs] fs]
>                     (lazy-seq
>                      (if-let [s (seq fs)]
>                        (cons (deref x) (step xs (rest s)))
>                        (map deref vs))))]
>          (step rets (drop n rets)))))
>   ([nthreads f coll & colls]
>    (let [step (fn step [cs]
>                 (lazy-seq
>                  (let [ss (map seq cs)]
>                    (when (every? identity ss)
>                      (cons (map first ss) (step (map rest ss)))))))]
>      (nthreads-pmap nthreads #(apply f %) (step (cons coll colls))))))
> On Dec 11, 2012, at 10:06 AM, Marshall Bockrath-Vandegrift wrote:
> > Lee Spector writes:
> >
> >> If the application does lots of "list processing" but does so with a
> >> mix of Clojure list and sequence manipulation functions, then one
> >> would have to write private, list/cons-only versions of all of these
> >> things? That is -- overstating it a bit, to be sure, but perhaps not
> >> entirely unfairly -- re-implement Clojure's Lisp?
> >
> > I just did a quick look over clojure/core.clj, and `reverse` is the only
> > function which stood out to me as hitting the most pathological case.
> > Every other `conj` loop over a user-provided datastructure is `conj`ing
> > into an explicit non-list/`Cons` type.
> >
> > So I think if you replace your calls to `reverse` and any `conj` loops
> > you have in your own code, you should see a perfectly reasonable
> > speedup.
> >
> > -Marshall
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to
> > Note that posts from new members are moderated - please be patient with
> your first post.
> > To unsubscribe from this group, send email to
> >
> > For more options, visit this group at
> >
