Hm, interesting. For the record, here is the exact code I'm running right
now, with which I'm seeing great parallelism:

(defn reverse-recursively [coll]
  (loop [[r & more :as all] (seq coll)
         acc '()]
    (if all
      (recur more (cons r acc))
      acc)))

(defn burn
  ([] (loop [i 0
             value '()]
        (if (>= i 10000)
          (count (last (take 10000 (iterate reverse-recursively value))))
          (recur (inc i)
                 (cons
                  (* (int i)
                     (+ (float i)
                        (- (int i)
                           (/ (float i)
                              (inc (int i))))))
                  value)))))
  ([_] (burn)))

(defn pmapall
  "Like pmap but: 1) coll should be finite, 2) the returned sequence
  will not be lazy, 3) calls to f may occur in any order, to maximize
  multicore processor utilization, and 4) takes only one coll so far."
  [f coll]
  (let [agents (map agent coll)]
    (dorun (map #(send % f) agents))
    (apply await agents)
    (doall (map deref agents))))

(defn -main
  [& args]
  (time (doall (pmapall burn (range 100))))
  (System/exit 0))

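
A quick sanity check for pmapall, as a minimal sketch: it compares the result against plain map for a cheap pure function. The name slow-square and the input size are assumptions made up for illustration; they are not part of the benchmark above.

```clojure
;; pmapall as posted above, repeated so the snippet stands alone.
(defn pmapall [f coll]
  (let [agents (map agent coll)]
    (dorun (map #(send % f) agents))
    (apply await agents)
    (doall (map deref agents))))

;; Hypothetical stand-in for an expensive pure function.
(defn slow-square [n]
  (Thread/sleep 10)
  (* n n))

(def inputs (range 20))

;; Reference result computed on a single thread.
(def sequential-result (doall (map slow-square inputs)))

;; Same computation, one agent per input element.
(def parallel-result (pmapall slow-square inputs))

(println "same results, same order:"
         (= sequential-result parallel-result))

;; The agent thread pool uses non-daemon threads, so a standalone script
;; needs this (or System/exit, as in -main above) in order to terminate.
(shutdown-agents)
```

The calls may run in any order, but the results come back in input order because each agent is dereferenced in the order it was created.
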
On Tue, Dec 11, 2012 at 3:00 PM, Andy Fingerhut <[email protected]> wrote:
> Marshall:
>
> I'm not practiced in recognizing megamorphic call sites, so I could be
> missing some in the example code below, modified from Lee's original code.
> It doesn't use reverse or conj, and as far as I can tell doesn't use
> PersistentList, either, only Cons.
>
> (defn burn-cons [size]
>   (let [size (long size)]
>     (loop [i (long 0)
>            value nil]
>       (if (>= i size)
>         (last value)
>         (recur (inc i) (clojure.lang.Cons.
>                         (* (int i)
>                            (+ (float i)
>                               (- (int i)
>                                  (/ (float i)
>                                     (inc (int i))))))
>                         value))))))
>
> (a) invoke (burn-cons 2000000) sequentially 64 times in a single JVM
>
> (b) invoke (burn-cons 2000000) 64 times using a modified version of pmap
> that limits the number of active threads to 2 (see below), in a single JVM.
> I would hope that this would take about half the elapsed time of (a),
> but the elapsed time is actually longer than for (a).
>
> (c) start up two JVMs simultaneously and invoke (burn-cons 2000000)
> sequentially 32 times in each. The elapsed time here is less than (a), as
> I would expect.
>
> (Clojure 1.4, Oracle/Apple JDK 1.6.0_37, Mac OS X 10.6.8, running on a
> machine with an Intel Core i7 with 4 physical cores, which OS X reports as
> 8, I think because of 2 hyperthreads per core -- more details available on
> request).
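
For anyone who wants to try (a) versus (b) quickly, here is a rough sketch of a timing harness. It is not Andy's setup: the helper elapsed-ms is hypothetical, the call count and list size are scaled down (assumed values, chosen only so it finishes fast), and it uses two plain futures instead of nthreads-pmap. A single run like this ignores JIT warm-up, so treat the numbers as indicative at best.

```clojure
;; burn-cons as quoted above, repeated so the snippet stands alone.
(defn burn-cons [size]
  (let [size (long size)]
    (loop [i (long 0)
           value nil]
      (if (>= i size)
        (last value)
        (recur (inc i)
               (clojure.lang.Cons.
                (* (int i)
                   (+ (float i)
                      (- (int i)
                         (/ (float i)
                            (inc (int i))))))
                value))))))

;; Hypothetical helper: run thunk once, return elapsed wall-clock milliseconds.
(defn elapsed-ms [thunk]
  (let [start (System/nanoTime)]
    (thunk)
    (/ (- (System/nanoTime) start) 1e6)))

;; Assumed, scaled-down parameters (the original used 64 calls of size 2000000).
(def n-calls 8)
(def size 200000)

;; (a) all calls on the current thread.
(def sequential-ms
  (elapsed-ms #(dotimes [_ n-calls] (burn-cons size))))

;; (b) the same number of calls, split evenly across two futures.
(def two-thread-ms
  (elapsed-ms
   #(let [per-thread (quot n-calls 2)
          workers (doall
                   (repeatedly 2
                               (fn []
                                 (future
                                   (dotimes [_ per-thread]
                                     (burn-cons size))))))]
      (doseq [w workers] (deref w)))))

(println "sequential ms: " sequential-ms)
(println "two threads ms:" two-thread-ms)

;; Futures run on a non-daemon thread pool; shut it down so a script can exit.
(shutdown-agents)
```

With perfect scaling the second number would be about half the first; the report above is that it comes out larger.
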
>
> Can you try to reproduce to see if you get similar results? If so, do you
> know why we get bad parallelism in a single JVM for this code? If there
> are no megamorphic call sites, then it is examples like this that lead me
> to wonder about locking in memory allocation and/or GC.
>
>
> With the functions below, my part (b) was measured by doing:
>
> (time (doall (nthreads-pmap 2 (fn [_] (burn-cons 2000000)) (unchunk (range 64)))))
>
> Andy
>
>
>
> (defn unchunk [s]
>   (when (seq s)
>     (lazy-seq
>       (cons (first s)
>             (unchunk (next s))))))
>
> (defn nthreads-pmap
>   "Like pmap, except can take an argument nthreads to control the
>   maximum number of parallel threads used."
>   ([f coll]
>    (let [n (+ 2 (.. Runtime getRuntime availableProcessors))]
>      (nthreads-pmap n f coll)))
>   ([nthreads f coll]
>    (if (= nthreads 1)
>      (map f coll)
>      (let [n (dec nthreads)
>            rets (map #(future (f %)) coll)
>            step (fn step [[x & xs :as vs] fs]
>                   (lazy-seq
>                     (if-let [s (seq fs)]
>                       (cons (deref x) (step xs (rest s)))
>                       (map deref vs))))]
>        (step rets (drop n rets)))))
>   ([nthreads f coll & colls]
>    (let [step (fn step [cs]
>                 (lazy-seq
>                   (let [ss (map seq cs)]
>                     (when (every? identity ss)
>                       (cons (map first ss) (step (map rest ss)))))))]
>      (nthreads-pmap nthreads #(apply f %) (step (cons coll colls))))))
>
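
The unchunk wrapper presumably matters because range yields a chunked seq, and map over a chunked seq applies its function to a whole chunk at a time, which would start many futures at once and defeat the thread limit. A small sketch of the difference, where calls-for-first is a hypothetical helper written only for this illustration:

```clojure
;; unchunk as quoted above, repeated so the snippet stands alone.
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Hypothetical helper: count how many times the mapped function runs
;; when only the first element of (map f coll) is consumed.
(defn calls-for-first [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

;; Chunked input: map processes a whole chunk (typically 32 elements).
(def chunked-calls (calls-for-first (range 64)))

;; Unchunked input: map processes one element at a time.
(def unchunked-calls (calls-for-first (unchunk (range 64))))

(println "calls with chunked input:  " chunked-calls)
(println "calls with unchunked input:" unchunked-calls)
```

In nthreads-pmap the mapped function is #(future (f %)), so each extra call in the chunked case is an extra future started ahead of schedule.
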
>
> On Dec 11, 2012, at 10:06 AM, Marshall Bockrath-Vandegrift wrote:
>
> > Lee Spector <[email protected]> writes:
> >
> >> If the application does lots of "list processing" but does so with a
> >> mix of Clojure list and sequence manipulation functions, then one
> >> would have to write private, list/cons-only versions of all of these
> >> things? That is -- overstating it a bit, to be sure, but perhaps not
> >> entirely unfairly -- re-implement Clojure's Lisp?
> >
> > I just did a quick look over clojure/core.clj, and `reverse` is the only
> > function which stood out to me as hitting the most pathological case.
> > Every other `conj` loop over a user-provided datastructure is `conj`ing
> > into an explicit non-list/`Cons` type.
> >
> > So I think if you replace your calls to `reverse` and any `conj` loops
> > you have in your own code, you should see a perfectly reasonable
> > speedup.
> >
> > -Marshall
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to [email protected]
> > Note that posts from new members are moderated - please be patient with
> > your first post.
> > To unsubscribe from this group, send email to
> > [email protected]
> > For more options, visit this group at
> > http://groups.google.com/group/clojure?hl=en