Adding to this thread from almost a year ago.  I don't have conclusive
experimental proof to show right now, but some experiments have led me to
what I think is a plausible cause of not just Clojure programs, but any
programs subject to the JVM's memory model, running more slowly when
multi-threaded than when single-threaded.  Qualification: my explanation
would apply only to multi-threaded programs running on the JVM that store
significant amounts of data in memory.  It would apply even if the data
written by each thread is only read by that thread, and there is no locking
or other inter-thread communication, i.e. for "embarrassingly parallel"
problems.

Outline of the argument:

When you start a thread, and then wait for the thread to complete, the JVM
memory model requires all loads and stores to satisfy certain
restrictions.  One of these is that any store done before the thread is
started should 'happen before' the thread start, and thus the updated
stored values must be visible to the new thread.  As I understand it,
'visible' here means that the thread doing the store must cause the CPU it
is running on to update main memory from whatever locally modified values
it has written into its local cache.  That rule isn't so relevant to my
argument.

The one that is relevant is that any store performed by the thread is
considered to 'happen before' a join operation on the thread.  Thus any
store done by a thread must be written back to main memory, *even if the
store is to a JVM object that later becomes garbage*.
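
To make the two rules concrete, here is a minimal Java sketch of the
language-level guarantee only.  The class and method names
(JoinVisibilityDemo, Box, computeInWorker) are made up for illustration.
It shows that a plain, non-volatile field written before start() is seen by
the worker, and that the worker's store is seen after join().  It says
nothing about how the hardware achieves this, which is the part my
hypothesis is about.

```java
final class JoinVisibilityDemo {
    // Plain holder with a non-volatile field: the only synchronization
    // in this sketch comes from Thread.start() and Thread.join().
    static final class Box {
        long value;
    }

    static long computeInWorker(long input) {
        final Box box = new Box();
        box.value = input;  // store before start(): must be visible to the worker

        Thread worker = new Thread(() -> {
            // The worker reads the value stored above and stores a new one.
            box.value = box.value * 2 + 1;
        });
        worker.start();
        try {
            worker.join();  // after join(), the worker's stores must be visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return box.value;
    }

    public static void main(String[] args) {
        System.out.println(computeInWorker(20)); // prints 41
    }
}
```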

So imagine a single-threaded program that creates X bytes of garbage while
it runs.  Those X bytes will definitely be written to the CPU's local
cache, but they will only be written to main memory if the cache space runs
out before the garbage collector does its work and allows that memory to be
reused for allocations.  The CPU-to-local-cache bandwidth in many modern
systems is significantly higher than local-cache-to-main-memory bandwidth.

Now take that same program and spread its work across N threads (N >= 2),
with a join at the end of each one.  For the sake of example, say that each
thread will write X/N bytes of data while it runs.  Even if the only data
needed later in the rest of the program is a single Long object, for
example, all of those X/N bytes of data will be copied from the local cache
to main memory (if that did not already happen before the thread
terminated).

If the number of threads is large enough, the amount of data written from
all local caches to main memory can be higher in the multi-threaded case
than in the single-threaded case.
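
A rough pure-Java sketch of the kind of experiment this suggests is below.
Everything in it (GarbageJoinSketch, churn, runWithThreads, the iteration
counts and array size) is an assumption of mine for illustration, not a
benchmark anyone in this thread has run.  Each worker allocates short-lived
arrays and hands back a single long, and the main thread joins all workers.
Caveats: the JIT may remove the allocations entirely via escape analysis,
and a serious measurement would need warm-up and a proper harness, so treat
any timings from this as a starting point only.

```java
import java.util.ArrayList;
import java.util.List;

final class GarbageJoinSketch {
    // Allocates a fresh array per iteration (arraySize must be > 0),
    // writes one element, and keeps only a running sum; every array is
    // garbage as soon as the iteration ends.
    static long churn(int iterations, int arraySize) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            long[] garbage = new long[arraySize];
            garbage[i % arraySize] = i;
            sum += garbage[i % arraySize];
        }
        return sum;
    }

    // Splits totalIterations evenly across `threads` workers, joins each
    // one, and adds up the single long that each worker produced.
    static long runWithThreads(int threads, int totalIterations, int arraySize) {
        final long[] partial = new long[threads];
        final int perThread = totalIterations / threads;
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> partial[id] = churn(perThread, arraySize));
            workers.add(w);
            w.start();
        }
        long total = 0;
        for (int t = 0; t < threads; t++) {
            try {
                workers.get(t).join();  // the worker's stores happen-before this returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) {
        // Compare wall-clock time for 1 thread against several threads.
        for (int threads : new int[] {1, 2, 4, 8}) {
            long start = System.nanoTime();
            long result = runWithThreads(threads, 8_000_000, 1024);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + elapsedMs
                    + " ms (result " + result + ")");
        }
    }
}
```

Comparing the elapsed times against hardware counters for memory traffic
(if your platform exposes them) would be one way to check whether the
write-back volume actually differs between the runs.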

Anyway, that is my hypothesis about what could be happening here.  It isn't
Clojure-specific, but it can be exacerbated by the tendency of a lot of
Clojure code to allocate significant amounts of memory that quickly becomes
garbage.

Andy



On Wed, Jan 30, 2013 at 6:20 PM, Lee Spector <lspec...@hampshire.edu> wrote:

>
> FYI we had a bit of a discussion about this at a meetup in Amherst MA
> yesterday.  While I'm not sufficiently on top of the JVM or system
> issues to have briefed everyone on all of the details, there has been a
> little follow-up since the discussion, including results of some
> different experiments by Chas Emerick, at:
> http://www.meetup.com/Functional-Programming-Connoisseurs/messages/boards/thread/30946382
>
>  -Lee
>
> On Jan 30, 2013, at 8:39 PM, Marshall Bockrath-Vandegrift wrote:
> >
> > Apologies for my very-slow reply here.  I keep thinking that I’ll have
> > more time to look into this issue, and keep having other things
> > requiring my attention.  And on top of that, I’ve temporarily lost the
> > many-way AMD system I was using as a test-bed.
> >
> > I very much want to see if I can get my hands on an Intel system to
> > compare to.  My AMD system is in theory 32-way – two physical CPUs, each
> > with 16 cores.  However, Linux reports (via /proc/cpuinfo) the cores in
> > groups of 8 (“cpu cores : 8” etc).  And something very strange happens
> > when extending parallelism beyond 8-way...  I ran several experiments
> > using a version of your whole-application benchmark I modified to
> > control the level of parallelism.  At parallelism 9+, the real time it
> > takes to complete the benchmark hardly budges, but the user/CPU time
> > increases linearly with the level of parallelism!  As far as I can tell,
> > multi-processor AMD *is* a NUMA architecture, which might potentially
> > explain things.  But enabling the JVM NUMA options doesn’t seem to
> > affect the benchmark.
> >
> > I think next steps are two-fold: (1) examine parallelism vs real & CPU
> > time on an Intel system, and (2) attempt to reproduce the observed
> > behavior in pure Java.  I’m keeping my fingers crossed that I’ll have
> > some time to look at this more soon, but I’m honestly not very hopeful.
> >
> > In the meantime, I hope you’ve managed to exploit multi-process
> > parallelism to run more efficiently?
> >
> > -Marshall
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
