Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-20 Thread David Iba
>> taking the time to run these experiments. Similar performance issues with many threads in the same JVM on a many-core machine have come up before in the past, and so far I don't know if anyone has gotten to the bottom of it yet.
>>
>> Andy

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-19 Thread David Iba
t a chance.

On Thu, Nov 19, 2015 at 6:19 PM, Fluid Dynamics <a2093...@trbvm.com> wrote:
> On Thursday, November 19, 2015 at 1:36:59 AM UTC-5, David Iba wrote:
>>
>> OK, have a few updates to report:
>>
>>    - Oracle vs OpenJDK did not make a difference

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-18 Thread David Iba
>>> lly carves it up into the small pieces it needs for each individual Java 'new' allocation, or gets a global lock for every 'new'. The latter would give terrible performance as # cores increase, but I don't know how to tell whether that is the case, except

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-18 Thread David Iba
> " and "swap!" to "vswap!". See if that changes anything.
>
> Timothy
>
> On Wed, Nov 18, 2015 at 9:00 AM, David Iba <davi...@gmail.com> wrote:
>
>> Timothy: Each thread (call of f2) creates its own "local" atom, so I

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-18 Thread David Iba
ere a couple of those rogue threads that took 2-3X the time of the others. Any ideas?

On Thursday, November 19, 2015 at 1:08:14 AM UTC+9, David Iba wrote:
>
> No worries. Thanks, I'll give that a try as well!
>
> On Thursday, November 19, 2015 at 1:04:04 AM UTC+9, tbc++ wrote:

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-17 Thread David Iba
17, 2015 at 6:38:39 AM UTC+1, David Iba wrote:
>>
>> I have functions f1 and f2 below, and let's say they run in T1 and T2 amount of time when running a single instance/thread. The issue I'm facing is that parallelizing f2 across 18 cores takes anywhere from 2-5

Re: Poor parallelization performance across 18 cores (but not 4)

2015-11-17 Thread David Iba
Correction: that "do" should be a "doall". (My actual test code was a bit different, but each run printed some info when it started, so it doesn't have to do with delayed evaluation of lazy seqs or anything.)

On Tuesday, November 17, 2015 at 6:49:16 PM UTC+9, David

Poor parallelization performance across 18 cores (but not 4)

2015-11-16 Thread David Iba
I have functions f1 and f2 below, and let's say they run in T1 and T2 amount of time when running a single instance/thread. The issue I'm facing is that parallelizing f2 across 18 cores takes anywhere from 2-5X T2, and for more complex funcs takes absurdly long.

(defn f1 []