Bob, I'm not saying that your implementation is bad. On the contrary, your implementation is much faster than the original Harmony one. What I'm saying is that ThreadLocalBench is still 3x slower on Harmony/DRLVM than on Sun 1.6.0_05. That might be connected with poor code generation in the Harmony JIT, some optimizations disabled there, etc. I don't know what the real cause is. That's the question to answer.
Leaving Harmony/DRLVM aside, it's wonderful that your implementation is running neck and neck with Sun's on another VM. So why not experiment and beat them? :) From this point of view, I raised the question about the rationale for an open-addressed, linear-probed hash map instead of, say, chaining. Setting cache effects aside, Knuth analyzed the average lookup count for different probing schemes, and linear probing performed the worst. Actually, we recently moved IdentityHashMap from exactly the same addressing scheme to chaining [1], and the results are pretty good. There's an additional +80% on ThreadLocalBench, and I expect we can get it on your implementation too!

Thanks,
Aleksey.

[1] https://issues.apache.org/jira/browse/HARMONY-5771

On Fri, Jun 6, 2008 at 8:36 PM, Bob Lee <[EMAIL PROTECTED]> wrote:
> I'm not sure how you ran it, but I ran MTHarness and Doug Lea's own test
> suites on Sun's VM using -server.
>
> Mine was a tad slower on MTHarness (results in bug). In Doug Lea's tests,
> the RI and my impl were neck and neck. Josh Bloch ran the same performance
> tests on a different machine and saw the same results:
>
> Tie: 3 tests
> RI wins: 2 tests
> crazybob's wins: 3 tests
>
> Mine also has a smaller memory footprint and collects reclaimed ThreadLocals
> more aggressively (less unwanted memory retention).
>
> I'm not sure where you got that the performance is "bad". Are you sure you
> ran the right code?
>
> Bob
>
> On Fri, Jun 6, 2008 at 3:23 AM, Aleksey Shipilev <[EMAIL PROTECTED]>
> wrote:
>
>> On Fri, Jun 6, 2008 at 2:16 PM, Xiao-Feng Li <[EMAIL PROTECTED]>
>> wrote:
>> >> NB: This patch gives 7.6x boost on MTHarness/ThreadLocalBench and +25%
>> >> to SPECjvm2008:serial.
>> > Good numbers! I read the perf is still bad compared to RI? Have you
>> > any estimation about the reason?
>> The performance of ThreadLocal is bad compared to RI, while
>> SPECjvm2008:serial is not (assuming all other patches are applied).
>> I hadn't investigated the reason of ThreadLocal though, but I think the
>> problematic area is open-addressed tuple-stored map implementation by
>> Bob. Bob, had you tried other layout schemes (like splitting
>> key/values arrays, various probing schemes, etc.)? What was the
>> rationale behind this implementation?
>>
>> Thanks,
>> Aleksey.
>>
>
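
For reference, the lookup-count comparison between linear probing and chaining discussed in this thread can be put into numbers with a small sketch. It uses the standard textbook approximations for expected probes per lookup at load factor a, assuming uniform hashing and ignoring cache effects. The class and method names (ProbeCounts, linearHit, linearMiss, chainHit) are hypothetical, and this models neither Bob's implementation nor Harmony's; it only evaluates the formulas.

```java
/**
 * Expected probes per lookup versus load factor, using textbook
 * approximations (uniform hashing assumed, cache effects ignored).
 * All names here are illustrative, not taken from any real code base.
 */
public class ProbeCounts {
    // Linear probing, successful search: (1 + 1/(1 - a)) / 2
    static double linearHit(double a) {
        return 0.5 * (1.0 + 1.0 / (1.0 - a));
    }

    // Linear probing, unsuccessful search: (1 + 1/(1 - a)^2) / 2
    static double linearMiss(double a) {
        return 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a)));
    }

    // Separate chaining, successful search: about 1 + a/2
    static double chainHit(double a) {
        return 1.0 + a / 2.0;
    }

    public static void main(String[] args) {
        // Print the three estimates for a few representative load factors.
        for (double a : new double[] {0.5, 0.75, 0.9}) {
            System.out.printf("load=%.2f linearHit=%.2f linearMiss=%.2f chainHit=%.2f%n",
                    a, linearHit(a), linearMiss(a), chainHit(a));
        }
    }
}
```

At a load factor of 0.5 these formulas give 1.5 probes (hit) and 2.5 probes (miss) for linear probing versus 1.25 for a chaining hit, and the gap widens quickly as the table fills. Real results also depend on memory layout and caching, which this estimate deliberately leaves out.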
