Re: Faster HashMap implementation

2009-07-24 Thread Alex Yakovlev
Doug, On Wed, Jul 22, 2009 at 8:21 PM, Doug Lea wrote: > On further evaluating it (among other possible HashMap replacements), > I'm not convinced that they are enough better to commit. > Size:... 9216 36864 147456 589824 > HashMap ...45 97208273 > V2 ...40 7

Re: Faster HashMap implementation

2009-07-24 Thread Ismael Juma
Hi Doug, Doug Lea writes: > On further evaluating it (among other possible HashMap replacements), > I'm not convinced that they are enough better to commit. That's a shame. What about that idea of having a factory method that would take a Class object and return an appropriate Map implementation

Re: Faster HashMap implementation

2009-07-22 Thread Doug Lea
Sorry for the delay getting back to this. I put an update of HashMapV2 at http://gee.cs.oswego.edu/dl/wwwtmp/HashMapV2.java It includes the better support for tiny maps and probe sequences that were missing. On further evaluating it (among other possible HashMap replacements), I'm not convince

Re: Faster HashMap implementation

2009-07-04 Thread Rémi Forax
Doug Lea a écrit : somehow 10 days for you is the same as 2 days for the rest of us. :) (Well, I was unexpectedly stranded at JFK for 20 hrs, but now really am in sporadic e-mail mode for 9 days.) More seriously, it would be great to have a HashMap that takes less memory and performs

Re: Faster HashMap implementation

2009-07-04 Thread Doug Lea
> somehow 10 > days > for you is the same as 2 days for the rest of us. :) (Well, I was unexpectedly stranded at JFK for 20 hrs, but now really am in sporadic e-mail mode for 9 days.) > > More seriously, it would be great to have a HashMap that takes less memory > and > performs as well or bette

Re: Faster HashMap implementation

2009-07-02 Thread Ismael Juma
Hi Doug, In a previous message you said: "There are a few remaining things to consider before taking this too seriously as a replacement. I'll be out for 10 days starting tomorrow so probably won't get a chance to do so soon." I now understand the secret behind your productivity levels... someho

Re: Faster HashMap implementation

2009-07-02 Thread Doug Lea
> Doug, > > Checking key reference equality before hashcode looks like a big win. This effect is somewhat stronger in tests that keep looking for == keys, but in general, successful gets are usually for identical keys. Except for autoboxed Integers etc. (Aside: one way to deal with fact that we c

Re: Faster HashMap implementation

2009-07-01 Thread Alex Yakovlev
Doug, Checking key reference equality before hashcode looks like a big win. Have you considered using balancing tree instead of bit trie? This could decrease look depth from number of hash bits to log2(tree size). Also, there could be a sence in using adjacent table cell for key/value if it's no

Re: Faster HashMap implementation

2009-07-01 Thread Doug Lea
While I'm posting in-progress stuff, here are some other notes on hash maps: First, some observations about usage (repeating some already posted): Via some dated but probably still valid dynamic measurments: Ops: Approx 84% read, 15% add/modify, 1% remove Sizes: Peak at 0, 1, 2, then mostl

Re: Faster HashMap implementation

2009-06-30 Thread Doug Lea
Inspired by the combination of Alex's efforts and ongoing work on concurrent caches, I put together a version of HashMap (called HashMapV2 for now) with a snapshot at http://gee.cs.oswego.edu/dl/wwwtmp/HashMapV2.java (This retains the openjdk GPL header etc.) There are a few remaining things t

Re: Faster HashMap implementation

2009-06-26 Thread Alex Yakovlev
Also it should be mentioned that memory-wise my implementation uses more memory in resize: it need to copy 2 arrays of ~2*C size (C = capacity) and current HashMap needs only one array of C size, data in Entry structures are not copied. This can be addressed if it is important by (1) splitting inte

Re: Faster HashMap implementation

2009-06-23 Thread Doug Lea
Sorry for the delay; I've been side-tracked exploring concurrent trie/hash algorithms But I did want to point out a problem shared with all purely table-based approaches, including yours. Assuming power of 2 capacities, you can only hold up to 1 << 29 entries, because max power of 2 array length

RE: Faster HashMap implementation

2009-06-13 Thread Robert DiFalco
: core-libs-dev@openjdk.java.net Subject: Re: Faster HashMap implementation Ismael Juma wrote: > Out of curiosity, do you know if any tests have been done with an open > addressing scheme similar to Python dictionaries[1] in Java? Near the top, > there's a comment explaining how it works and the

Re: Faster HashMap implementation

2009-06-13 Thread Doug Lea
Ismael Juma wrote: Out of curiosity, do you know if any tests have been done with an open addressing scheme similar to Python dictionaries[1] in Java? Near the top, there's a comment explaining how it works and the motivation. More or less. This is roughly similar to the schemes I mentioned a

Re: Faster HashMap implementation

2009-06-13 Thread Ismael Juma
Hi Doug, Doug Lea writes: > While open-addressing is used in > IdentityHashMap (and in a very specialized form in > ThreadLocal), you cannot live with linear-probed > versions otherwise: Many user-defined hashCodes > (and some JDK-defined ones too!) are not very good. Out of curiosity, do you kn

Re: Faster HashMap implementation

2009-06-09 Thread Alex Yakovlev
Doug, On Tue, Jun 9, 2009 at 4:09 PM, Doug Lea wrote: > Alex Yakovlev wrote: > >> entrySet() iterator which is very expensive. >> >> Very big speedup would be to reuse Entry object, >> > > We had originally done this in a few classes. > The Iterator spec even allows it. However, most > usages ig

Re: Faster HashMap implementation

2009-06-09 Thread Doug Lea
Alex Yakovlev wrote: entrySet() iterator which is very expensive. Very big speedup would be to reuse Entry object, We had originally done this in a few classes. The Iterator spec even allows it. However, most usages ignore that part of the spec, and it led to a bunch of bug reports, so now all

Re: Faster HashMap implementation

2009-06-08 Thread Stephen Colebourne
2009/6/8 Doug Lea : > It is possible to use a look-aside strategy for tiny > HashMaps. I think one of the Apache commons maps > does this. FYI - Commons Collections Flat3Map: http://commons.apache.org/collections/api-3.2/org/apache/commons/collections/map/Flat3Map.html Stephen

Re: Faster HashMap implementation

2009-06-08 Thread Alex Yakovlev
Doug, On your previous message: 1) I reverted hash function, 2) did some tuniing for MapCheck benchmark - equals, hashCode, toString, and putAll (if argument have the same type) now does not use entrySet() iterator which is very expensive. Very big speedup would be to reuse Entry object, since

Re: Faster HashMap implementation

2009-06-08 Thread Doug Lea
Florian Weimer wrote: Or don't use the hash structure at all and just do a sequential search. Then the index array isn't needed at all. It is possible to use a look-aside strategy for tiny HashMaps. I think one of the Apache commons maps does this. But the idea of using open-address tables to

Re: Faster HashMap implementation

2009-06-08 Thread Florian Weimer
* Doug Lea: > I once instrumented some of the "reference workload" usages > and found that over half of the HashMaps/Hashtables had > a max 3 or fewer elements during their lifetimes. > So on average using your version will likely increase > application footprints. It seems possible to deal with >

Re: Faster HashMap implementation

2009-06-08 Thread Doug Lea
A few notes on your implementation... Alex Yakovlev wrote: Main trick is in storing all data in 2 arrays without any HashEntry objects. One is array of Objects with keys and values, another in a complex array of indices. Notice that footprint comparisons with current HashMap are load-depen

Re: Faster HashMap implementation

2009-06-08 Thread Doug Lea
Alex Yakovlev wrote: Doug, I've finished HashMap, HashSet, LinkedHashMap, and LinkedHashSet porting to my approach. I run several tests I've found in jdk/test/java/util/* that are directly related to these classes, not sure I've found all of them. There are a lot of tests around for HashMap

Re: Faster HashMap implementation

2009-06-06 Thread Alex Yakovlev
Joe, Thank you for the link. What I found strange is that non-Linked versions (simple HashSet and HashMap) passes all tests with CollectionFeature.KNOWN_ORDER TestSuite source I ran (scala): http://github.com/alex14n/CompactHashMap/blob/e2b81fb9ccf5446eff961d86eb3a0dc084a8a88c/test/Tests.scala

Re: Faster HashMap implementation

2009-06-06 Thread Joe Kearney
Alex, Might I suggest you look at the test framework in Google Collections? Extremely good coverage and only a couple of lines to set up. Tests for maps in java.util: http://www.google.com/codesearch/p?hl=en#YXcrkXezIpQ/trunk/testfw/com/google/common/co

Re: Faster HashMap implementation

2009-06-06 Thread Alex Yakovlev
Doug, I've finished HashMap, HashSet, LinkedHashMap, and LinkedHashSet porting to my approach. I run several tests I've found in jdk/test/java/util/* that are directly related to these classes, not sure I've found all of them. I have not compiled the whole jdk classes - only these 4 (removed 'Fas

Re: Faster HashMap implementation

2009-06-04 Thread Alex Yakovlev
Doug, On Thu, Jun 4, 2009 at 4:32 PM, Doug Lea wrote: > > The main one is that LinkedHashMap is declared as a > subclass of HashMap. There's not > an obvious way to add insertion- or access- ordered links > to your version. Well, it can be done by adding another index array, I've commited it t

Re: Faster HashMap implementation

2009-06-04 Thread Doug Lea
Alex Yakovlev wrote: Hello, While playing with scala programming language I've come to a HashMap implementation that appears to be faster than current java.util.HashMap, also with a lower memory footprint. This is a promising approach. The indirection using the index array makes for a bett

Re: Faster HashMap implementation

2009-06-04 Thread Ulf Zibis
See also: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862 -Ulf Am 04.06.2009 01:21, Alex Yakovlev schrieb: Hello, While playing with scala programming language I've come to a HashMap implementation that appears to be faster than current java.util.HashMap, also with a lower memory

Faster HashMap implementation

2009-06-03 Thread Alex Yakovlev
Hello, While playing with scala programming language I've come to a HashMap implementation that appears to be faster than current java.util.HashMap, also with a lower memory footprint. Alex Miller advised me to post about it in this maillist. I've put current implementation (both scala and java)