Overall, this kind of change seems barely worth doing. ---
This change: // Let gc do its work - for (int i = 0; i < size; i++) - elementData[i] = null; + Arrays.fill(elementData, null); changes the performance of clear from O(size) to O(capacity), which is bad, and unrelated to the "empty collection optimization". --- + if(elementData == EMPTY_ELEMENTDATA) { Please use standard jdk style - space after keywords such as "if". --- 183 public void ensureCapacity(int minCapacity) { 184 if (minCapacity > 0) 185 ensureExplicitCapacity(minCapacity); 186 } 187 188 private void ensureCapacityInternal(int minCapacity) { 189 if(elementData == EMPTY_ELEMENTDATA) { 190 minCapacity = Math.max(10, minCapacity); 191 } 192 193 ensureExplicitCapacity(minCapacity); 194 } It looks to me like, if the user does x = new ArrayList(); x.ensureCapacity(2); then the capacity will be 2, not 10, an observable change from the previous implementation, and sort-of contradicting the spec of ArrayList() --- 604 Arrays.fill(elementData, newSize, size, null); In performance-critical code I would avoid Arrays.fill because it adds a bit of overhead (unless it's intrinsified, which I don't think it is). --- If we are willing to make small incompatible changes, how about when serializing, storing capacity as the size, so that reconstituted ArrayLists are pre-trimmed to size? --- I still believe that with the help of the gc team, one can cook up a trim-to-size when gc-ing fix, but that's going to be very hard to implement. --- On Mon, Apr 1, 2013 at 7:00 PM, Mike Duigou <mike.dui...@oracle.com> wrote: > Hello all; > > I have posted an updated version of the empty ArrayList and HashMap patch. > > http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/ > > This revised implementation introduces *no new fields* to either class. > For ArrayList the lazy allocation of the backing array occurs only if the > list is created at default size. According to our performance analysis > team, approximately 85% of ArrayList instances are created at default size > so this optimization will be valid for an overwhelming majority of cases. > > For HashMap, creative use is made of the threshold field to track the > requested initial size until the bucket array is needed. On the read side > the empty map case is tested with isEmpty(). On the write size a comparison > of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket > array. In readObject there's a little more work to try to choose an > efficient initial capacity. > > Mike > > On Mar 26 2013, at 17:25 , Mike Duigou wrote: > > > Hello all; > > > > This is a review for optimization work that came out of internal > analysis of Oracle's Java applications. It's based upon analysis that shows > that in large applications as much as 10% of maps and lists are initialized > but never receive any entries. A smaller number spend a large proportion of > their lifetime empty. We've found similar results across other workloads as > well. This patch is not a substitute for pre-sizing your collections and > maps--doing so will *always* have better results. > > > > This patch extends HashMap and ArrayList to provide special handling for > newly created instances that avoids creating the backing array until > needed. There is a very small additional cost for detecting when to inflate > the map or list that is measurable in interpreted tests but disappears in > JITed code. > > > > http://cr.openjdk.java.net/~mduigou/JDK-7143928/0/webrev/ > > > > We expect that should this code prove successful in Java 8 it will be > backported to Java 7 updates. > > > > The unit test may appear to be somewhat unrelated. It was created after > resolving a bug in an early version of this patch to detect the issue > encountered (LinkedHashMap.init() was not being called in readObject() when > the map was empty). > > > > Mike > >