Thanks to Vitaly and Doug for being more concerned about offer/poll performance than I was initially. ArrayDeque is now back to using the head + tail + spare slot strategy, which keeps optimal performance for non-bulk operations, while everything else gets faster. We have new tests and new benchmarks, as a result of which new bugs were found and fixed, and this integration wave has grown rather larger than originally intended.
The JIT-friendly nested loop strategy was successful for making bulk operations on circular arrays faster, and this is used for both ArrayDeque and ArrayBlockingQueue. + /* + * VMs excel at optimizing simple array loops where indices are + * incrementing or decrementing over a valid slice, e.g. + * + * for (int i = start; i < end; i++) ... elements[i] + * + * Because in a circular array, elements are in general stored in + * two disjoint such slices, we help the VM by writing unusual + * nested loops for all traversals over the elements. + */ Bulk removal operations on array-based collections retreat to a two-pass algorithm, which is slightly slower (but still O(n)), but continues to support (questionable) predicates that reentrantly access the collection in read mode. http://cr.openjdk.java.net/~martin/webrevs/openjdk9/jsr166-jdk9-integration/ On Fri, Oct 21, 2016 at 12:32 PM, Martin Buchholz <marti...@google.com> wrote: > > > On Tue, Oct 18, 2016 at 4:18 PM, Martin Buchholz <marti...@google.com> > wrote: > >> >> >> On Tue, Oct 18, 2016 at 4:00 PM, Vitaly Davidovich <vita...@gmail.com> >> wrote: >> >>> >>>> > * Change in allocation/capacity policy. >>>> > >>>> > The removal of the power-of-two restriction, and applying a 1.5x >>>> growth >>>> > factor (same as ArrayList) seems sensible. Does this mean that the >>>> ability >>>> > to compute the proper array index by using x & (length-1) wasn't >>>> worth it? >>>> > Or at least not worth the extra tail wastage? >>>> > >>>> >>>> There's no integer division to optimize, as with hash tables. >>> >>> But it does add some new branches, doesn't it? Potentially branches that >>> aren't taken prior to JIT compilation, but taken later = deopt. >>> >> >> If it's a smidgeon slower, I don't care that much; improvement in >> flexibility and scalability are more important. >> >> Of course, I do care about algorithmic improvements to e.g. removeIf >> >> >>> Curious if you ran some benchmarks on ArrayDeque. >>> >> >> Not yet. Who would like to show how slow the code is? >> > > I revived my ancient IteratorMicroBenchmark for the latest webrev, made it > a proper jtreg test, added ArrayDeque Jobs, and verified that the new code > is measurably faster than the old code, at least for the mundane job of > iterating. > http://cr.openjdk.java.net/~martin/webrevs/openjdk9/ > jsr166-jdk9-integration/ArrayList/ > >