> `TreeMap` sub-maps use the default `IteratorSpliterator` implementation for > `TreeMap$EntrySetView` which is slow for some operations, because > `EntrySetView.size()` iterates all elements. This is most trivially shown by > something like `largeTreeMap.tailMap(0L, false).entrySet().limit(1).count()` > taking a long time. This showed up in my application, where it was trivial to > mitigate by switching to a for loop, but I think the fix is easy enough. > > `keySet()` does not have the same problem, as it provides a custom > `Spliterator` implementation which is not `Spliterator.SIZED`, and returns > `Long.MAX_VALUE` for `estimateSize()` (which is the recommended approach when > the size is expensive to compute). I'm *assuming* this optimization was > simply missed for the EntryIterator in the original implementation, but I > don't know for sure. > > This patch fixes the issue by using `Spliterators.spliteratorUnknownSize` for > `EntrySetView`, marking the size as unknown avoids accidentally iterating the > whole thing for simple operations. > > > Basic performance test: `map.tailMap(0L, > false).entrySet().stream().limit(1).count()` for a `TreeMap` with > `10_000_000` entries. > > Before (keySet is fast using `SubMapKeyIterator`, entrySet is slow): > > class java.util.TreeMap$KeySet > .stream().limit(1).count() took 0.046ms > spliterator = SubMapKeyIterator, estimateSize() = 9223372036854775807 > class java.util.TreeMap$AscendingSubMap$AscendingEntrySetView > .stream().limit(1).count() took 218ms > spliterator = IteratorSpliterator, estimateSize() = 9999999 > > > After (entrySet is now fast): > > class java.util.TreeMap$KeySet > .stream().limit(1).count() took 0.017ms > spliterator = SubMapKeyIterator, estimateSize() = 9223372036854775807 > class java.util.TreeMap$AscendingSubMap$AscendingEntrySetView > .stream().limit(1).count() took 0.115419ms > spliterator = IteratorSpliterator, estimateSize() = 9223372036854775807 > > > Behaviour is tested by the test cases added in > https://bugs.openjdk.org/browse/JDK-8376698. > > The Set default stream() implementation uses spliterator(), and since our > spliterator() implementation is no longer late-binding (it eagerly creates > the iterator), we override stream() to use a late-binding wrapper.
Oli Gillespie has updated the pull request incrementally with one additional commit since the last revision: Make stream() late-binding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28608/files - new: https://git.openjdk.org/jdk/pull/28608/files/144216ba..4ebc2526 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28608&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28608&range=03-04 Stats: 8 lines in 2 files changed: 7 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28608.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28608/head:pull/28608 PR: https://git.openjdk.org/jdk/pull/28608
