On Fri, 5 Nov 2021 12:53:46 GMT, kabutz <d...@openjdk.java.net> wrote:

> This is a draft proposal for how we could improve stream performance for the 
> case where the streams are empty. Empty collections are common-place. If we 
> iterate over them with an Iterator, we would have to create one small 
> Iterator object (which could often be eliminated) and if it is empty we are 
> done. However, with Streams we first have to build up the entire pipeline, 
> until we realize that there is no work to do. With this example, we change 
> Collection#stream() to first check if the collection is empty, and if it is, 
> we simply return an EmptyStream. We also have EmptyIntStream, EmptyLongStream 
> and EmptyDoubleStream. We have taken great care for these to have the same 
> characteristics and behaviour as the streams returned by Stream.empty(), 
> IntStream.empty(), etc. 
> 
> Some of the JDK tests fail with this, due to ClassCastExceptions (our 
> EmptyStream is not an AbstractPipeline) and AssertionError, since we can call 
> some methods repeatedly on the stream without it failing. On the plus side, 
> creating a complex stream on an empty stream gives us upwards of 50x increase 
> in performance due to a much smaller object allocation rate. This PR includes 
> the code for the change, unit tests and also a JMH benchmark to demonstrate 
> the improvement.

Streams are closeable, and a terminal operation may be invoked on a given 
stream only once. Thus, shouldn't the third line in each of the examples below 
throw `IllegalStateException`?

        Stream<Object> empty = Stream.empty();
        System.out.println(empty.count());
        System.out.println(empty.count());

        Stream<Object> empty = Stream.empty();
        empty.close();
        System.out.println(empty.count());
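
For comparison, here is a small self-contained check of how the existing 
`Stream.empty()` handles the two cases above. The class and method names are 
mine, purely for illustration. With the current AbstractPipeline-based 
implementation I would expect both methods to return true; note that the 
Stream specification only says an implementation *may* throw here.

```java
import java.util.stream.Stream;

public class EmptyStreamReuse {
    // True if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOpThrows() {
        Stream<Object> empty = Stream.empty();
        empty.count();
        try {
            empty.count();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // True if a terminal operation on an already closed stream is rejected.
    static boolean terminalOpAfterCloseThrows() {
        Stream<Object> empty = Stream.empty();
        empty.close();
        try {
            empty.count();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOpThrows());
        System.out.println(terminalOpAfterCloseThrows());
    }
}
```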

I don't think that we can remove all the state from an empty stream, but we can 
likely make it smaller.
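
To illustrate what "smaller" could mean, here is a rough sketch that keeps a 
single flag, which is enough to reject reuse and use after close. This is not 
the code in the PR; `MinimalEmptyStreamState` and its methods are hypothetical 
names, and a real implementation would also need somewhere to keep close 
handlers registered via onClose().

```java
public class MinimalEmptyStreamState {
    // Hypothetical single flag: set once the stream was operated upon or closed.
    private boolean linkedOrConsumed;

    // Would be called at the start of every intermediate or terminal operation.
    public void checkAndMarkUsed() {
        if (linkedOrConsumed) {
            throw new IllegalStateException(
                "stream has already been operated upon or closed");
        }
        linkedOrConsumed = true;
    }

    // count() on an empty stream needs no pipeline, only the reuse check.
    public long count() {
        checkAndMarkUsed();
        return 0L;
    }

    // Closing marks the stream as unusable; close handlers are omitted here.
    public void close() {
        linkedOrConsumed = true;
    }
}
```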

src/java.base/share/classes/java/util/Collection.java line 743:

> 741:      */
> 742:     default Stream<E> stream() {
> 743:         if (isEmpty()) return Stream.empty();

The net effect of this change might depend on your workload. If you call 
stream() on empty collections that have cheap isEmpty(), this change will 
likely improve performance and reduce waste. However, this same change might do 
the opposite if some of your collections aren't empty or have costly isEmpty(). 
It would be good to have benchmarks for different workloads.
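
As a rough illustration of the extra work, and not a benchmark, the sketch 
below uses a hypothetical collection that inherits AbstractCollection's 
isEmpty() (which is size() == 0) and counts how often size() runs. 
`streamWithEmptyCheck` is a stand-in I made up to mimic the proposed 
Collection#stream(); for a non-empty collection it pays for one additional 
size() call before any stream is built.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

public class IsEmptyCostSketch {
    // Hypothetical collection: isEmpty() is inherited from AbstractCollection
    // and delegates to size(), so every isEmpty() call is recorded here.
    static final class CountingCollection<E> extends AbstractCollection<E> {
        private final List<E> backing;
        int sizeCalls;

        CountingCollection(List<E> backing) {
            this.backing = backing;
        }

        @Override
        public Iterator<E> iterator() {
            return backing.iterator();
        }

        @Override
        public int size() {
            sizeCalls++;
            return backing.size();
        }
    }

    // Stand-in for the proposed change: check isEmpty() before building a stream.
    static <E> Stream<E> streamWithEmptyCheck(Collection<E> c) {
        if (c.isEmpty()) {
            return Stream.empty();
        }
        return c.stream();
    }

    public static void main(String[] args) {
        CountingCollection<String> nonEmpty =
            new CountingCollection<>(Arrays.asList("a", "b"));
        streamWithEmptyCheck(nonEmpty);
        System.out.println("size() calls caused by the check: " + nonEmpty.sizeCalls);
    }
}
```

If size() were O(n) for such a collection, that call would be paid on every 
stream() invocation, which is why the result depends on the workload.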

-------------

PR: https://git.openjdk.java.net/jdk/pull/6275
