On Fri, 5 Nov 2021 12:53:46 GMT, kabutz <d...@openjdk.java.net> wrote:

> This is a draft proposal for how we could improve stream performance for the 
> case where the streams are empty. Empty collections are common-place. If we 
> iterate over them with an Iterator, we would have to create one small 
> Iterator object (which could often be eliminated) and if it is empty we are 
> done. However, with Streams we first have to build up the entire pipeline, 
> until we realize that there is no work to do. With this example, we change 
> Collection#stream() to first check if the collection is empty, and if it is, 
> we simply return an EmptyStream. We also have EmptyIntStream, EmptyLongStream 
> and EmptyDoubleStream. We have taken great care for these to have the same 
> characteristics and behaviour as the streams returned by Stream.empty(), 
> IntStream.empty(), etc. 
> 
> Some of the JDK tests fail with this, due to ClassCastExceptions (our 
> EmptyStream is not an AbstractPipeline) and AssertionError, since we can call 
> some methods repeatedly on the stream without it failing. On the plus side, 
> creating a complex stream on an empty stream gives us upwards of 50x increase 
> in performance due to a much smaller object allocation rate. This PR includes 
> the code for the change, unit tests and also a JMH benchmark to demonstrate 
> the improvement.

I've added far more JMH tests to check for stream size of [0, 1, 10, 100], then 
different types of streams, from minimal to complex, then five different 
collections ArrayList, ConcurrentLinkedQueue, ConcurrentSkipListSet, 
CopyOnWriteArrayList, ConcurrentHashMap.newKeySet() and lastly with 
Stream.empty() that has the new behavior and StreamSupport.stream(...) that was 
the old behavior.

With all this multiplication of test options, it will take a couple of days to 
run on my server. Will let you know if anything surprising pops up.

Thanks for your excellent suggestions. It seemed that we cannot get away from 
making some objects. I've got a new version in the works that makes EmptyStream 
instances each time, with their own state. The performance is still good. For a 
simple stream pipeline, it is roughly twice as fast for an empty stream. There 
is no noticeable slow-down for non-empty streams.

`EmptyStreams` now follow the same behavior as we would get if we created the 
stream with `StreamSupport.stream(collection.spliterator(), false)`. For 
example, we cannot call `filter()` / `map()` etc. twice on the same stream. We 
can call `unordered()` twice on the same stream, but only if it was not 
`ORDERED` to begin with.

Streams are not created lazily yet in my updated version, but I'm hoping it is 
a step in the right direction.

I'm running an extensive JMH suite overnight to compare object allocation rates 
and throughput for the streams.

Here are the maximum throughput numbers for the JMH benchmark for various 
lengths, collections, type of stream decorations and whether it is the old 
style stream creation or the new EmptyStream. We also see the speedup with the 
new system. There are some strange effects where we see small differences once 
we get past 0 length, which is most likely to be explained because of noise in 
the benchmark.

The speedup in the benchmark for empty streams seems to be from about 2x for 
minimal stream decoration through to about 9x for the more complex kind.


a_length, b_typeOfCollection, c_typeOfStreamDecoration, d_streamCreation, max 
op/ms, speedup
0, ArrayList, minimal, old, 20972.400
0, ArrayList, minimal, new, 44866.020, 2.13x
0, ArrayList, basic, old, 19219.077
0, ArrayList, basic, new, 47574.102, 2.47x
0, ArrayList, complex, old, 9425.247
0, ArrayList, complex, new, 44850.655, 4.75x
0, ArrayList, crossover, old, 4749.495
0, ArrayList, crossover, new, 44550.110, 9.37x
0, ConcurrentLinkedQueue, minimal, old, 20146.717
0, ConcurrentLinkedQueue, minimal, new, 36154.586, 1.79x
0, ConcurrentLinkedQueue, basic, old, 18107.648
0, ConcurrentLinkedQueue, basic, new, 36094.319, 1.99x
0, ConcurrentLinkedQueue, complex, old, 9033.936
0, ConcurrentLinkedQueue, complex, new, 36089.520, 3.99x
0, ConcurrentLinkedQueue, crossover, old, 4555.928
0, ConcurrentLinkedQueue, crossover, new, 35932.132, 7.88x
0, ConcurrentSkipListSet, minimal, old, 20391.479
0, ConcurrentSkipListSet, minimal, new, 40259.694, 1.97x
0, ConcurrentSkipListSet, basic, old, 18616.301
0, ConcurrentSkipListSet, basic, new, 40914.132, 2.19x
0, ConcurrentSkipListSet, complex, old, 9853.569
0, ConcurrentSkipListSet, complex, new, 40150.606, 4.07x
0, ConcurrentSkipListSet, crossover, old, 4632.691
0, ConcurrentSkipListSet, crossover, new, 40151.534, 8.66x
0, CopyOnWriteArrayList, minimal, old, 19952.257
0, CopyOnWriteArrayList, minimal, new, 36884.796, 1.84x
0, CopyOnWriteArrayList, basic, old, 18228.390
0, CopyOnWriteArrayList, basic, new, 36993.146, 2.02x
0, CopyOnWriteArrayList, complex, old, 9635.404
0, CopyOnWriteArrayList, complex, new, 36862.714, 3.82x
0, CopyOnWriteArrayList, crossover, old, 4609.905
0, CopyOnWriteArrayList, crossover, new, 36043.686, 7.81x
0, ConcurrentHashMap, minimal, old, 20158.432
0, ConcurrentHashMap, minimal, new, 42272.848, 2.09x
0, ConcurrentHashMap, basic, old, 19308.950
0, ConcurrentHashMap, basic, new, 42475.514, 2.19x
0, ConcurrentHashMap, complex, old, 9864.942
0, ConcurrentHashMap, complex, new, 42195.862, 4.27x
0, ConcurrentHashMap, crossover, old, 4721.981
0, ConcurrentHashMap, crossover, new, 42055.634, 8.90x

1, ArrayList, minimal, old, 18999.510
1, ArrayList, minimal, new, 19116.471, 1.00x
1, ArrayList, basic, old, 17698.417
1, ArrayList, basic, new, 17246.340, .97x
1, ArrayList, complex, old, 6788.291
1, ArrayList, complex, new, 7456.593, 1.09x
1, ArrayList, crossover, old, 3829.277
1, ArrayList, crossover, new, 3798.721, .99x
1, ConcurrentLinkedQueue, minimal, old, 18535.251
1, ConcurrentLinkedQueue, minimal, new, 18661.126, 1.00x
1, ConcurrentLinkedQueue, basic, old, 12795.769
1, ConcurrentLinkedQueue, basic, new, 16666.953, 1.30x
1, ConcurrentLinkedQueue, complex, old, 5890.533
1, ConcurrentLinkedQueue, complex, new, 6541.393, 1.11x
1, ConcurrentLinkedQueue, crossover, old, 3615.558
1, ConcurrentLinkedQueue, crossover, new, 3738.650, 1.03x
1, ConcurrentSkipListSet, minimal, old, 18574.394
1, ConcurrentSkipListSet, minimal, new, 17797.298, .95x
1, ConcurrentSkipListSet, basic, old, 16500.841
1, ConcurrentSkipListSet, basic, new, 15993.662, .96x
1, ConcurrentSkipListSet, complex, old, 6101.255
1, ConcurrentSkipListSet, complex, new, 6196.274, 1.01x
1, ConcurrentSkipListSet, crossover, old, 3965.991
1, ConcurrentSkipListSet, crossover, new, 3819.579, .96x
1, CopyOnWriteArrayList, minimal, old, 18597.613
1, CopyOnWriteArrayList, minimal, new, 18700.468, 1.00x
1, CopyOnWriteArrayList, basic, old, 16826.926
1, CopyOnWriteArrayList, basic, new, 12694.816, .75x
1, CopyOnWriteArrayList, complex, old, 7089.013
1, CopyOnWriteArrayList, complex, new, 6919.550, .97x
1, CopyOnWriteArrayList, crossover, old, 4100.437
1, CopyOnWriteArrayList, crossover, new, 3773.781, .92x
1, ConcurrentHashMap, minimal, old, 13868.779
1, ConcurrentHashMap, minimal, new, 13791.894, .99x
1, ConcurrentHashMap, basic, old, 12667.393
1, ConcurrentHashMap, basic, new, 9863.169, .77x
1, ConcurrentHashMap, complex, old, 5885.494
1, ConcurrentHashMap, complex, new, 5412.376, .91x
1, ConcurrentHashMap, crossover, old, 3323.164
1, ConcurrentHashMap, crossover, new, 3317.024, .99x

10, ArrayList, minimal, old, 17633.130
10, ArrayList, minimal, new, 17561.076, .99x
10, ArrayList, basic, old, 10293.698
10, ArrayList, basic, new, 9284.825, .90x
10, ArrayList, complex, old, 3757.234
10, ArrayList, complex, new, 3805.699, 1.01x
10, ArrayList, crossover, old, 2355.630
10, ArrayList, crossover, new, 2290.404, .97x
10, ConcurrentLinkedQueue, minimal, old, 13414.620
10, ConcurrentLinkedQueue, minimal, new, 9801.745, .73x
10, ConcurrentLinkedQueue, basic, old, 7155.959
10, ConcurrentLinkedQueue, basic, new, 8075.986, 1.12x
10, ConcurrentLinkedQueue, complex, old, 3343.567
10, ConcurrentLinkedQueue, complex, new, 3516.111, 1.05x
10, ConcurrentLinkedQueue, crossover, old, 2288.908
10, ConcurrentLinkedQueue, crossover, new, 2269.025, .99x
10, ConcurrentSkipListSet, minimal, old, 12725.965
10, ConcurrentSkipListSet, minimal, new, 12450.308, .97x
10, ConcurrentSkipListSet, basic, old, 9063.070
10, ConcurrentSkipListSet, basic, new, 8798.386, .97x
10, ConcurrentSkipListSet, complex, old, 3588.505
10, ConcurrentSkipListSet, complex, new, 3940.039, 1.09x
10, ConcurrentSkipListSet, crossover, old, 2045.747
10, ConcurrentSkipListSet, crossover, new, 2266.930, 1.10x
10, CopyOnWriteArrayList, minimal, old, 13787.391
10, CopyOnWriteArrayList, minimal, new, 13914.875, 1.00x
10, CopyOnWriteArrayList, basic, old, 8697.165
10, CopyOnWriteArrayList, basic, new, 10222.606, 1.17x
10, CopyOnWriteArrayList, complex, old, 3614.631
10, CopyOnWriteArrayList, complex, new, 3406.912, .94x
10, CopyOnWriteArrayList, crossover, old, 2325.973
10, CopyOnWriteArrayList, crossover, new, 2345.024, 1.00x
10, ConcurrentHashMap, minimal, old, 9591.056
10, ConcurrentHashMap, minimal, new, 9145.649, .95x
10, ConcurrentHashMap, basic, old, 6243.206
10, ConcurrentHashMap, basic, new, 7104.218, 1.13x
10, ConcurrentHashMap, complex, old, 3024.104
10, ConcurrentHashMap, complex, new, 2937.575, .97x
10, ConcurrentHashMap, crossover, old, 2082.300
10, ConcurrentHashMap, crossover, new, 1893.942, .90x

100, ArrayList, minimal, old, 4263.776
100, ArrayList, minimal, new, 4305.783, 1.00x
100, ArrayList, basic, old, 1615.447
100, ArrayList, basic, new, 2075.383, 1.28x
100, ArrayList, complex, old, 557.891
100, ArrayList, complex, new, 544.918, .97x
100, ArrayList, crossover, old, 428.983
100, ArrayList, crossover, new, 427.370, .99x
100, ConcurrentLinkedQueue, minimal, old, 1712.969
100, ConcurrentLinkedQueue, minimal, new, 1695.046, .98x
100, ConcurrentLinkedQueue, basic, old, 1681.057
100, ConcurrentLinkedQueue, basic, new, 1515.314, .90x
100, ConcurrentLinkedQueue, complex, old, 608.716
100, ConcurrentLinkedQueue, complex, new, 601.803, .98x
100, ConcurrentLinkedQueue, crossover, old, 414.658
100, ConcurrentLinkedQueue, crossover, new, 420.125, 1.01x
100, ConcurrentSkipListSet, minimal, old, 2938.635
100, ConcurrentSkipListSet, minimal, new, 3037.409, 1.03x
100, ConcurrentSkipListSet, basic, old, 1434.130
100, ConcurrentSkipListSet, basic, new, 1560.646, 1.08x
100, ConcurrentSkipListSet, complex, old, 599.722
100, ConcurrentSkipListSet, complex, new, 608.418, 1.01x
100, ConcurrentSkipListSet, crossover, old, 401.228
100, ConcurrentSkipListSet, crossover, new, 400.425, .99x
100, CopyOnWriteArrayList, minimal, old, 3778.017
100, CopyOnWriteArrayList, minimal, new, 3781.097, 1.00x
100, CopyOnWriteArrayList, basic, old, 2146.891
100, CopyOnWriteArrayList, basic, new, 2149.660, 1.00x
100, CopyOnWriteArrayList, complex, old, 631.061
100, CopyOnWriteArrayList, complex, new, 621.835, .98x
100, CopyOnWriteArrayList, crossover, old, 424.637
100, CopyOnWriteArrayList, crossover, new, 425.788, 1.00x
100, ConcurrentHashMap, minimal, old, 1222.579
100, ConcurrentHashMap, minimal, new, 1221.157, .99x
100, ConcurrentHashMap, basic, old, 851.419
100, ConcurrentHashMap, basic, new, 859.273, 1.00x
100, ConcurrentHashMap, complex, old, 386.231
100, ConcurrentHashMap, complex, new, 391.380, 1.01x
100, ConcurrentHashMap, crossover, old, 343.521
100, ConcurrentHashMap, crossover, new, 339.079, .98x

-------------

PR: https://git.openjdk.java.net/jdk/pull/6275

Reply via email to