On Fri, 5 Nov 2021 12:53:46 GMT, kabutz <d...@openjdk.java.net> wrote:
> This is a draft proposal for how we could improve stream performance for the > case where the streams are empty. Empty collections are common-place. If we > iterate over them with an Iterator, we would have to create one small > Iterator object (which could often be eliminated) and if it is empty we are > done. However, with Streams we first have to build up the entire pipeline, > until we realize that there is no work to do. With this example, we change > Collection#stream() to first check if the collection is empty, and if it is, > we simply return an EmptyStream. We also have EmptyIntStream, EmptyLongStream > and EmptyDoubleStream. We have taken great care for these to have the same > characteristics and behaviour as the streams returned by Stream.empty(), > IntStream.empty(), etc. > > Some of the JDK tests fail with this, due to ClassCastExceptions (our > EmptyStream is not an AbstractPipeline) and AssertionError, since we can call > some methods repeatedly on the stream without it failing. On the plus side, > creating a complex stream on an empty stream gives us upwards of 50x increase > in performance due to a much smaller object allocation rate. This PR includes > the code for the change, unit tests and also a JMH benchmark to demonstrate > the improvement. I've added far more JMH tests to check for stream size of [0, 1, 10, 100], then different types of streams, from minimal to complex, then five different collections ArrayList, ConcurrentLinkedQueue, ConcurrentSkipListSet, CopyOnWriteArrayList, ConcurrentHashMap.newKeySet() and lastly with Stream.empty() that has the new behavior and StreamSupport.stream(...) that was the old behavior. With all this multiplication of test options, it will take a couple of days to run on my server. Will let you know if anything surprising pops up. Thanks for your excellent suggestions. It seemed that we cannot get away from making some objects. I've got a new version in the works that makes EmptyStream instances each time, with their own state. The performance is still good. For a simple stream pipeline, it is roughly twice as fast for an empty stream. There is no noticeable slow-down for non-empty streams. `EmptyStreams` now follow the same behavior as we would get if we created the stream with `StreamSupport.stream(collection.spliterator(), false)`. For example, we cannot call `filter()` / `map()` etc. twice on the same stream. We can call `unordered()` twice on the same stream, but only if it was not `ORDERED` to begin with. Streams are not created lazily yet in my updated version, but I'm hoping it is a step in the right direction. I'm running an extensive JMH suite overnight to compare object allocation rates and throughput for the streams. Here are the maximum throughput numbers for the JMH benchmark for various lengths, collections, type of stream decorations and whether it is the old style stream creation or the new EmptyStream. We also see the speedup with the new system. There are some strange effects where we see small differences once we get past 0 length, which is most likely to be explained because of noise in the benchmark. The speedup in the benchmark for empty streams seems to be from about 2x for minimal stream decoration through to about 9x for the more complex kind. a_length, b_typeOfCollection, c_typeOfStreamDecoration, d_streamCreation, max op/ms, speedup 0, ArrayList, minimal, old, 20972.400 0, ArrayList, minimal, new, 44866.020, 2.13x 0, ArrayList, basic, old, 19219.077 0, ArrayList, basic, new, 47574.102, 2.47x 0, ArrayList, complex, old, 9425.247 0, ArrayList, complex, new, 44850.655, 4.75x 0, ArrayList, crossover, old, 4749.495 0, ArrayList, crossover, new, 44550.110, 9.37x 0, ConcurrentLinkedQueue, minimal, old, 20146.717 0, ConcurrentLinkedQueue, minimal, new, 36154.586, 1.79x 0, ConcurrentLinkedQueue, basic, old, 18107.648 0, ConcurrentLinkedQueue, basic, new, 36094.319, 1.99x 0, ConcurrentLinkedQueue, complex, old, 9033.936 0, ConcurrentLinkedQueue, complex, new, 36089.520, 3.99x 0, ConcurrentLinkedQueue, crossover, old, 4555.928 0, ConcurrentLinkedQueue, crossover, new, 35932.132, 7.88x 0, ConcurrentSkipListSet, minimal, old, 20391.479 0, ConcurrentSkipListSet, minimal, new, 40259.694, 1.97x 0, ConcurrentSkipListSet, basic, old, 18616.301 0, ConcurrentSkipListSet, basic, new, 40914.132, 2.19x 0, ConcurrentSkipListSet, complex, old, 9853.569 0, ConcurrentSkipListSet, complex, new, 40150.606, 4.07x 0, ConcurrentSkipListSet, crossover, old, 4632.691 0, ConcurrentSkipListSet, crossover, new, 40151.534, 8.66x 0, CopyOnWriteArrayList, minimal, old, 19952.257 0, CopyOnWriteArrayList, minimal, new, 36884.796, 1.84x 0, CopyOnWriteArrayList, basic, old, 18228.390 0, CopyOnWriteArrayList, basic, new, 36993.146, 2.02x 0, CopyOnWriteArrayList, complex, old, 9635.404 0, CopyOnWriteArrayList, complex, new, 36862.714, 3.82x 0, CopyOnWriteArrayList, crossover, old, 4609.905 0, CopyOnWriteArrayList, crossover, new, 36043.686, 7.81x 0, ConcurrentHashMap, minimal, old, 20158.432 0, ConcurrentHashMap, minimal, new, 42272.848, 2.09x 0, ConcurrentHashMap, basic, old, 19308.950 0, ConcurrentHashMap, basic, new, 42475.514, 2.19x 0, ConcurrentHashMap, complex, old, 9864.942 0, ConcurrentHashMap, complex, new, 42195.862, 4.27x 0, ConcurrentHashMap, crossover, old, 4721.981 0, ConcurrentHashMap, crossover, new, 42055.634, 8.90x 1, ArrayList, minimal, old, 18999.510 1, ArrayList, minimal, new, 19116.471, 1.00x 1, ArrayList, basic, old, 17698.417 1, ArrayList, basic, new, 17246.340, .97x 1, ArrayList, complex, old, 6788.291 1, ArrayList, complex, new, 7456.593, 1.09x 1, ArrayList, crossover, old, 3829.277 1, ArrayList, crossover, new, 3798.721, .99x 1, ConcurrentLinkedQueue, minimal, old, 18535.251 1, ConcurrentLinkedQueue, minimal, new, 18661.126, 1.00x 1, ConcurrentLinkedQueue, basic, old, 12795.769 1, ConcurrentLinkedQueue, basic, new, 16666.953, 1.30x 1, ConcurrentLinkedQueue, complex, old, 5890.533 1, ConcurrentLinkedQueue, complex, new, 6541.393, 1.11x 1, ConcurrentLinkedQueue, crossover, old, 3615.558 1, ConcurrentLinkedQueue, crossover, new, 3738.650, 1.03x 1, ConcurrentSkipListSet, minimal, old, 18574.394 1, ConcurrentSkipListSet, minimal, new, 17797.298, .95x 1, ConcurrentSkipListSet, basic, old, 16500.841 1, ConcurrentSkipListSet, basic, new, 15993.662, .96x 1, ConcurrentSkipListSet, complex, old, 6101.255 1, ConcurrentSkipListSet, complex, new, 6196.274, 1.01x 1, ConcurrentSkipListSet, crossover, old, 3965.991 1, ConcurrentSkipListSet, crossover, new, 3819.579, .96x 1, CopyOnWriteArrayList, minimal, old, 18597.613 1, CopyOnWriteArrayList, minimal, new, 18700.468, 1.00x 1, CopyOnWriteArrayList, basic, old, 16826.926 1, CopyOnWriteArrayList, basic, new, 12694.816, .75x 1, CopyOnWriteArrayList, complex, old, 7089.013 1, CopyOnWriteArrayList, complex, new, 6919.550, .97x 1, CopyOnWriteArrayList, crossover, old, 4100.437 1, CopyOnWriteArrayList, crossover, new, 3773.781, .92x 1, ConcurrentHashMap, minimal, old, 13868.779 1, ConcurrentHashMap, minimal, new, 13791.894, .99x 1, ConcurrentHashMap, basic, old, 12667.393 1, ConcurrentHashMap, basic, new, 9863.169, .77x 1, ConcurrentHashMap, complex, old, 5885.494 1, ConcurrentHashMap, complex, new, 5412.376, .91x 1, ConcurrentHashMap, crossover, old, 3323.164 1, ConcurrentHashMap, crossover, new, 3317.024, .99x 10, ArrayList, minimal, old, 17633.130 10, ArrayList, minimal, new, 17561.076, .99x 10, ArrayList, basic, old, 10293.698 10, ArrayList, basic, new, 9284.825, .90x 10, ArrayList, complex, old, 3757.234 10, ArrayList, complex, new, 3805.699, 1.01x 10, ArrayList, crossover, old, 2355.630 10, ArrayList, crossover, new, 2290.404, .97x 10, ConcurrentLinkedQueue, minimal, old, 13414.620 10, ConcurrentLinkedQueue, minimal, new, 9801.745, .73x 10, ConcurrentLinkedQueue, basic, old, 7155.959 10, ConcurrentLinkedQueue, basic, new, 8075.986, 1.12x 10, ConcurrentLinkedQueue, complex, old, 3343.567 10, ConcurrentLinkedQueue, complex, new, 3516.111, 1.05x 10, ConcurrentLinkedQueue, crossover, old, 2288.908 10, ConcurrentLinkedQueue, crossover, new, 2269.025, .99x 10, ConcurrentSkipListSet, minimal, old, 12725.965 10, ConcurrentSkipListSet, minimal, new, 12450.308, .97x 10, ConcurrentSkipListSet, basic, old, 9063.070 10, ConcurrentSkipListSet, basic, new, 8798.386, .97x 10, ConcurrentSkipListSet, complex, old, 3588.505 10, ConcurrentSkipListSet, complex, new, 3940.039, 1.09x 10, ConcurrentSkipListSet, crossover, old, 2045.747 10, ConcurrentSkipListSet, crossover, new, 2266.930, 1.10x 10, CopyOnWriteArrayList, minimal, old, 13787.391 10, CopyOnWriteArrayList, minimal, new, 13914.875, 1.00x 10, CopyOnWriteArrayList, basic, old, 8697.165 10, CopyOnWriteArrayList, basic, new, 10222.606, 1.17x 10, CopyOnWriteArrayList, complex, old, 3614.631 10, CopyOnWriteArrayList, complex, new, 3406.912, .94x 10, CopyOnWriteArrayList, crossover, old, 2325.973 10, CopyOnWriteArrayList, crossover, new, 2345.024, 1.00x 10, ConcurrentHashMap, minimal, old, 9591.056 10, ConcurrentHashMap, minimal, new, 9145.649, .95x 10, ConcurrentHashMap, basic, old, 6243.206 10, ConcurrentHashMap, basic, new, 7104.218, 1.13x 10, ConcurrentHashMap, complex, old, 3024.104 10, ConcurrentHashMap, complex, new, 2937.575, .97x 10, ConcurrentHashMap, crossover, old, 2082.300 10, ConcurrentHashMap, crossover, new, 1893.942, .90x 100, ArrayList, minimal, old, 4263.776 100, ArrayList, minimal, new, 4305.783, 1.00x 100, ArrayList, basic, old, 1615.447 100, ArrayList, basic, new, 2075.383, 1.28x 100, ArrayList, complex, old, 557.891 100, ArrayList, complex, new, 544.918, .97x 100, ArrayList, crossover, old, 428.983 100, ArrayList, crossover, new, 427.370, .99x 100, ConcurrentLinkedQueue, minimal, old, 1712.969 100, ConcurrentLinkedQueue, minimal, new, 1695.046, .98x 100, ConcurrentLinkedQueue, basic, old, 1681.057 100, ConcurrentLinkedQueue, basic, new, 1515.314, .90x 100, ConcurrentLinkedQueue, complex, old, 608.716 100, ConcurrentLinkedQueue, complex, new, 601.803, .98x 100, ConcurrentLinkedQueue, crossover, old, 414.658 100, ConcurrentLinkedQueue, crossover, new, 420.125, 1.01x 100, ConcurrentSkipListSet, minimal, old, 2938.635 100, ConcurrentSkipListSet, minimal, new, 3037.409, 1.03x 100, ConcurrentSkipListSet, basic, old, 1434.130 100, ConcurrentSkipListSet, basic, new, 1560.646, 1.08x 100, ConcurrentSkipListSet, complex, old, 599.722 100, ConcurrentSkipListSet, complex, new, 608.418, 1.01x 100, ConcurrentSkipListSet, crossover, old, 401.228 100, ConcurrentSkipListSet, crossover, new, 400.425, .99x 100, CopyOnWriteArrayList, minimal, old, 3778.017 100, CopyOnWriteArrayList, minimal, new, 3781.097, 1.00x 100, CopyOnWriteArrayList, basic, old, 2146.891 100, CopyOnWriteArrayList, basic, new, 2149.660, 1.00x 100, CopyOnWriteArrayList, complex, old, 631.061 100, CopyOnWriteArrayList, complex, new, 621.835, .98x 100, CopyOnWriteArrayList, crossover, old, 424.637 100, CopyOnWriteArrayList, crossover, new, 425.788, 1.00x 100, ConcurrentHashMap, minimal, old, 1222.579 100, ConcurrentHashMap, minimal, new, 1221.157, .99x 100, ConcurrentHashMap, basic, old, 851.419 100, ConcurrentHashMap, basic, new, 859.273, 1.00x 100, ConcurrentHashMap, complex, old, 386.231 100, ConcurrentHashMap, complex, new, 391.380, 1.01x 100, ConcurrentHashMap, crossover, old, 343.521 100, ConcurrentHashMap, crossover, new, 339.079, .98x ------------- PR: https://git.openjdk.java.net/jdk/pull/6275