Good evening,

a couple of months ago a fellow Java Champion told me that he had "banned" streams at his company, or at least discouraged their use. The reason was their high allocation rate when the collection is empty. With a traditional for loop over an empty collection, hardly any objects are allocated and the loop is very fast. With a stream, we first have to build up the entire pipeline, only to discover that we didn't need all those objects and throw them away again.
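
To make the comparison concrete, here is a minimal sketch of the two styles. The class and method names are made up for illustration. On an empty list the loop does little more than obtain an iterator, whereas the stream version still creates the source stage, one object per intermediate operation and the terminal operation before finding out there is nothing to process.

```java
import java.util.List;

// Hypothetical example class, not taken from the PR.
class EmptyCollectionSum {
    // Plain loop: for an empty list the body never runs.
    static int sumWithLoop(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            if (v > 0) {
                sum += v * 2;
            }
        }
        return sum;
    }

    // Stream: the whole pipeline (source, filter stage, map stage,
    // terminal sum) is built even if the list turns out to be empty.
    static int sumWithStream(List<Integer> values) {
        return values.stream()
                .filter(v -> v > 0)
                .mapToInt(v -> v * 2)
                .sum();
    }
}
```

Both methods return the same result for any input; the difference is only in how many short-lived objects are created along the way.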

When communicating with Brian Goetz last week, I mentioned this to him and he suggested that perhaps we could have the stream() method inside Collection check whether the collection is empty and, if so, return a specialized class EmptyStream that returns "this" for methods such as filter() and map(). I spent a bit of time trying to write such a class, together with EmptyIntStream, EmptyLongStream and EmptyDoubleStream. I've also written a set of tests that compare our Empty[Int|Long|Double]Streams to what would be returned by [Int|Long|Double]Stream.empty(), and a little benchmark to demonstrate the effectiveness of the approach.
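
The idea can be sketched on a much smaller scale. The real Stream interface has far too many methods to show here, so the code below uses a hypothetical three-method MiniStream interface as a stand-in. It is not the code from the PR, only an illustration of intermediate operations returning "this" when there are no elements.

```java
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical, cut-down stand-in for java.util.stream.Stream.
interface MiniStream<T> {
    MiniStream<T> filter(Predicate<? super T> predicate);
    <R> MiniStream<R> map(Function<? super T, ? extends R> mapper);
    long count();
}

// An empty stream has no elements, so intermediate operations
// cannot change anything and can return the same object.
final class EmptyMiniStream<T> implements MiniStream<T> {
    @Override
    public MiniStream<T> filter(Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate); // keep the usual null check
        return this;                       // no new stage is allocated
    }

    @Override
    @SuppressWarnings("unchecked")
    public <R> MiniStream<R> map(Function<? super T, ? extends R> mapper) {
        Objects.requireNonNull(mapper);
        // Safe only because no element of type T or R ever flows through.
        return (MiniStream<R>) this;
    }

    @Override
    public long count() {
        return 0L;
    }
}
```

With this shape, a chain such as filter().map().count() on an empty source allocates nothing beyond the single empty stream object.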

You can see what I've done here:

https://github.com/openjdk/jdk/pull/6275

(I think I was premature in issuing the PR)

However, I have hit a brick wall with the way that streams are currently tested in the JDK. First off, there are several tests that make assumptions about how Stream is implemented and downcast it to an AbstractPipeline. Since our EmptyStream is not an AbstractPipeline, these tests fail.

Secondly, a normal stream object can only be operated on once. Methods such as filter() and map() return a new stream, and we have to continue working with that one; calling another method on the original throws an IllegalStateException. With my EmptyStream, since filter() and map() return "this", we would not get an exception if we continued using it.
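
For reference, this small sketch shows the single-use behavior of the current implementation. The class and method names are made up for illustration.

```java
import java.util.stream.Stream;

// Hypothetical example class, not taken from the PR.
class StreamReuse {
    // Returns true if operating on the same stream twice is rejected.
    static boolean reuseThrows() {
        Stream<String> source = Stream.empty();
        source.filter(s -> !s.isEmpty());   // first use: links a new stage
        try {
            source.map(String::length);     // second use of the same object
            return false;
        } catch (IllegalStateException expected) {
            // "stream has already been operated upon or closed"
            return true;
        }
    }
}
```

An EmptyStream whose filter() and map() return "this" would not detect this kind of misuse unless it tracked its own state.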

Thirdly, with a normal stream, the method parallel() changes the state of the current stream, but then returns "this". In order to keep the EmptyStream consistent with the current Stream.empty() behavior, I return StreamSupport.stream(Spliterators.emptySpliterator(), true) from the parallel() method. Thus the EmptyStream behaves in the opposite way to the current implementation: parallel() returns a new object, while filter() and map() return "this". The Javadocs say that the parallel() method "may return itself", but it does not have to, whereas the Javadoc for filter() seems to suggest that the result is a new stream object, but it does not prescribe that it absolutely has to be.
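
The two variants of parallel() described above can be put side by side. This is only a sketch with made-up class and method names. That the current implementation returns the same object is an implementation detail, not something the Javadoc promises.

```java
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical example class, not taken from the PR.
class ParallelEmpty {
    // Current behavior: parallel() flips a flag on the source stage and,
    // in the existing implementation, returns the same object.
    static Stream<String> viaParallel() {
        Stream<String> s = Stream.empty();
        return s.parallel();
    }

    // What the EmptyStream returns from parallel(): a fresh,
    // regular parallel stream over an empty spliterator.
    static Stream<String> viaStreamSupport() {
        return StreamSupport.stream(Spliterators.<String>emptySpliterator(), true);
    }
}
```

Either way the caller ends up with an empty stream for which isParallel() is true; only the object identity differs.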

How important is the white-box testing with the streams? And could we perhaps make special cases for empty streams?

Regards

Heinz
--
Dr Heinz M. Kabutz (PhD CompSci)
Author of "The Java™ Specialists' Newsletter" - www.javaspecialists.eu
Java Champion - www.javachampions.org
JavaOne Rock Star Speaker
Tel: +30 69 75 595 262
Skype: kabutz
