Good evening,
a couple of months ago a fellow Java Champion told me that he had
"banned" streams at his company, or at least discouraged their use. The
reason was their high allocation rates with empty collections. With
traditional for loops, if the collection is empty, then hardly any
objects are allocated and it is very fast. But if we have a stream, then
we first have to build up the entire pipeline, only to discover that we
didn't need all those objects and throw them away again.
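To make the comparison concrete, here is a minimal sketch of the two styles applied to an empty list. The class and method names are made up for illustration and are not taken from any real code base. Both methods return 0 for an empty list; the difference lies only in how much machinery is set up before that answer is reached.

```java
import java.util.List;

// Hypothetical example class, names chosen only for illustration.
class EmptyCollectionSum {

    // Plain loop: with an empty list the loop body never runs.
    static int sumLengthsLoop(List<String> words) {
        int total = 0;
        for (String word : words) {
            if (!word.isEmpty()) {
                total += word.length();
            }
        }
        return total;
    }

    // Stream version: the source stream and the filter/mapToInt stages
    // are created before the terminal operation finds nothing to process.
    static int sumLengthsStream(List<String> words) {
        return words.stream()
                .filter(word -> !word.isEmpty())
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> empty = List.of();
        System.out.println(sumLengthsLoop(empty));
        System.out.println(sumLengthsStream(empty));
    }
}
```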
When communicating with Brian Goetz last week, I mentioned this to him
and he suggested that perhaps we could have the stream() method inside
Collection check whether it is empty and, if so, return a specialized
class EmptyStream that returns "this" for methods such as filter() and
map(). I spent a bit of time trying to write such a class, together with
EmptyIntStream, EmptyLongStream and EmptyDoubleStream. I've also written
a set of tests that compare our Empty[Int|Long|Double]Streams to what
would be returned by [Int|Long|Double]Stream.empty(), as well as a
little benchmark to demonstrate their effectiveness.
You can see what I've done here:
https://github.com/openjdk/jdk/pull/6275
(I think I was premature in issuing the PR)
However, I have hit a brick wall with the way that the streams are
currently being tested in the JDK. First off, there are several tests
that make assumptions about how Stream is implemented and down-cast it
to an AbstractPipeline. Since our EmptyStream is not an
AbstractPipeline, the tests fail.
Secondly, with a normal stream, intermediate operations such as filter()
and map() can only be called once on a given stream object. They return
a new stream and we have to continue working with that one; in the
current implementation, reusing the original throws an
IllegalStateException. With my EmptyStream, since filter() and map()
return "this", we would not get an exception if we continued using it.
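The single-use behavior can be observed with a small sketch like the following. The class and method names are hypothetical. Note that the Stream Javadoc only says an implementation "may" throw IllegalStateException on reuse, so the result reflects how the standard JDK pipeline behaves rather than a guaranteed contract.

```java
import java.util.stream.Stream;

// Hypothetical demo class, not part of the PR.
class StreamReuseDemo {

    // Returns true if linking a second operation to the same stream
    // object is rejected with an IllegalStateException.
    static boolean secondFilterThrows(Stream<String> stream) {
        stream.filter(s -> !s.isEmpty());      // first use links the stream
        try {
            stream.filter(s -> !s.isEmpty());  // second use of the same object
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // With the standard pipeline implementation both calls are
        // expected to report true, even for the empty stream.
        System.out.println(secondFilterThrows(Stream.of("a", "b")));
        System.out.println(secondFilterThrows(Stream.empty()));
    }
}
```

A filter() that returns "this" would make secondFilterThrows report false, which is exactly the kind of difference the existing tests notice.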
Thirdly, with a normal stream, the method parallel() changes the state
of the current stream, but then returns "this". In order to keep the
EmptyStream consistent with the current Stream.empty() behavior, I
return StreamSupport.stream(Spliterators.emptySpliterator(), true) from
the parallel() method. Thus EmptyStream does the opposite of what the
current implementation happens to do: parallel() returns a new object,
while filter() and map() return "this". The Javadocs say that the
parallel() method "may return itself", but it does not have to, whereas
the Javadoc for filter() seems to suggest that it returns a new stream
object, but it also does not prescribe that it absolutely has to.
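The identity behavior of the current Stream.empty() can be checked with a short sketch such as this one. The class and method names are hypothetical, and the results describe what the OpenJDK pipeline implementation happens to do, not something the Javadoc requires.

```java
import java.util.stream.Stream;

// Hypothetical demo class, not part of the PR.
class StreamIdentityDemo {

    // True if parallel() hands back the very same stream object.
    static boolean parallelReturnsSameObject() {
        Stream<String> stream = Stream.empty();
        return stream.parallel() == stream;
    }

    // True if filter() hands back the very same stream object.
    static boolean filterReturnsSameObject() {
        Stream<String> stream = Stream.empty();
        return stream.filter(s -> true) == stream;
    }

    // True if calling parallel() flipped the state of the original object.
    static boolean parallelChangesOriginal() {
        Stream<String> stream = Stream.empty();
        stream.parallel();
        return stream.isParallel();
    }

    public static void main(String[] args) {
        System.out.println("parallel() == this: " + parallelReturnsSameObject());
        System.out.println("filter()   == this: " + filterReturnsSameObject());
        System.out.println("original is parallel: " + parallelChangesOriginal());
    }
}
```

On the OpenJDK implementation I would expect the first and third checks to be true and the second to be false, which is the reverse of what the EmptyStream described above does for parallel() and filter().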
How important is the white-box testing with the streams? And could we
perhaps make special cases for empty streams?
Regards
Heinz
--
Dr Heinz M. Kabutz (PhD CompSci)
Author of "The Java™ Specialists' Newsletter" - www.javaspecialists.eu
Java Champion - www.javachampions.org
JavaOne Rock Star Speaker
Tel: +30 69 75 595 262
Skype: kabutz