Hi. Kevin Bourrillion pointed me at the JDK-8148917 thread, since he and I
have discussed one-shot Iterables a number of times.

>From my reading of the discussion so far, it sounds like the expectation is
that "Stream implements IterableOnce" will be a net win for foreach but
possibly a net loss for Iterable-accepting methods. Since Google's codebase
(including Guava and including Truth, our equivalent to AssertJ) contains a
lot of Iterable-accepting methods, I'll offer some anecdotes (and maybe a
smidge of data).

First, I picked a random sample of 40 Iterable-accepting methods from
Google's codebase. I found that 5 of them iterate the Iterable multiple
times. The reasons I saw were:
- performing an emptiness check before doing more expensive work (could be
done with a single iterator and hasNext() -- unless the other work requires
passing in an Iterable)
- retrieving the first element to extract some fields that are expected to
match for all inputs (ditto)
- iterating more than once because it was natural to (I didn't investigate
enough to see if it was possible to avoid, except of course by copying the
full input)

I have mixed feelings about these methods. On the one hand,
project-specific methods can generally get away with multiple iteration
just fine. But ideally, an implementation would iterate only once. That
way, it can accept, say, a CopyOnWriteArrayList and operate on a consistent
view of it. Additionally, repeat iteration may perform worse. For these
reasons, Guava and Truth generally try to iterate an input only once.

Well, those reasons, plus the fact that we do occasionally encounter
one-shot Iterables. We discourage them, but when our methods are used so
often (e.g., ImmutableList.copyOf(Iterable)), we end up needing to
accommodate them. It does look like we have at least one lesser-used Guava
method that iterates multiple times, presumably because no one has called
it with a one-shot Iterable. We'll certainly change it if Stream becomes an
Iterable:
https://github.com/google/guava/blob/74fc49faf283f106302794f7af82c7ab1fcb5412/guava/src/com/google/common/collect/Range.java#L452

Our biggest single concern will be in Truth. It's the concern that Stuart
called out in his original proposal:
assertThat(stream).containsExactly(...). We actually already avoid
iterating the inputs multiple times (again because of occasional problems)
but currently only in *some* of our methods. We will have to extend that
support to the other methods -- which is doable, either by cleverness or by
brute-force copying. I will have to look into the performance of
brute-force copying. And I guess we'll have to hope that no one passes an
infinite stream :)

I do suspect that brute-force copying may be necessary. That is, I don't
think that `instanceof IterableOnce` or even Peter Levart's trick (
http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-March/059060.html)
is enough. That's primarily because of "view Iterables," especially our
Iterables.concat(Iterable<Iterable<T>>) (
https://google.github.io/guava/releases/snapshot-jre/api/docs/com/google/common/collect/Iterables.html#concat-java.lang.Iterable-).
That method (and its callers) can't tell whether it's produced a one-shot
Iterable until its caller has fully iterated its inputs. (In principle, it
doesn't necessarily even know then, since the inputs could change (though
of course that's a whole new can of worms).) I don't know if this will be a
problem in practice, but I'd at least expect some one-shot Iterables that
fail to impement IterableOnce. Nevertheless, IterableOnce is probably
worthwhile as *documentation* (and for some static analysis). I'm just
unsure whether we'll rely on it at *runtime*.

Another part of the story is that Truth has two parts to its operation:

1. test whether the Iterable contains the expected elements
2. report a failure

Above, I've focused mostly on #1. For #2, we historically have been willing
to iterate over the inputs a second time. That's worked tolerably. That's
because the occasional one-shot Iterables we've encountered typically
"support" multiple iterator() calls (by making later calls return the same
iterator instance) or because they return a toString() that doesn't try to
access the contents -- or because those tests just haven't failed yet! All
of those scenarios are likely to result in a subpar failure message for
users. (And in fact we have a special assertThat(Stream) method, partly
because Stream doesn't currently implement Iterable now but also partly so
that we can eagerly collect the values to a list for multiple iteration,
including in failure messages.)

This hints at one potential class of edge cases: methods that iterate twice
*sometimes*. I could imagine seeing this with logging or error messages.
However, my guess is that most callers will log the default
Stream.toString(), so there will be no issue -- other than that the user
may have expected to get "real" data, since the average Iterable is usually
a Collection with an informative toString(). There might be other odd
cases, like a method that accepts two Iterables and doesn't iterate the one
if the other is empty. And of course there might be breakages at a distance
when a method switches from returning a multiple-use Iterable to a one-shot
Iterable. I don't think the world will catch fire, but hopefully this adds
to the overall picture.

Another tiny point: We also find ourselves in the business of maintaining
compatibility with older versions of Java for a while. I would expect for
"passing a Stream to a method that accepts an Iterable" to be an
incompatible change that's not caught by current enforcement tools (like
Animal Sniffer), since Stream isn't a new type and the Iterable-accepting
method isn't a new method. (I haven't verified this.) This could be fixed
(and it's potentially a problem anywhere that you introduce a new
supertype), but I would expect sporadic problems at scale.

The other information that I *wish* I had is the other side of the picture:
I can offer anecdotes about Iterable-accepting methods, but I have few
anecdotes about foreach over Streams. I do have some, as we have an
internal Stream->Iterable adapter. And in fact it's now enforced by static
analysis to be used only with foreach, since we found that nearly all the
first batch of usages were for Iterable-accepting methods, rather than for
foreach! (Interestingly, that was the case even though those users could
use plain stream::iterator.) Again, Google may be unusual in having so many
Iterable-accepting methods -- though perhaps "Stream implements Iterable"
will encourage that pattern.

I did look through some of those users. As you'd expect, many *could* be
rewritten to avoid Stream entirely. (Some arguably *should*, even setting
foreach aside, as they're using Stream in ways I've seen discouraged, like
for streaming RPCs.) But this doesn't get at the heart of the question: How
often do people really need foreach over a Stream, *and* how often are
people annoyed to have to switch from the Stream model to the foreach model
when they *could* but shouldn't really have to? (Certainly we hear
complaints, as you do.) Good data could help weigh the benefits against the
costs. But all the data I have is "Google was using a Stream->Iterable
adapter more for Iterable-accepting methods than for foreach," and I don't
know how representative we are, nor do I know how many people gave up and
abandoned Stream without finding the adapter.

To recap: It would be nice to be able to compare the benefits to foreach
with the costs to Iterable-accepting methods. I have more to say about
Iterable-accepting methods: Some will definitely fail when passed a Stream;
others will *sometimes* fail when passed a Stream. These problems are
fixable but might require brute-force eager copying. "Stream implements
Iterable" might encourage more Iterable-accepting methods, and there's some
chance of compatibility concerns or unexpectedly bad toString(). None of
these concerns are fatal, but I haven't seen much discussion of them other
than in Stuart's mention of AssertJ, so I wanted to weigh in. Sorry for the
length.

Reply via email to