On 12 Mar 2015, at 11:25, Paul Sandoz <paul.san...@oracle.com> wrote:
>
> On Mar 12, 2015, at 12:05 PM, Chris Hegarty <chris.hega...@oracle.com> wrote:
>
>>
>> On 12 Mar 2015, at 09:44, Paul Sandoz <paul.san...@oracle.com> wrote:
>>
>>>
>>> On Mar 11, 2015, at 1:45 PM, Aggelos Biboudis <bibou...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please review the patch for the count terminal operator on SIZED streams.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8067969
>>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8067969-optimize-stream-count/webrev/
>>>>
>>>> Thanks Paul Sandoz for sponsoring this.
>>>>
>>>
>>> This looks good. Code is nicely contained and not as much as I initially
>>> anticipated.
>>
>> This does indeed look nice.
>>
>> One trivial question: why call spliterator.getExactSizeIfKnown() and not
>> spliterator.estimateSize()?
>
> Because the latter might return an estimate and not the exact size. For
> example, say we revamp the Files.lines for UTF-8 and optimize it, the root
> spliterator could report the size of the file as an estimate of the number of
> lines but it would not know the exact number of lines.

OK, got it.

>>> I am pondering adding an api note to the count methods to head off any
>>> surprises as now the stream pipeline may not be executed.
>>
>> I think it would be good to add a note to the spec, as this could be
>> surprising.
>>
>> So really this comes down to the type of intermediate operations, right?
>
> And what optimizations the implementation can do.
>
>
>> For example, filter will always be executed:
>>
>> IntStream.of(1, 2, 3, 4).peek(System.out::println).filter(x -> true).count();
>>
>
> Yes.
>
>
>> Should the note capture something about the type of the intermediate
>> operations?
>>
>
> How about:
>
>      * @apiNote
>      * An implementation may choose to not execute the stream pipeline (either
>      * sequentially or in parallel) if it is capable of computing the count
>      * directly from the stream source. In such cases no source elements will
>      * be traversed and no intermediate operations will be evaluated.
>      * Behavioral parameters with side-effects, which are strongly discouraged
>      * except for harmless cases such as debugging, may be affected. For
>      * example, consider the following stream:
>      * <pre>{@code
>      *     List<String> l = ...
>      *     long count = l.stream().peek(System.out::println).count();
>      * }</pre>
>      * The number of elements covered by the stream source, a {@code List}, is
>      * known and the intermediate operation, {@code peek}, does not inject into
>      * or remove elements from the stream (as may be the case for
>      * {@code flatMap} or {@code filter} operations). Thus the count is the
>      * size of the {@code List} and there is no need to execute the pipeline
>      * and, as a side-effect, print out the list elements.

Looks good to me.

> I want to tread lightly here and focus on operations that might legitimately
> be used for harmless side-effects.

Makes sense.

-Chris.

> Paul.
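
For anyone following the thread, the two points discussed above (exact size versus estimate, and the pipeline possibly not running) can be tried out with a small sketch. The class name CountElisionSketch and the helper countWithPeek are made up for illustration and are not part of the webrev. Whether the peek action runs in the unfiltered case depends on the JDK in use, so the code reports what it observes rather than assuming a result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class; not taken from the patch under review.
class CountElisionSketch {

    /**
     * Counts the stream of the given list with a side-effecting peek.
     * Returns { count, number of times the peek action ran }.
     */
    static long[] countWithPeek(List<String> source, boolean addFilter) {
        AtomicInteger peeked = new AtomicInteger();
        Stream<String> s = source.stream().peek(e -> peeked.incrementAndGet());
        if (addFilter) {
            // A filter may remove elements, so the size is no longer known
            // and the pipeline has to be traversed to compute the count.
            s = s.filter(e -> true);
        }
        long count = s.count();
        return new long[] { count, peeked.get() };
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList("a", "b", "c");

        // A List-backed spliterator is SIZED, so the exact size is known.
        Spliterator<String> sized = l.spliterator();
        System.out.println("list exact size:     " + sized.getExactSizeIfKnown());

        // After a filter the exact size is unknown (-1); estimateSize()
        // still returns a value, but it is only an estimate.
        Spliterator<String> filtered = l.stream().filter(e -> true).spliterator();
        System.out.println("filtered exact size: " + filtered.getExactSizeIfKnown());
        System.out.println("filtered estimate:   " + filtered.estimateSize());

        long[] direct = countWithPeek(l, false);
        System.out.println("peek only:     count=" + direct[0] + ", peek ran " + direct[1] + " times");

        long[] withFilter = countWithPeek(l, true);
        System.out.println("peek + filter: count=" + withFilter[0] + ", peek ran " + withFilter[1] + " times");
    }
}
```

With an implementation that computes the count directly from a SIZED source, the "peek only" line would be expected to report that the action never ran, while the filtered pipeline is always traversed. The count itself is the same either way, which is why the proposed note focuses only on side-effects.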