Hello,

Please take an initial look over a fix for

JDK-8006572 DoubleStream.sum() & DoubleSummaryStats implementations that reduce numerical errors
    http://cr.openjdk.java.net/~darcy/8006572.0/

The basic approach is to use compensated summation

    http://en.wikipedia.org/wiki/Kahan_summation_algorithm

to compute streams-related sum and average statistics in the various locations where this can be done.
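
For readers unfamiliar with the technique, here is a minimal sketch of compensated (Kahan) summation next to a naive loop. It is an illustration only, not the code in the webrev; the class and method names (KahanSumSketch, naiveSum, compensatedSum) are hypothetical.

```java
class KahanSumSketch {

    /** Naive left-to-right summation, shown only for comparison. */
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    /** Compensated (Kahan) summation over an array. */
    static double compensatedSum(double[] values) {
        double sum = 0.0;
        double c = 0.0; // running error term: how far sum has overshot the true total
        for (double v : values) {
            double y = v - c;   // correct the next input by the error accumulated so far
            double t = sum + y; // low-order bits of y may be rounded away here
            c = (t - sum) - y;  // what was actually added minus what should have been added
            sum = t;
        }
        return sum;
    }
}
```

As a small check, summing 1.0 followed by ten copies of Math.ulp(1.0) / 2 leaves the naive result at 1.0, because each addend is rounded away, while the compensated loop returns 1.0 + 5 * Math.ulp(1.0).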

All existing streams tests pass, and the newly written test passes too.

I believe the DoubleSummaryStatistics.java portion, including the test, is fully review-worthy. In the test, for the sample computation in question, the naive summation implementation had an error of 500,000 ulps, compared to 2 ulps with the new implementation.

Two other locations I've found where this summation technique should be used are in

    java.util.stream.Collectors.{summingDouble, averagingDouble}

and

    java.util.stream.DoublePipeline.{sum, average}

DoublePipeline is the primary implementation class of DoubleStream.

For Collectors, the proposed code is a fairly clear adaptation of how the current code passes state around; there is not currently a dedicated test for the new summation technique in this location.
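
To make the state-passing concrete, one possible shape for a compensated summing collector is sketched below. This is an assumption about how such a collector could look, not the code in the webrev: the class name, the summingDouble helper here, and the two-element double[] layout are all hypothetical.

```java
import java.util.function.ToDoubleFunction;
import java.util.stream.Collector;

class CompensatedCollectorSketch {

    // state[0] = running sum, state[1] = running error term (overshoot of state[0])
    static void add(double[] state, double value) {
        double y = value - state[1];
        double t = state[0] + y;
        state[1] = (t - state[0]) - y;
        state[0] = t;
    }

    /** Hypothetical stand-in for a compensated Collectors.summingDouble. */
    static <T> Collector<T, double[], Double> summingDouble(ToDoubleFunction<? super T> mapper) {
        return Collector.of(
            () -> new double[2],                                     // fresh {sum, error} pair
            (state, item) -> add(state, mapper.applyAsDouble(item)), // fold in one element
            (left, right) -> {                                       // merge two partial results
                add(left, right[0]);
                add(left, -right[1]); // right's sum overshoots by right[1], so subtract it
                return left;
            },
            state -> state[0]);                                      // report the running sum
    }
}
```

An averaging variant would presumably carry a count in a third array slot and divide in the finisher.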

I'm new to the streams API, so for DoublePipeline I don't know the idiomatic way to phrase the collect I want to perform over the code. (Based on my current understanding, I believe I want to perform a collect rather than a reduce, since the compensated summation needs to maintain some additional state.) Guidance here is welcome.
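
As a starting point for discussion, the collect in question might be phrased with the three-argument DoubleStream.collect(supplier, accumulator, combiner), using a small double[] as the mutable state. This is only a sketch under that assumption, written against the public DoubleStream API rather than DoublePipeline internals; the class name and the array layout are hypothetical.

```java
import java.util.stream.DoubleStream;

class CompensatedStreamSumSketch {

    // state[0] = running sum, state[1] = running error term (overshoot of state[0])
    static void add(double[] state, double value) {
        double y = value - state[1];
        double t = state[0] + y;
        state[1] = (t - state[0]) - y;
        state[0] = t;
    }

    /** Sums a DoubleStream while carrying the compensation term alongside the sum. */
    static double sum(DoubleStream stream) {
        double[] result = stream.collect(
            () -> new double[2],             // supplier: fresh {sum, error} pair
            CompensatedStreamSumSketch::add, // accumulator: fold in one double
            (left, right) -> {               // combiner: merge partial results (parallel case)
                add(left, right[0]);
                add(left, -right[1]);
            });
        return result[0];
    }
}
```

A plain reduce over doubles has nowhere to keep the error term between elements, which is why the mutable container passed through collect seems the better fit here.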

Thanks,

-Joe
