DoubleSummaryStatistics

Peter Levart Wed, 12 Apr 2017 07:44:31 -0700

Hi Dennis,

On 04/11/2017 10:48 PM, Chris Dennis wrote:

Color me confused… what would the javadoc on the parameter say? It could I 
guess have an @implNote documenting the meanings of the parameters… but then 
what use is it? The proposed API simply limits the precision with which a 
DoubleSummaryStatistic can be copied to be the same as the precision with which 
it can be “accessed”.  That doesn’t seem an enormous problem since I suspect 
that bulk of usages would be to marshall a “finished” instance and therefore 
there is no real loss occuring. If we wanted to support arbitrary precision 
wouldn’t it be better to have a constructor variant that took a BigDecimal, and 
a matching getPreciseSum() that returned a BigDecimal?

And how would you compute the value for BigDecimal getPreciseSum() sothat you could then set 3 internal double fields from BigDecimal withoutloss of information?

I can imagine a "distributed" implementation of Stream that musttransport partial results across VM boundaries and then combine them. Itwould be bad if the results of such summation differed from defaultlocal Stream implementation. If we don't want to makeXxxSummaryStatistics objects Serializable themselves, then perhaps weneed some other API to support transporting the state over the wire withevolvability in mind. What about something like the following:

public DoubleSummaryStatistics(DataInput dataInput) throwsIOException {

        int version = dataInput.readByte();
        if (version == 1) {
            long count = dataInput.readLong();
            double sum = dataInput.readDouble();
            double sumCompensation = dataInput.readDouble();
            double simpleSum = dataInput.readDouble();
            double min = dataInput.readDouble();
            double max = dataInput.readDouble();
            // validation of arguments ...

            // ... assignment to fields
            this.count = count;
            this.sum = sum;
            this.sumCompensation = sumCompensation;
            this.simpleSum = simpleSum;
            this.min = min;
            this.max = max;
        } else {
            throw new IllegalArgumentException(
                "Invalid version in DataInput stream: " + version);
        }
    }

    public void writeTo(DataOutput dataOutput) throws IOException {
        dataOutput.writeByte(1);
        dataOutput.writeLong(count);
        dataOutput.writeDouble(sum);
        dataOutput.writeDouble(sumCompensation);
        dataOutput.writeDouble(simpleSum);
        dataOutput.writeDouble(min);
        dataOutput.writeDouble(max);
    }

Both members could be made protected so that a Serializable subclasscould be devised and they would not show up as public API.


Regards, Peter


Chris

On Apr 11, 2017, at 4:16 PM, joe darcy <[email protected]> wrote:

On an API design note, I would prefer to DoubleSummaryStatistics took a 
double... argument to pass in the state of the summation. This flexibility is 
necessary to more fully preserve the computed sum. Also, the we want 
flexibility to change the amount of internal state DoubleSummaryStats keeps so 
we don't want to hard-code that into as aspect of the API.

Thanks,

-Joe


On 4/11/2017 12:53 PM, Paul Sandoz wrote:

Hi Chris,

Thanks for looking at this.

There is some rudimentary testing using streams in 
CollectAndSummaryStatisticsTest.java.

I think we should enforce constraints where we reliably can:

1) count >= 0

2) count = 0, then min/max/sum are ignored and it’s as if the default 
constructor was called. (I thought it might be appropriate to reject since a 
caller might be passing in incorrect information in error, but the defaults for 
min/max are important and would be a burden for the caller to pass those 
values.) In this respect having count as the first parameter of the constructor 
would be useful.

3)  min <= max

Since for count > 0 the constraints, count * max >= sum, count * min <= sum, 
cannot be reliably enforced due to overflow i am inclined to not bother and just 
document.


Note this is gonna be blocked from pushing until the new Compatibility and 
Specification Review (CSR) process is opened up, which i understand is 
“soon"ish :-) but that should not block adding some further tests in the 
interim and tidying up the javadoc.

Paul.

On 6 Apr 2017, at 07:08, Chris Dennis <[email protected]> wrote:

Hi Paul (et al)

Like all things API there are wrinkles here when it comes to implementing.

This patch isn’t final, there appears to be no existing test coverage for these 
classes beyond testing the compensating summation used in the double 
implementation, and I left off adding any until it was decided how much 
parameter validation we want (since that’s the only testing that can really 
occur here).

The wrinkles relate to the issues around instances that have suffered overflow 
in count and/or sum which the existing implementation doesn’t defend against:

* I chose to ignore all parameters if 'count == 0’ and just leave the instance 
empty. I held off from validating min, max and count however. This obviously 
'breaks things’ (beyond how broken they would already be) if count == 0 only 
due to overflow.
* I chose to fail if count is negative.
* I chose to enforce that max and min are logically consistent with count and 
sum, this can break the moment either one of the overflows as well (I’m also 
pretty sure that the FPU logic is going to suffer here too - there might be 
some magic I can do with a pessimistic bit of rounding on the FPU 
multiplication result).

I intentionally went with the most aggressive parameter validation here to 
establish one end of what could be implemented.  There are arguments for this 
and also arguments for the other extreme (no validation at all).  Personally 
I’m in favor of the current version where the creation will fail if the inputs 
are only possible through overflow (modulo fixing the FPU precision issues 
above) even though it’s at odds with approach of the existing implementation.

Would appreciate thoughts/feedback.  Thanks.

Chris


P.S. I presume tests would be required for the parameter checking (once it is 
finalized)?

<8178117.patch>

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Reply via email to