On Jan 21, 2014, at 10:53, David Rowley <dgrowle...@gmail.com> wrote:
> It came to me that it might be possible to implement inverse transitions
> for floating point aggregates by just detecting if precision has been
> lost during forward transitions.
>
> I've written the test to do this as:
>
> IF state.value + value = state.value AND value <> 0 THEN
>   newstate.precision_lost := true;
>   newstate.value := state.value;
> ELSE
>   newstate.precision_lost := false;
>   newstate.value := state.value + value;
> END IF;
>
> The inverse transition function checks the precision_lost and if it's true it
> returns NULL. The core code is now implemented (thanks to Florian) to
> re-aggregate when NULL is returned from the inverse transition function.

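For readers who want to experiment with the quoted proposal outside the database, here is a minimal sketch of it in Python. It assumes that Python's float behaves like float8 (both are IEEE 754 doubles on common platforms). The names State, forward and inverse are made up for illustration and are not PostgreSQL APIs; None stands in for the SQL NULL that triggers re-aggregation.

```python
from typing import NamedTuple, Optional


class State(NamedTuple):
    """Hypothetical transition state: running sum plus the proposed flag."""
    value: float
    precision_lost: bool


def forward(state: State, value: float) -> State:
    """Forward transition, mirroring the quoted pseudocode branch for branch."""
    if state.value + value == state.value and value != 0:
        # The addend vanished completely: flag it and keep the old sum.
        return State(state.value, True)
    # Note: as in the quoted pseudocode, this branch resets the flag.
    return State(state.value + value, False)


def inverse(state: State, value: float) -> Optional[State]:
    """Inverse transition: None (standing in for NULL) means 're-aggregate'."""
    if state.precision_lost:
        return None
    return State(state.value - value, False)


if __name__ == "__main__":
    s = forward(State(1.0, False), 2.0)
    print(s, inverse(s, 2.0))
    t = forward(State(1e16, False), 0.5)   # 0.5 is absorbed entirely by 1e16
    print(t, inverse(t, 0.5))
```

The sketch only detects the case where an addend disappears entirely; it says nothing about addends that are partially rounded away.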
That's not sufficient, I fear. You can lose all significant digits of the value
and still have precision_lost = false afterwards. Try summing over 1e16, 1.01.
"SELECT 1e16::float8 + 1.01::float8 = 1e16::float8" returns FALSE, yet
"SELECT 1e16::float8 + 1.01::float8 - 1e16::float8" returns "2" where "1.01"
would have been correct. That's still too much precision loss.

I'm quite certain that the general idea has merit, but the actual
invertibility condition is going to be more involved. If you want to play
with this, I think the first step has to be to find a set of guarantees that
SUM(float) is supposed to meet. Currently, SUM(float) guarantees that if the
summands all have the same sign, the error is bounded by C * N, where C is some
(machine-specific?) constant (about 1e-15 or so), and N is the number of input
rows. Or at least so I think from looking at SUMs over floats in descending
order, which I guess is the worst case. You could, for example, try to see if
you can find an invertibility condition which guarantees the same, but allows
C to be larger. That would put a bound on the number of digits lost by the new
SUM(float) compared to the old one.

I don't have high hopes for this getting into 9.4, though.

> If it seems sound enough, then I may implement it in C to see how much
> overhead it adds to forward aggregation for floating point types, but even
> if it did add too much overhead to forward aggregation it might be worth
> allowing aggregates to have 2 forward transition functions and if the 2nd
> one exists then it could be used in windowing functions where the frame
> does not have "unbounded following".

I don't think adding yet another type of aggregation function is the solution
here.

best regards,
Florian Pflug
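The 1e16, 1.01 counterexample from this message can also be reproduced without a database. The following is a minimal sketch, again assuming Python's float matches float8 (IEEE 754 double); the helper name check is hypothetical. It applies the proposed test to a single addition and then inverts that addition by subtraction.

```python
def check(big: float, small: float):
    """Return (flagged, recovered) for one forward step followed by its inverse.

    flagged   -- whether the proposed precision_lost test fires
    recovered -- what subtracting `big` from the sum gives back
    """
    total = big + small
    flagged = (total == big and small != 0)
    recovered = total - big
    return flagged, recovered


if __name__ == "__main__":
    flagged, recovered = check(1e16, 1.01)
    print(flagged)    # False: the test does not detect any loss
    print(recovered)  # 2.0, although 1.01 was added
```

Doubles near 1e16 are spaced 2 apart, so 1e16 + 1.01 rounds to 1e16 + 2. The sum differs from 1e16, the flag stays false, and the inverse step yields 2.0 rather than 1.01.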