Re: [HACKERS] [BUG] Denormal float values break backup/restore

Jeroen Vermeulen Mon, 20 Jun 2011 13:57:59 -0700

On 2011-06-20 19:22, Marti Raudsepp wrote:

>> AIUI that is defined to be a little vague, but includes denormalized numbers
>> that would undergo any rounding at all.  It says that on overflow the
>> conversion should return the appropriate HUGE_VAL variant, and set ERANGE.
>> On underflow it returns a reasonably appropriate value (and either may or
>> must set ERANGE, which is the part that isn't clear to me).
>
> Which standard is that? Does IEEE 754 itself define strtod() or is
> there another relevant standard?


Urr.  No, this is C and/or Unix standards, not IEEE 754.

I did some more research into this. The postgres docs do specify the range error, but seemingly based on a different interpretation of underflow than what I found in some of the instances of strtod() documentation:


    Numbers too close to zero that are not representable as
    distinct from zero will cause an underflow error.

This talks about denormals that get _all_ their significant digits rounded away, but some of the documents I saw specify an underflow for denormals that get _any_ of their significant digits rounded away (and thus have an abnormally high relative rounding error).

The latter would happen for any number that is small enough to be denormal, and is also not representable (note: that's not the same thing as "not representable as distinct from zero"!). It's easy to get non-representable numbers when dumping binary floats in a decimal format. For instance 0.1 is not representable, nor are 0.2, 0.01, and so on. The inherent rounding of non-representable values produces weirdness like 0.1 + 0.2 - 0.3 != 0.
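
To make the rounding concrete, here is a minimal C sketch, assuming IEEE 754 doubles. The helper names decimal_residue() and is_denormal() are made up for illustration and come from neither PostgreSQL nor any standard.

```c
#include <float.h>

/* Residue left over because 0.1, 0.2 and 0.3 have no exact binary
 * representation.  The volatile qualifiers force every intermediate
 * result to be stored as a plain double. */
double decimal_residue(void)
{
    volatile double a = 0.1, b = 0.2, c = 0.3;
    volatile double sum = a + b;
    return sum - c;
}

/* Nonzero if x is a denormal: nonzero, but smaller in magnitude than
 * DBL_MIN, the smallest normalized double. */
int is_denormal(double x)
{
    double mag = (x < 0.0) ? -x : x;
    return mag != 0.0 && mag < DBL_MIN;
}
```

With these, decimal_residue() comes out nonzero, and is_denormal(1e-310) is true while is_denormal(1e-300) is not.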

I made a quick round of the strtod() specifications I could find, and they seem to disagree wildly:


Source                          ERANGE when             Return what
---------------------------------------------------------------------
PostgreSQL docs                 All digits lost         zero
Linux programmer's manual       All digits lost         zero
My GNU/Linux strtod()           Any digits lost         rounded number
SunOS 5                         Any digits lost         rounded number
GNU documentation               All digits lost         zero
IEEE 1003.1 (Open Group 2004)   Any digits lost         denormal
JTC1/SC22/WG14 N794             Any digits lost         denormal
Sun Studio (C99)                Implementation-defined  ?
ISO/IEC 9899:TC2                Implementation-defined  denormal
C99 Draft N869 (1999)           Implementation-defined  denormal

We can't guarantee very much, then. It looks like C99 disagrees with the postgres interpretation, but also leaves a lot up to the compiler.
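
A quick way to see which row of that table a given platform falls into is to probe its strtod() directly. This is only a sketch: probe_strtod() and struct strtod_probe are hypothetical names, and the result depends on the local C library.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* What one strtod() call produced: the value and the errno it left. */
struct strtod_probe {
    double value;
    int    err;     /* 0 if errno was left alone, otherwise e.g. ERANGE */
};

/* Convert text with strtod() and capture both outcomes.  errno must be
 * cleared first, because strtod() never resets it on success. */
struct strtod_probe probe_strtod(const char *text)
{
    struct strtod_probe r;

    errno = 0;
    r.value = strtod(text, NULL);
    r.err = errno;
    return r;
}
```

Printing the value and err fields for inputs such as "1e-310", "4.9e-324" and "1e-999" shows whether the local library reports ERANGE when any digits are lost or only when all of them are, and what it returns in each case.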


I've got a few ideas for solving this, but none of them are very good:

(a) Ignore underflow errors.

This could hurt anyone who relies on knowing their floating-point implementation and the underflow error to keep their rounding errors in check. It also leaves a kind of gap in the predictability of the database's floating-point behaviour.

Worst hit, or possibly the only real problem, would be algorithms that divide other numbers, small enough not to produce infinities, by rounded denormals.


(b) Dump REAL and DOUBLE PRECISION in hex.

With this change, the representation problem goes away and ERANGE would reliably mean "this was written in a precision that I can't reproduce." We could sensibly provide an option to ignore that error for cross-platform dump/restores.

This trick does raise a bunch of compatibility concerns: it's a new format of data to restore, it may not work on pre-C99 compilers, and so on. Also, output for human consumption would have to differ from pg_dump output.
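
For illustration, a hex round trip in C99 could look roughly like the sketch below. It assumes a C library that supports the "%a" conversion and hex input in strtod(), and binary floating point. The function names are invented here and this is not what pg_dump does.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write x as a C99 hexadecimal floating-point literal.  Without an
 * explicit precision, "%a" emits enough hex digits to represent a
 * binary double exactly.  Returns snprintf()'s result. */
int double_to_hex(double x, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "%a", x);
}

/* Nonzero if x survives a trip through its hex text form unchanged.
 * Only the value is compared; whether strtod() also sets ERANGE for a
 * denormal is deliberately not checked.  NaN never compares equal to
 * itself, so it would need separate handling. */
int hex_roundtrip_exact(double x)
{
    char buf[64];
    int n = double_to_hex(x, buf, sizeof buf);

    if (n < 0 || (size_t) n >= sizeof buf)
        return 0;               /* formatting failed or was truncated */
    return strtod(buf, NULL) == x;
}
```

Denormals such as 1e-310 should come back bit-for-bit this way, since no decimal rounding is involved on either side.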


(c) Have pg_dump produce calculations, not literals, for denormals.

Did I mention how these were not great ideas? If your database dump contains 1e-308, pg_dump could recognize that this value can be calculated in the database but possibly not entered directly, and dump e.g. "1e-307::float / 10" instead.
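
The same trick expressed in C, as a rough sketch assuming IEEE 754 doubles: parse a literal that is still in the normal range, then divide. parse_scaled() is a hypothetical helper. Note that the quotient is rounded a second time, so it is not guaranteed to be bit-identical to the original denormal.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse normal_text, which is expected to be in the normal range, and
 * divide the result by divisor to reach a denormal value.
 * *range_error is set to 1 if strtod() reported ERANGE for the
 * literal itself, and to 0 otherwise. */
double parse_scaled(const char *normal_text, double divisor,
                    int *range_error)
{
    double v;

    errno = 0;
    v = strtod(normal_text, NULL);
    *range_error = (errno == ERANGE);
    return v / divisor;
}
```

Calling parse_scaled("1e-307", 10.0, &err) yields a denormal close to 1e-308 without strtod() ever seeing a denormal input.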


(d) Make pg_dump set some "ignore underflows" option.

This may make dumps unusable for older postgres versions. Moreover, it doesn't help ORMs and applications that are currently unable to store the "problem numbers."


(e) Do what the documentation promises.

Actually I have no idea how we could guarantee this.

(f) Ignore ERANGE unless strtod() returns ±0 or ±HUGE_VAL.

This is probably a reasonable stab at common sense. It does have the nasty property that it doesn't give a full guarantee either way: restores could still break on pre-C99 systems that return 0 on underflow, but C99 doesn't guarantee a particularly accurate denormal. In practice though, implementations seem to do their best to give you the most appropriate rounded number.
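
Rule (f) is small enough to sketch in a few lines of C. This is an illustration only, not PostgreSQL's actual input routine: parse_double_lenient() is a made-up name, and trailing-garbage checks are left out.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* Parse text into *out, treating ERANGE as an error only when
 * strtod() returned +/-0 or +/-HUGE_VAL.  A nonzero denormal that
 * comes back with ERANGE set is accepted as a rounded but usable
 * value.  Returns 0 on success, -1 on a syntax or range error. */
int parse_double_lenient(const char *text, double *out)
{
    char *end;
    double v;

    errno = 0;
    v = strtod(text, &end);
    if (end == text)
        return -1;              /* nothing was parsed */
    if (errno == ERANGE &&
        (v == 0.0 || v == HUGE_VAL || v == -HUGE_VAL))
        return -1;              /* real overflow, or underflow to zero */
    *out = v;
    return 0;
}
```

The comparison v == 0.0 also matches negative zero, so both signs are covered. On a library that returns 0 for every underflow, inputs like 1e-310 would still be rejected, which is the pre-C99 caveat mentioned above.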



Jeroen

