On Sun, Aug 25, 2013 at 11:34 AM, Florian Weimer <[email protected]> wrote:

> I suppose the routines you look at can only correctly deal with a
> restricted set of inputs (probably those generated by the
> corresponding binary-to-decimal conversion).  But they fail on
> arbitrary precision decimal input.
>

Incorrect.

Florian: stop looking at examples and start looking at the numerical
analysis of the actual conversion. This is a precise conversion. It's
expected that they fail on arbitrary-precision decimal input. The decimal
input is arbitrary precision; the floating point value is not.
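
To make that concrete (just an illustration, not part of the argument): a
binary64 double carries exactly 53 significand bits, so every double in
[1/2, 1) is some integer k over 2^53, while the decimal strings in your
examples carry 54 or more significant digits. In CLISP:

(float-digits 1.0d0)                  ; significand bits in a double-float
  --> 53
(integer-length 9007199254740989)     ; numerator of your D below: 53 bits
  --> 53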


> Here's why I think my counterexample is correct.
>
> The two floating point values have the IEEE representation
> 3FEFFFFFFFFFFFFD and 3FEFFFFFFFFFFFFE.
>

OK. Let's work the example. Please start by telling me which floating point
values you are using, and which representation you are using when you give
me these hex values. Different hardware places the mantissa and sign in
different positions, so what I'm asking is which bits of your hex
representation are the exponent, sign, and mantissa.
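
For reference, if you mean the standard IEEE 754 binary64 layout (sign in
bit 63, biased exponent in bits 62-52, fraction in bits 51-0), this is how
I would decode your first word in CLISP; please confirm that this is the
layout you intend:

(let* ((bits #x3FEFFFFFFFFFFFFD)
       (sign     (ldb (byte 1 63) bits))    ; 0
       (exponent (ldb (byte 11 52) bits))   ; #x3FE = 1022
       (fraction (ldb (byte 52 0) bits)))   ; #xFFFFFFFFFFFFD
  ;; normal numbers only:
  ;; value = (-1)^sign * 2^(exponent - 1023) * (1 + fraction/2^52)
  (* (if (zerop sign) 1 -1)
     (expt 2 (- exponent 1023))
     (+ 1 (/ fraction (expt 2 52)))))
  --> 9007199254740989/9007199254740992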

Also, could you please stop complicating the exponents? Mixing bases and
putting arithmetic in the exponents just obfuscates things. You are
skipping lots and lots of steps, and we need to do this one step at a
time.

Computations using rational numbers are irrelevant here: you have extra
precision in the individual numbers, and that alters the results of the
multiplications.

I don't have time to re-derive the formal proof of correctness for
precision. It's been done several times, and it's published in the
literature. The bottom line is that if you require K bits of precision and
you have K+3 bits of precision (which my method *does* have), you have more
than enough bits to round correctly, and that's all you need.
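
To illustrate just the rounding step (a sketch of the idea, not the
published proof; the helper name is mine): with a 53-bit target and a
56-bit approximation of the significand, the three extra bits decide the
direction under round-to-nearest-even.

(defun round-56-to-53 (n)
  ;; n is a 56-bit integer significand.  Keep the top 53 bits and let the
  ;; low 3 bits pick the rounding direction.  This assumes nothing nonzero
  ;; was discarded below these 3 bits; otherwise the lowest bit has to be
  ;; a sticky OR of everything below it.
  (let ((kept  (ash n -3))
        (extra (ldb (byte 3 0) n)))
    (cond ((< extra 4) kept)                       ; below halfway: round down
          ((> extra 4) (+ kept 1))                 ; above halfway: round up
          (t (if (evenp kept) kept (+ kept 1)))))) ; exact halfway: to even

(round-56-to-53 #xFFFFFFFFFFFFFB)
  --> 9007199254740991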


Jonathan


>
> 3FEFFFFFFFFFFFFD is the rational number 2**(0x3fe - 1023) *
> 0x1ffffffffffffd * 2**(-52).  I'm using CLISP's rational
> arithmetic (which is exact):
>
> (setq D
>   (* (expt 2 (- #x3fe 1023)) #x1ffffffffffffd (expt 2 -52)))
>
> CLISP evaluates this to 9007199254740989/9007199254740992.
>
> Similarly, 3FEFFFFFFFFFFFFE is this rational number:
>
> (setq E
>   (* (expt 2 (- #x3fe 1023)) #x1ffffffffffffe (expt 2 -52)))
>   --> 4503599627370495/4503599627370496
>
> 0.999999999999999722444243843710864894092082977294921875
> is the rational number
>
> 999999999999999722444243843710864894092082977294921875/1000000000000000000000000000000000000000000000000000000:
>
> (setq IN875
>
> 999999999999999722444243843710864894092082977294921875/1000000000000000000000000000000000000000000000000000000)
>   --> 18014398509481979/18014398509481984
>
> 0.99999999999999972244424384371086489409208297729492187 is this:
>
> (setq IN87
>
> 99999999999999972244424384371086489409208297729492187/100000000000000000000000000000000000000000000000000000)
>
> The distance between IN87 and D, E is
>
> (abs (- IN87 D))
>   -->
> 2775557561562891351059079170227050781/50000000000000000000000000000000000000000000000000000
> (abs (- IN87 E))
>   -->
> 5551115123125782702118158340454101563/100000000000000000000000000000000000000000000000000000
>
> The distance to D is smaller:
>
> (< (abs (- IN87 D)) (abs (- IN87 E)))
>   --> T
>
> For IN875, the picture is this:
>
> (abs (- IN875 D))
>   --> 1/18014398509481984
> (abs (- IN875 E))
>   --> 1/18014398509481984
>
> So IN875 is right in the middle between D and E, and the correct
> conversion result depends on the rounding mode.  This is rather
> unexpected.  (But not too surprising in retrospect, considering how I
> came up with the number.)  I guess it explains why you got different
> results when converting this number.
>
> So let's introduce another number,
> 0.999999999999999722444243843710864894092082977294921876:
> 999999999999999722444243843710864894092082977294921876/1000000000000000000000000000000000000000000000000000000
>
> (setq IN876
>
> 999999999999999722444243843710864894092082977294921876/1000000000000000000000000000000000000000000000000000000)
>   -->
> 249999999999999930611060960927716223523020744323730469/250000000000000000000000000000000000000000000000000000
>
> The distances are:
>
> (abs (- IN876 D))
>   -->
> 27755575615628913510590791702270507813/500000000000000000000000000000000000000000000000000000
> (abs (- IN876 E))
>   -->
> 6938893903907228377647697925567626953/125000000000000000000000000000000000000000000000000000
>
> This time, the distance to E is shorter:
>
> (< (abs (- IN876 D)) (abs (- IN876 E)))
>   --> NIL
>
> So, to recap, the correct conversion for
> 0.99999999999999972244424384371086489409208297729492187 is
> 0.9999999999999997, and for
> 0.999999999999999722444243843710864894092082977294921876, it's
> 0.9999999999999998.  For
> 0.999999999999999722444243843710864894092082977294921875, either
> result is correct, depending on the rounding mode.  These numbers
> agree in the first 53 decimal places, not just the first 18
> representable in a 64-bit integer.
>
> I suppose if you have shorter inputs, using 64-bit arithmetic can work
> (but I'd rather have a formal proof of that).  But you still need
> multi-precision arithmetic for longer inputs.  You cannot simply
> truncate the number by discarding all the remaining digits once you
> hit the 64-bit overflow.
>
>
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
