On Sun, Aug 25, 2013 at 11:34 AM, Florian Weimer <[email protected]> wrote:
> I suppose the routines you look at can only correctly deal with a
> restricted set of inputs (probably those generated by the
> corresponding binary-to-decimal conversion).  But they fail on
> arbitrary precision decimal input.

Incorrect. Florian: stop looking at examples and start looking at the
numerical analysis of the actual conversion. This is a precise
conversion. It's expected that they fail on arbitrary-precision decimal
input: the decimal input is arbitrary precision; the floating-point
input is not.

> Here's why I think my counterexample is correct.
>
> The two floating point values have the IEEE representation
> 3FEFFFFFFFFFFFFD and 3FEFFFFFFFFFFFFE.

OK. Let's work the example. Please start by telling me which floating
point values you are using, and which representation you are using when
you give me these hex values. Different hardware places the mantissa
and sign in different positions, so what I'm asking is which bits of
your hex representation are the exponent, the sign, and the mantissa.

Also, could you please stop complicating the exponents? Mixing bases
and putting arithmetic in the exponents is just obfuscating things. You
are skipping lots and lots of steps, and we need to do this one step at
a time. Computations using rational numbers are irrelevant here - you
have extra precision in the individual numbers, and this alters the
results of the multiplications.

I don't have time to re-derive the formal proof of correctness for
precision. It's been done several times, and it's published in the
literature. The bottom line is that if you require K bits of precision
and you have K+3 bits of precision (which my method *does* have), you
have more than enough bits to round correctly, and that's all you need.

Jonathan

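P.S. For concreteness, here is a rough sketch, in portable Common Lisp
so it should run in the same CLISP as the session quoted below, of the
decomposition I'm asking about. It assumes the conventional IEEE-754
binary64 layout (sign in bit 63, an 11-bit biased exponent in bits
62..52, and the 52-bit stored mantissa in bits 51..0), and it handles
normal numbers only; the helper name is just for illustration.

;; Assumption: standard binary64 layout, normal numbers only
;; (implicit leading 1 bit, exponent bias 1023).
(defun decode-binary64 (bits)
  (let* ((sign     (ldb (byte  1 63) bits))   ; bit 63
         (biased   (ldb (byte 11 52) bits))   ; bits 62..52
         (mantissa (ldb (byte 52  0) bits))   ; bits 51..0
         ;; exact rational value: (-1)^sign * 2^(biased-1023) * (1 + mantissa/2^52)
         (value    (* (if (zerop sign) 1 -1)
                      (expt 2 (- biased 1023))
                      (+ 1 (* mantissa (expt 2 -52))))))
    (values sign biased mantissa value)))

;; (decode-binary64 #x3FEFFFFFFFFFFFFD)
;; --> 0, 1022, 4503599627370493, 9007199254740989/9007199254740992
;; (decode-binary64 #x3FEFFFFFFFFFFFFE)
;; --> 0, 1022, 4503599627370494, 4503599627370495/4503599627370496

If those last values match the two rationals in the quoted session,
then we agree on the encoding, and the only remaining question is the
rounding step.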

>
> 3FEFFFFFFFFFFFFD is the rational number 2**(0x3fe - 1023) *
> 0x1ffffffffffffd * 2**(-52).  I'm using CLISP's rational
> arithmetic (which is exact):
>
> (setq D
>   (* (expt 2 (- #x3fe 1023)) #x1ffffffffffffd (expt 2 -52)))
>
> CLISP evaluates this to 9007199254740989/9007199254740992.
>
> Similarly, 3FEFFFFFFFFFFFFE is this rational number:
>
> (setq E
>   (* (expt 2 (- #x3fe 1023)) #x1ffffffffffffe (expt 2 -52)))
> --> 4503599627370495/4503599627370496
>
> 0.999999999999999722444243843710864894092082977294921875
> is the rational number
> 999999999999999722444243843710864894092082977294921875/1000000000000000000000000000000000000000000000000000000:
>
> (setq IN875
>   999999999999999722444243843710864894092082977294921875/1000000000000000000000000000000000000000000000000000000)
> --> 18014398509481979/18014398509481984
>
> 0.99999999999999972244424384371086489409208297729492187 is this:
>
> (setq IN87
>   99999999999999972244424384371086489409208297729492187/100000000000000000000000000000000000000000000000000000)
>
> The distance between IN87 and D, E is
>
> (abs (- IN87 D))
> --> 2775557561562891351059079170227050781/50000000000000000000000000000000000000000000000000000
> (abs (- IN87 E))
> --> 5551115123125782702118158340454101563/100000000000000000000000000000000000000000000000000000
>
> The distance to D is smaller:
>
> (< (abs (- IN87 D)) (abs (- IN87 E)))
> --> T
>
> For IN875, the picture is this:
>
> (abs (- IN875 D))
> --> 1/18014398509481984
> (abs (- IN875 E))
> --> 1/18014398509481984
>
> So IN875 is right in the middle between D and E, and the correct
> conversion result depends on the rounding mode.  This is rather
> unexpected.  (But not too surprising in retrospect, considering how I
> came up with the number.)  I guess it explains why you got different
> results when converting this number.
>
> So let's introduce another number,
> 0.999999999999999722444243843710864894092082977294921876:
>
> (setq IN876
>   999999999999999722444243843710864894092082977294921876/1000000000000000000000000000000000000000000000000000000)
> --> 249999999999999930611060960927716223523020744323730469/250000000000000000000000000000000000000000000000000000
>
> The distances are:
>
> (abs (- IN876 D))
> --> 27755575615628913510590791702270507813/500000000000000000000000000000000000000000000000000000
> (abs (- IN876 E))
> --> 6938893903907228377647697925567626953/125000000000000000000000000000000000000000000000000000
>
> This time, the distance to E is shorter:
>
> (< (abs (- IN876 D)) (abs (- IN876 E)))
> --> nil
>
> So, to recap, the correct conversion for
> 0.99999999999999972244424384371086489409208297729492187 is
> 0.9999999999999997, and for
> 0.999999999999999722444243843710864894092082977294921876, it's
> 0.9999999999999998.  For
> 0.999999999999999722444243843710864894092082977294921875, either
> result is correct, depending on the rounding mode.  These numbers
> agree in the first 53 decimal places, not just the first 18
> representable in a 64-bit integer.
>
> I suppose if you have shorter inputs, using 64-bit arithmetic can work
> (but I'd rather have a formal proof of that).  But you still need
> multi-precision arithmetic for longer inputs.  You cannot simply
> truncate the number by discarding all the remaining digits once you
> hit the 64-bit overflow.

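For anyone replaying the arithmetic above, here is a short sketch,
again in portable Common Lisp and reusing the decode-binary64 helper
sketched earlier in this message, that re-checks the tie case directly
from the bit patterns. The tie-breaking comment refers to
round-to-nearest-even, the IEEE-754 default rounding mode; that part is
background, not something asserted in the quoted analysis.

;; Assumes decode-binary64 from the sketch earlier in this message.
(let* ((d (nth-value 3 (decode-binary64 #x3FEFFFFFFFFFFFFD)))
       (e (nth-value 3 (decode-binary64 #x3FEFFFFFFFFFFFFE)))
       ;; the 54-digit decimal input, as an exact rational
       (in875 (/ 999999999999999722444243843710864894092082977294921875
                 (expt 10 54))))
  (list (- in875 d)                ; --> 1/18014398509481984
        (- e in875)                ; --> 1/18014398509481984, an exact tie
        ;; Under round-to-nearest-even, a tie goes to the candidate whose
        ;; stored significand is even; here that is ...FFE, i.e. E.
        (evenp #xFFFFFFFFFFFFE)))  ; --> T

Which rounding mode is in effect therefore decides the result for the
...875 input, consistent with the "either result is correct, depending
on the rounding mode" conclusion above.
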
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
