[issue9009] Improve quality of Python/dtoa.c

2014-02-16 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
resolution:  - wont fix
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2012-04-29 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Dropping this due to lack of time;  unless anyone else wants to pick it up, it 
should probably be closed as won't fix.

--
assignee: mark.dickinson - 
priority: normal - low

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2011-12-01 Thread Rick Regan

Rick Regan exploringbin...@gmail.com added the comment:

   if (!(dig = quorem(b,d))) {
   b = multadd(b, 10, 0);  /* very unlikely */
   dig = quorem(b,d);
   }
 
 This code is part of the algorithm for strtod.  Here b and d are
 Bigints, and b / d is a fraction that gives an approximation to 
 the value of the input to strtod;  the aim is to produce the 
 digits of b / d one-by-one to compare them with the strtod input,
 and (eventually) use the result of that comparison work out whether
 to round up or down.

 If the condition of the 'if' block above is ever satisfied, b is 
 multiplied by 10 (that's the multadd(b, 10, 0) call), so the 
 fraction b / d is multiplied by 10 (with no corresponding correction
 for the strtod input string), and the wrong comparison is made!

Mark,

I think I know the motivation for this code, although I still don't know how it 
could hit. The halfway value H is scaled by a power of ten to put it in the 
form d1.d2d3d4d5 The power of ten exponent is derived from the input 
decimal string S, instead of computing it from H using logarithms.

Now what if H's exponent does not match S's? I'm thinking of cases like S = 
10^n and H = 9.... * 10^(n-1). Scaling H by 10^-n would make it 
0.9... . That leading 0 needs to be removed, by multiplying by 10, do 
put it in the right form.

First of all, I don't know if a case like this is possible. Second of all, the 
check would fail either way (1 against 0 vs 1 against 9).
 
BTW, b / d represents only the significant digits of H, so it shouldn't matter 
that there's no corresponding adjustment to the input string.

To summarize: I'm not saying this code is necessary; I'm just saying it makes 
you wonder.

--
nosy: +DoctorBinary

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-08-08 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Version of the rewrite_strtod patch applied in r83813.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-07-06 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Second version of the strtod rewrite;  has some additional documentation and 
comment fixes.  No other significant changes from the first version.  This is 
still a work in progress.

--
Added file: http://bugs.python.org/file17883/rewrite_strtod_v2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-07-06 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

r82614: add functionality to change FPU rounding mode (via float.__setround__ 
and float.__getround__ functions), on platforms that support the standard C99 
fesetround and fegetround functions:

 float.__getround__()
'tonearest'
 1e300 * 1e300
inf
 float.__setround__(downward)
 1e300 * 1e300
1.7976931348623157e+308

This is just temporary, so that I can test that FPU rounding mode doesn't 
affect results of string-to-float and float-to-string conversions.  I'm not 
planning to merge any of r82614 back to py3k.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-24 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Another issue to consider, brought to my attention by Rick Regan:  

Python's dtoa.c code likely misbehaves (i.e., returns incorrectly rounded 
results) if the FPU rounding mode is anything other than the round-half-to-even 
default.  (This is also true of Gay's original dtoa.c, I suspect.)  For 
example, the quick-return path in strtod does a single floating-point operation 
on exact arguments, so will end up behaving according to the FPU rounding mode. 
 The long integer-arithmetic-based path will likely return round-half-to-even 
results, independently of the FPU rounding mode.

It's debatable what Python should do if the FPU rounding mode is something 
other than round-half-to-even.  It can either:

 - try to honour the FPU rounding mode, or
 - ignore the rounding mode completely, always doing round-half-to-even.

I'd prefer the latter behaviour, for various reasons:

 - it maintains consistency across platforms

 - it's consistent with many other Python operations, that already do 
round-half-to-even regardless of FPU rounding mode---examples include 
float.fromhex and float.hex, true division of integers, the round() function...

It seems possible that Python might one day want to be able to control the 
rounding direction of decimal - binary conversions, but when that day comes I 
don't think playing with the FPU rounding mode is going to be the best control 
mechanism.

I don't regard this as a terribly serious issue, though:  most normal users are 
unlikely to end up in a situation where the FPU rounding mode has changed 
unless they've been deliberately and knowingly messing with FPU settings.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-24 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

 - ignore the rounding mode completely, always doing round-half-to-even.
+1

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-24 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:


 Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

 - ignore the rounding mode completely, always doing round-half-to-even.
 +1

Agreed. +1

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-24 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

+1, if the FPU mask is always restored (as I understand, this is the case
now).

--
nosy: +skrah

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-19 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

r82080: Whitespace fixes.
r82081: In strtod, simplify the computation of the initial approximation.
r82082: Fix typo introduced in r82025 (I think);  this was preventing
   a shortcut from being taken.

r82087: Simplify the ratio function.  The previous ratio function (actually, 
b2d), aborted if the numerator was zero, and the current code ends up requiring 
special cases for zero as a result of this.  That restriction is now removed, 
which will allow further simplifications (to come) in strtod.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-19 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Patch that does a fairly significant rewrite of strtod;  it's still (mostly) 
based on Gay's code, but there are some significant changes.

Some background:  strtod, after dealing with easy cases, works roughly as 
follows:

(1) Using floating-point arithmetic, create a double *rv* holding an 
approximation to the input value; this approximation may be out from the true 
value by several ulps (perhaps as much as 10 ulps;  certainly not more than 100 
ulps).

(2) If the input string is very long, truncate it (accepting that this 
introduces a small error), and work with the truncated value below.

(3) Use integer arithmetic to compute (an approximation to) ulps difference 
between *rv* and true value.  Possibly return immediately if
the ulps difference can be proved to be = 0.5 ulps, and we're not in any of 
various exceptional cases.

(4) Use the ulps difference to correct *rv* to a new value.

(5) If the ulps difference has fractional part close to 0.5, or if the 
correction takes us past a power of 2, or if it takes use near/to the max 
representable double, or to 0.0, go around the correction loop again.

(6) If we still can't decide (because the ulps difference is very close to 
0.5), call bigcomp to settle the issue once and for all.


The new patch simplifies the above procedure considerably:

- scaling of rv is used for very large values as well as very small ones; this 
simplifies handling of overflow, meaning that there's only a single place where 
overflow has to be detected.

- the adjustment step handles adjustments that cross power-of-2 boundaries 
correctly.

- as a result of the above two simplifications, there's never any need to do a 
second correction step, so the main correction loop is no longer a loop; a 
single correction is performed.

- we always use the bigcomp function in hard cases, so there's no longer any 
need for the computation of ulps_difference to detect the case where the error 
is exactly 0.5 ulps.


The patch isn't quite ready to apply;  I want to expand some of the comments a 
little.

--
Added file: http://bugs.python.org/file17720/rewrite_strtod.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-18 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

r82079:  Apply a version of the parsing patch to pull the parsing code
   out into a separate function.

Alexander, I agree about the names;  I'll do a mass renaming later on.  I'm 
trying not to mix the significant algorithm-changing commits with trivial 
renaming/reindenting/commenting commits, to make it easier to review each 
independent change.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-18 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

It would be easier for me to review if you did it in the other order: fix the 
variable names first.

Although I'm still pretty busy and won't have much time to review, so my needs 
shouldn't be your first priority.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

New submission from Mark Dickinson dicki...@gmail.com:

I've opened this issue to track efforts to improve the quality of the 
Python/dtoa.c file, which provides Python's string-to-float and float-to-string 
conversions.

Particular issues in mind (non-exhaustive):

 - do a thorough review and test of _Py_dg_dtoa;  this has already
   been done for _Py_dg_strtod (issue 7632, issue 7743), and uncovered
   several problems, including a memory leak, some asserts that were
   triggered in debug mode, and many cases of wrong output.   

 - break out the parsing code from _Py_dg_strtod into a separate
   function, for clarity and maintainability (and possible re-use
   of the parsing code itself)

 - improve _Py_dg_strtod tests, by using contined-fractions to generate
   and test particularly difficult cases.

 - _Py_dg_strtod silently gives wrong results for huge inputs;  while
   not particular serious, this is easily avoidable.

 - improve API to rely on errno less.

 - some pieces of code are currently unused;  improve code coverage
   and tests to identify thoses pieces and remove them.

 - the current code is convoluted in places and hard to explain;
   at least some extra comments should be added.

 - try to make Python/dtoa.c as Python-agnostic as possible, so that
   the code can be reused in other projects where desired.

--
assignee: mark.dickinson
components: Interpreter Core
messages: 107923
nosy: eric.smith, mark.dickinson
priority: normal
severity: normal
status: open
title: Improve quality of Python/dtoa.c
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


--
nosy: +belopolsky

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Created new 'py3k-dtoa' branch for this work in r82024.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

A couple of preparatory commits:

r82025: In _Py_dg_strtod, 'e' now represents the adjusted exponent rather than 
the base exponent;  that is, the input value is of the form +- m * 10**e with 
0.1 = m  1.0.  It's easier to produce such an 'e' from the parsing stage if 
we care about detecting overflow and underflow.

r82031: Update the s2b function:  remove a premature optimization in order to 
make s2b more general and its correctness more easily verifiable; alter the way 
that the input string is parsed so that it doesn't depend on nd0 being in the 
range [0, nd].

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

And here's a patch to pull out the parsing stage of _Py_dg_strtod into a 
separate function.

--
keywords: +patch
Added file: http://bugs.python.org/file17688/dtoa_parsing.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

r82032: Commit some additional tests for test_strtod.py.

test_extra_long_significand will currently fail;  with the dtoa_parsing patch, 
it passes.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
stage:  - patch review
type:  - feature request

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

I realize the ship's already sailed on this issue [1], but isn't it a problem 
that editing this code makes it more difficult to apply patches from Gay's 
original code? What do we do if Gay releases a new version of his code with bug 
fixes?


[1] Sorry non-native American-English speakers. I mean that we already have 
this problem since we've already modified the code.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Gay's changes tend to be very small;  any bugfixes he releases can likely be 
applied by hand, if they're relevant.

I did originally want to keep close to Gay's code, but frankly I'm not very 
happy with the quality of that code;  and in communications with Gay it's 
become clear that there are issues that will not be fixed upstream, but that I 
consider unacceptable for Python's copy of dtoa.c.

Some examples of problems with the original code:

(1) Gay's code does no checking of return values from malloc.  We had to add 
those checks, which was the first point at which our code started diverging 
from his.

(2) There's a segment at the beginning of the bigcomp function that's 
unnecessary, and in fact would produce incorrect results if it were ever 
called;  it's just about possible to show that it *can't* ever be called.  I've 
asked David Gay about this, but he insists that it's necessary.  (I've removed 
it in Python's version of the code.)

(3) The original code silently produces wrong results for huge inputs (more 
than 2 characters);  I know this is an extreme use case, but again I find 
this unacceptable for Python.  (I haven't asked Gay about this;  I'd be very 
surprised if he wanted to do anything about it, though.)

(4) The original code is horribly convoluted in places, making it very 
difficult to check for correctness.  (For example, see the spaghetti mess in 
the parsing section of strtod.c).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Here's the section of the 'bigcomp' code that I was referring to above:

/* Now b/d = exactly half-way between the two floating-point values */
/* on either side of the input string.  Compute first digit of b/d. */

if (!(dig = quorem(b,d))) {
b = multadd(b, 10, 0);  /* very unlikely */
dig = quorem(b,d);
}

You can see it in the original source at http://www.netlib.org/fp/dtoa.c

This code is part of the algorithm for strtod.  Here b and d are Bigints, and b 
/ d is a fraction that gives an approximation to the value of the input to 
strtod;  the aim is to produce the digits of b / d one-by-one to compare them 
with the strtod input, and (eventually) use the result of that comparison work 
out whether to round up or down.

If the condition of the 'if' block above is ever satisfied, b is multiplied by 
10 (that's the multadd(b, 10, 0) call), so the fraction b / d is multiplied by 
10 (with no corresponding correction for the strtod input string), and the 
wrong comparison is made!

There are many other similar pieces of code in _Py_dg_strtod that can never get 
called (I've run coverage programs over the code to help verify this);  trying 
to establish the correctness of the current code isn't easy.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

Just to be clear: I'm okay with this divergence, as long as we've made it
clear we're explicitly doing so and we've given our reasons. Mark's done
that, and of course he's the expert in the subject.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Here's an updated version of the parsing patch, with rewritten comments, but no 
significant code changes.

--
Added file: http://bugs.python.org/file17689/dtoa_parsing2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9009] Improve quality of Python/dtoa.c

2010-06-16 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Mark,

It is great to see you doing this.  I looked at this code on several occasions 
before and each time ran away scared!  I sincerely hope I will understand how 
it works after your rewrite.

Just a small suggestion at this point: can you give longer names to args and 
local variables?  The current alphabet soup is a bit confusing:

c, s, s0, s1, se, s00(!), e, nd, nd0, pnd, pnd0 ...

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9009
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com