Hi Ulf,
On 2018-09-27 16:40, Ulf Adams wrote:
Hi Raffaello,
I am the author of a recent publication on double to string conversion
[1] - the Ryu algorithm. I've been aware of the problems with the JDK
for several years, and am very much looking forward to improvements in
correctness and performance in this area.
What a coincidence! I'm happy to hear that the quest for better
floating->string conversions has not stopped. Tomorrow I'll download
your paper and have a look at it during the weekend.
I have done some testing against my Java implementation of the Ryu
algorithm described in the linked paper. Interestingly, I've found a few
cases where they output different results. In particular:
1.0E-323 is printed as 9.9E-324
1.0E-322 is printed as 9.9E-323
If Ryu also produces 1-digit outputs, then your results above are
correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
for example.
Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
the .0 part might mislead a human reader into believing that 2 digits are
really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
the double.
2 digits are for backward compatibility with the existing spec which
requires at least one digit to the right of the decimal point.
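
To make the three cases above concrete, here is a minimal sketch. It is taken neither from Ryu nor from the proposed patch, and the class and method names are hypothetical. It takes the exact decimal expansion of each double via BigDecimal, rounds it to 1 and to 2 significant digits, and checks whether the result parses back to the same double. The doubles are the multiples 1, 2 and 20 of Double.MIN_VALUE, which is where 5E-324, 1E-323 and 1E-322 land when parsed.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// Illustrative sketch only; class and method names are hypothetical.
final class SubnormalDigitsSketch {

    // Exact decimal expansion of v, rounded half-even to the given
    // number of significant digits.
    static BigDecimal roundedTo(double v, int digits) {
        return new BigDecimal(v).round(new MathContext(digits, RoundingMode.HALF_EVEN));
    }

    // True if parsing the decimal yields exactly the double v.
    static boolean roundTrips(BigDecimal d, double v) {
        return Double.parseDouble(d.toString()) == v;
    }

    public static void main(String[] args) {
        // Bit patterns 1, 2 and 20 are 1x, 2x and 20x Double.MIN_VALUE.
        long[] bits = {1L, 2L, 20L};
        for (long b : bits) {
            double v = Double.longBitsToDouble(b);
            for (int digits = 1; digits <= 2; digits++) {
                BigDecimal d = roundedTo(v, digits);
                System.out.println(b + " * MIN_VALUE, " + digits + " digit(s): " + d
                        + ", round-trips: " + roundTrips(d, v));
            }
        }
    }
}
```

Both the 1-digit and the 2-digit roundings parse back to the original double in these cases, so the difference between 1.0E-323 and 9.9E-324 is only about which 2-digit decimal is preferred, not about round-trip safety.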
It's likely that there are more such cases - I only ran a sample of
double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
output and Ryu is incorrect here. That's what you get when you have a
special case for Java without a correctness proof. :-(
In terms of performance, this algorithm performs almost exactly the same
as my Java implementation of Ryu, although I'd like to point out that my
C implementation of Ryu is quite a bit faster (though note that it
generates different output, in particular, it only outputs a single
digit of precision in the above cases, rather than two), and I didn't
backport all the performance improvements from the Java version, yet. It
looks like this is not a coincidence - as far as I can see so far, it's
algorithmically very similar, although it manages to avoid the loop I'm
using in Ryu to find the shortest representation.
I have a few comments:
* <li> It rounds to {@code v} according to the usual round-to-closest
* rule of IEEE 754 floating-point arithmetic.
- Since you're spelling out the rounding rules just below, this is
duplicated, and by itself, it's unclear since it doesn't specify the
specific sub-type (round half even).
I tried to save as much of the original spec wording as possible.
Perhaps it isn't worthwhile.
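
For readers unsure what the "half even" sub-type means in practice, here is a minimal sketch using Double.parseDouble, which is documented to round by the usual round-to-nearest rule of IEEE 754. The class name is hypothetical and the two inputs are chosen only for illustration: each lies exactly halfway between two adjacent doubles, so only the tie-breaking rule decides the result.

```java
// Illustrative sketch only; the class name is hypothetical.
final class TiesToEvenSketch {

    // Nearest double to a decimal string, as computed by Double.parseDouble.
    static double nearest(String decimal) {
        return Double.parseDouble(decimal);
    }

    public static void main(String[] args) {
        // 2^53 + 1 and 2^53 + 3 each lie exactly halfway between two
        // adjacent doubles (the spacing is 2 in this range).
        // The first tie resolves downward, to the neighbor with an even significand.
        System.out.println(nearest("9007199254740993") == 9007199254740992.0);
        // The second tie resolves upward, again to the even significand.
        System.out.println(nearest("9007199254740995") == 9007199254740996.0);
    }
}
```

A plain "round half up" rule would send both inputs upward, which is why naming the sub-type removes the ambiguity.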
- Naming: I'd strongly suggest using variable names that relate to
what's stored, e.g., m for mantissa, e for exponent, etc.
I currently prefer to be consistent with a forthcoming paper of mine on
the subject. But thanks for the suggestion.
- What's not clear to me is how the algorithm determines how many digits
to print.
You'll have to wait for the paper.
- Also, it might be nicer to move the long multiplications to a helper
method - at least from a short look, it looks like the computations of
vn, vnl, and vnr are identical.
I tried several variants: the current one seems to be the fastest with
the current optimizations of C2. Some day I'll also try with Graal.
- I looked through the spec, and it looks like all cases are
well-defined. Yay!
I will need some more time to do a more thorough review of the code and
more testing for differences. Unfortunately, I'm also traveling the next
two weeks, so this might take a bit of time.
I thank you in advance for your willingness to review the code but my
understanding is that only the officially appointed reviewers can
approve OpenJDK contributions, which is of course a good policy.
Besides, as two engineers from Red Hat, both named Andrew, correctly observe,
understanding the rationale of the code without the planned accompanying
paper is hard.
I'm not a contributor to the JDK, and this isn't my full-time job. I was
lurking here because I was going to send a patch for the double to
string conversion code myself (based on Ryu).
All my efforts on this project are done in my unpaid spare time, too.
Thanks,
-- Ulf
> [1] https://dl.acm.org/citation.cfm?id=3192369
> [2] https://github.com/google/double-conversion
> [3] https://en.wikipedia.org/wiki/Rounding
>
Thank you
Raffaello