On 2018-10-04 15:28, Ulf Adams wrote: > On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti > <raffaello.giulie...@gmail.com <mailto:raffaello.giulie...@gmail.com>> > wrote: > > Hi Ulf, > > > On 2018-09-27 16:40, Ulf Adams wrote: > > Hi Raffaello, > > > > I am the author of a recent publication on double to string > conversion > > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk > > for several years, and am very much looking forward to > improvements in > > correctness and performance in this area. > > > > What a coincidence! I'm happy to hear that the quest for better > floating->string conversions has not stopped. Tomorrow I'll download > your paper and have a look at it during the weekend. > > > Have you had a chance to take a look? > > (I'm traveling for the next ~10 days and at a conference, so don't > expect too much from me during that time.) > >
I had a cursory reading but couldn't dig deeper for now. If nothing unexpected happens, I should be able to study your paper during the weekend. > > > > > I have done some testing against my Java implementation of the Ryu > > algorithm described in the linked paper. Interestingly, I've found > a few > > cases where they output different results. In particular: > > 1.0E-323 is printed as 9.9E-324 > > 1.0E-322 is printed as 9.9E-323 > > If Ryu also produces 1 digit long outputs, then your results above are > correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324, > for example. > Even better, it should output 5E-324, 1E-323 and 1E-322 because adding > the .0 part might confuse a human reader to believe that 2 digits are > really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to > the double. > > > The C version produces 1 digit long outputs, and I was trying to follow > the Java spec in the Java version, but the code to do so isn't quite > right. Unfortunately, I haven't yet been able to fix it. > > > > 2 digits are for backward compatibility with the existing spec which > requires at least one digit to the right of the decimal point. > > > > > > It's likely that there are more such cases - I only ran a sample of > > double-precision numbers. Arguably, 9.9 is the correctly rounded > 2-digit > > output and Ryu is incorrect here. That's what you get when you have a > > special case for Java without a correctness proof. :-( > > > > In terms of performance, this algorithm performs almost exactly > the same > > as my Java implementation of Ryu, although I'd like to point out > that my > > C implementation of Ryu is quite a bit faster (though note that it > > generates different output, in particular, it only outputs a single > > digit of precision in the above cases, rather than two), and I didn't > > backport all the performance improvements from the Java version, > yet. It > > looks like this is not coincidence - as far as I can see so far, it's > > algorithmically very similar, although it manages to avoid the > loop I'm > > using in Ryu to find the shortest representation. > > > > I have a few comments: > > > > * <li> It rounds to {@code v} according to the usual > > round-to-closest > > * rule of IEEE 754 floating-point arithmetic. > > - Since you're spelling out the rounding rules just below, this is > > duplicated, and by itself, it's unclear since it doesn't specify the > > specific sub-type (round half even). > > > > I tried to save as much of the original spec wording as possible. > Perhaps it isn't worthwhile. > > > > > - Naming: I'd strongly suggest to use variable names that relate to > > what's stored, e.g., m for mantissa, e for exponent, etc. > > > > I currently prefer to be consistent with a forthcoming paper of mine on > the subject. But thanks for the suggestion. > > > May I suggest that the paper also uses names that relate to what they're > referring to? :-) Not that I've managed to do that very well myself... > I tend to use short "mathematical" names that still evoke their semantics. Will see if I manage to be consistent. > > > > > - What's not clear to me is how the algorithm determines how many > digits > > to print. > > > > You'll have to wait for the paper. > > > Looking forward to it. I tried to reverse engineer the code, but it's > far from obvious. > I don't think it is reversible, even for knowledgeable people like you :-( I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... > > > > > - Also, it might be nicer to move the long multiplications to a > helper > > method - at least from a short look, it looks like the > computations of > > vn, vnl, and vnr are identical. > > > > I tried several variants: the current one seems to be the faster with > the current optimizations of C2. Some day I'll also try with Graal. > > > Sure, but moving it to a method shouldn't affect performance (except if > you need to return multiple values), and, right now, it looks like > identical code. > > Where possible, I try not to rely on C2 to perform inlining of "long" code like the 8 lines for the multiplication. In the end, it is repeated only 3 times there and in a limited space. It shouldn't come to a surprise to a reader. I'll certainly add a comment to warn the reader that the code is the same and I'll retry my experiments with extracting a method for the multiplication although it currently seems more complex than at first sight. Unfortunately, Java does not yet support 128 bits quantities. Greetings Raffaello > > > > > - I looked through the spec, and it looks like all cases are > > well-defined. Yay! > > > > I will need some more time to do a more thorough review of the > code and > > more testing for differences. Unfortunately, I'm also traveling > the next > > two weeks, so this might take a bit of time. > > > > I thank you in advance for your willingness to review the code but my > understanding is that only the officially appointed reviewers can > approve OpenJDK contributions, which is of course a good policy. > Besides, as two Andrews engineers from RedHat correctly observe, > understanding the rationale of the code without the planned > accompanying > paper is hard. > > > > > I'm not a contributor to the Jdk, and this isn't my full-time job. > I was > > lurking here because I was going to send a patch for the double to > > string conversion code myself (based on Ryu). > > > > All my efforts on this projects are done in my unpaid spare time, too. > > > > Thanks, > > > > -- Ulf > > > > [1] https://dl.acm.org/citation.cfm?id=3192369 > > [2] https://github.com/google/double-conversion > > [3] https://en.wikipedia.org/wiki/Rounding > > > > > Thank you > Raffaello > > > > > >