subject:"\[PATCH\] 4511638\: Double.toString\(double\) sometimes produces incorrect results"

RE: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-06-17 Thread Anthony Vanelverdinghe

My bad. I was looking at the patch at [1], and didn’t consider that the browser 
rendering likely doesn’t match the actual file content. Sorry for the noise. 
And good to hear the review process is underway.

Kind regards,
Anthony

[1] https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059783.html


From: Raffaello Giulietti 
Sent: Monday, June 17, 2019 4:56:22 PM
To: Anthony Vanelverdinghe; core-libs-dev
Subject: Re: [PATCH] 4511638: Double.toString(double) sometimes produces 
incorrect results


Hi Anthony,

On 16/06/2019 19.17, Anthony Vanelverdinghe wrote:

Hi Raffaello



While I don't have feedback on the actual math, here's a few suggestions:

- there's some use of non-ASCII characters in the patch. I don't think this is 
common in the JDK's Java sources, so you might want to replace them with their 
Unicode escapes. The characters are: ≤ (\u2264), ∞ (\u221e), × (\u00d7), ≥ 
(\u2265), … (\u2026), ≠ (\u2260), ⌊ (\u230a), ⌋ (\u230b), · (\u00b7), β (\u03b2)

I'm not sure what you mean here: there are indeed usages of HTML entities like 
∞ that appear as math symbols in the rendered Javadoc but the sources are 
100% US-ASCII in my IDE and should be so even in the patch.

If not, I'd be interested in the "coordinates" of the characters outside of the 
US-ASCII set.


- there are 2 invocations of a deprecated String constructor for performance 
reasons. I don't know how big the performance difference is, but I would 
suggest replacing these invocations with `new String(bytes, 
StandardCharsets.US_ASCII)` instead, and filing a bug for the performance 
difference with the deprecated constructor

The perf diff is measurable. The chosen variant is the fastest that I could 
come up with. As long as the deprecated constructor does not become "forRemoval 
= true", if at all, I wouldn't worry. Even then, there's plenty of time to 
switch.

Until then I can try to understand why the used constructor is faster and 
perhaps file a bug as you suggest.


- there are 2 occurrences of a typo "left-to-tight"

Yep, thanks for your eagle's eyes!




Other than that, I can only say this is an impressive piece of work, so I hope 
some official Reviewers will help you add your contribution to the JDK.

The review process has indeed begun some days ago. However, since there's is 
quite some maths to digest to understand the code, a full review could take 1 
to 2 days of concentrated reading: not the average review, I guess.

Plus, there's a change in the specification, which must undergo a further 
review.


Thanks for your time!

Greetings
Raffaello





Kind regards

Anthony




From: core-libs-dev 
<mailto:core-libs-dev-boun...@openjdk.java.net>
 on behalf of Raffaello Giulietti 
<mailto:raffaello.giulie...@gmail.com>
Sent: Monday, May 6, 2019 2:08:48 PM
To: core-libs-dev
Subject: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect 
results

Hi,

no new code this time, only a warm invitation to review the rather
substantial patch submitted on 2019-04-18 [1] and the CSR [2].

I spent an insane amount of (unpaid) free time offering my contribution
to solve a bug known since about 15 years. Thus, I think it is
understandable that I'd like to see my efforts come to a successful
conclusion, ideally for OpenJDK 13.


Greetings
Raffaello


P.S.
Some enjoyable properties of the novel algorithm:
* No objects are instantiated, except, of course, the resulting String.
* Loop-free core algorithm.
* Only int and long arithmetic.
* No divisions at all.
* 16x speedup (jmh).
* Randomized, yet reproducible deep diving tests (jtreg).
* Clear, unambiguous spec.
* All floats have been tested to fully meet the spec.
* Fully documented in [3] or in comments.



[1]
https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059783.html
[2] https://bugs.openjdk.java.net/browse/JDK-8202555
[3] https://drive.google.com/open?id=1KLtG_LaIbK9ETXI290zqCxvBW94dj058

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-06-17 Thread Raffaello Giulietti


Hi Anthony,

On 16/06/2019 19.17, Anthony Vanelverdinghe wrote:


Hi Raffaello

While I don't have feedback on the actual math, here's a few suggestions:

- there's some use of non-ASCII characters in the patch. I don't think 
this is common in the JDK's Java sources, so you might want to replace 
them with their Unicode escapes. The characters are: ≤ (\u2264), ∞ 
(\u221e), × (\u00d7), ≥ (\u2265), … (\u2026), ≠ (\u2260), ⌊(\u230a), 
⌋(\u230b), · (\u00b7), β (\u03b2)


I'm not sure what you mean here: there are indeed usages of HTML 
entities like ∞ that appear as math symbols in the rendered 
Javadoc but the sources are 100% US-ASCII in my IDE and should be so 
even in the patch.


If not, I'd be interested in the "coordinates" of the characters outside 
of the US-ASCII set.



- there are 2 invocations of a deprecated String constructor for 
performance reasons. I don't know how big the performance difference 
is, but I would suggest replacing these invocations with `new 
String(bytes, StandardCharsets.US_ASCII)` instead, and filing a bug 
for the performance difference with the deprecated constructor


The perf diff is measurable. The chosen variant is the fastest that I 
could come up with. As long as the deprecated constructor does not 
become "forRemoval = true", if at all, I wouldn't worry. Even then, 
there's plenty of time to switch.


Until then I can try to understand why the used constructor is faster 
and perhaps file a bug as you suggest.




- there are 2 occurrences of a typo "left-to-tight"


Yep, thanks for your eagle's eyes!


Other than that, I can only say this is an impressive piece of work, 
so I hope some official Reviewers will help you add your contribution 
to the JDK.


The review process has indeed begun some days ago. However, since 
there's is quite some maths to digest to understand the code, a full 
review could take 1 to 2 days of concentrated reading: not the average 
review, I guess.


Plus, there's a change in the specification, which must undergo a 
further review.



Thanks for your time!

Greetings
Raffaello




Kind regards

Anthony


*From:* core-libs-dev  on 
behalf of Raffaello Giulietti 

*Sent:* Monday, May 6, 2019 2:08:48 PM
*To:* core-libs-dev
*Subject:* [PATCH] 4511638: Double.toString(double) sometimes produces 
incorrect results

Hi,

no new code this time, only a warm invitation to review the rather
substantial patch submitted on 2019-04-18 [1] and the CSR [2].

I spent an insane amount of (unpaid) free time offering my contribution
to solve a bug known since about 15 years. Thus, I think it is
understandable that I'd like to see my efforts come to a successful
conclusion, ideally for OpenJDK 13.


Greetings
Raffaello


P.S.
Some enjoyable properties of the novel algorithm:
* No objects are instantiated, except, of course, the resulting String.
* Loop-free core algorithm.
* Only int and long arithmetic.
* No divisions at all.
* 16x speedup (jmh).
* Randomized, yet reproducible deep diving tests (jtreg).
* Clear, unambiguous spec.
* All floats have been tested to fully meet the spec.
* Fully documented in [3] or in comments.



[1]
https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059783.html
[2] https://bugs.openjdk.java.net/browse/JDK-8202555
[3] https://drive.google.com/open?id=1KLtG_LaIbK9ETXI290zqCxvBW94dj058

RE: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-06-16 Thread Anthony Vanelverdinghe

Hi Raffaello



While I don't have feedback on the actual math, here's a few suggestions:

- there's some use of non-ASCII characters in the patch. I don't think this is 
common in the JDK's Java sources, so you might want to replace them with their 
Unicode escapes. The characters are: ≤ (\u2264), ∞ (\u221e), × (\u00d7), ≥ 
(\u2265), … (\u2026), ≠ (\u2260), ⌊ (\u230a), ⌋ (\u230b), · (\u00b7), β (\u03b2)

- there are 2 invocations of a deprecated String constructor for performance 
reasons. I don't know how big the performance difference is, but I would 
suggest replacing these invocations with `new String(bytes, 
StandardCharsets.US_ASCII)` instead, and filing a bug for the performance 
difference with the deprecated constructor

- there are 2 occurrences of a typo "left-to-tight"



Other than that, I can only say this is an impressive piece of work, so I hope 
some official Reviewers will help you add your contribution to the JDK.



Kind regards

Anthony




From: core-libs-dev  on behalf of 
Raffaello Giulietti 
Sent: Monday, May 6, 2019 2:08:48 PM
To: core-libs-dev
Subject: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect 
results

Hi,

no new code this time, only a warm invitation to review the rather
substantial patch submitted on 2019-04-18 [1] and the CSR [2].

I spent an insane amount of (unpaid) free time offering my contribution
to solve a bug known since about 15 years. Thus, I think it is
understandable that I'd like to see my efforts come to a successful
conclusion, ideally for OpenJDK 13.


Greetings
Raffaello


P.S.
Some enjoyable properties of the novel algorithm:
* No objects are instantiated, except, of course, the resulting String.
* Loop-free core algorithm.
* Only int and long arithmetic.
* No divisions at all.
* 16x speedup (jmh).
* Randomized, yet reproducible deep diving tests (jtreg).
* Clear, unambiguous spec.
* All floats have been tested to fully meet the spec.
* Fully documented in [3] or in comments.



[1]
https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059783.html
[2] https://bugs.openjdk.java.net/browse/JDK-8202555
[3] https://drive.google.com/open?id=1KLtG_LaIbK9ETXI290zqCxvBW94dj058

[PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-05-06 Thread Raffaello Giulietti


Hi,

no new code this time, only a warm invitation to review the rather 
substantial patch submitted on 2019-04-18 [1] and the CSR [2].


I spent an insane amount of (unpaid) free time offering my contribution 
to solve a bug known since about 15 years. Thus, I think it is 
understandable that I'd like to see my efforts come to a successful 
conclusion, ideally for OpenJDK 13.



Greetings
Raffaello


P.S.
Some enjoyable properties of the novel algorithm:
* No objects are instantiated, except, of course, the resulting String.
* Loop-free core algorithm.
* Only int and long arithmetic.
* No divisions at all.
* 16x speedup (jmh).
* Randomized, yet reproducible deep diving tests (jtreg).
* Clear, unambiguous spec.
* All floats have been tested to fully meet the spec.
* Fully documented in [3] or in comments.



[1] 
https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-April/059783.html

[2] https://bugs.openjdk.java.net/browse/JDK-8202555
[3] https://drive.google.com/open?id=1KLtG_LaIbK9ETXI290zqCxvBW94dj058

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-04-19 Thread Raffaello Giulietti


Hi,

the semantics of java.io.IOError is:
"Thrown when a serious I/O error has occurred"
which I guess is not appropriate here.


I think the best compromise is

880 try {
881 FloatToDecimal.appendTo(f, this);
882 } catch (IOException cause) {
883 throw new AssertionError("Code shall be unreachable", cause);
884 }

which is more explicit and doesn't rely on assertions being enabled.


Greetings
Raffaello



On 2019-04-19 19:21, Jason Mehrens wrote:

Maybe rename the catch variable from 'cause' or 'ignored' to 'unreachable' and 
then wrap java.io.IOException in java.io.IOError?


From: core-libs-dev  on behalf of Raffaello 
Giulietti 
Sent: Thursday, April 18, 2019 3:37 PM
To: Brian Burkhalter; core-libs-dev
Subject: Re: [PATCH] 4511638: Double.toString(double) sometimes produces 
incorrect results

Hi,

On 18.04.19 21:29, Brian Burkhalter wrote:



On Apr 18, 2019, at 11:44 AM, Raffaello Giulietti
mailto:raffaello.giulie...@gmail.com>> wrote:

here's another revision of the patch. Its purpose is to overcome the
test failure observed in [1]. To this end, the patch adds

FloatToDecimal.appendTo(float, Appendable) and
DoubleToDecimal.appendTo(double, Appendable)

static methods to help AbstractStringBuilder in using the corrected
algorithm implemented in

FloatToDecimal.toString(float) and
DoubleToDecimal.toString(double), respectively.

The implementation has been jmh tested to make sure there are no
performance regressions.


Thanks, Raffaello.


As there are now only less than two months left before Rampdown 1 for
OpenJDK 13, I beg anybody interested in reviewing this patch to
contact me for any question or clarification. Also, you might want to
take a look at the CSR [2].


+1 on both counts.


As usual, Brian will make the patch available as webrev in the coming
hours.


Please see

http://cr.openjdk.java.net/~bpb/4511638/webrev.03/

I wonder whether in the new AbstractStringBuilder.append() changes the
constructs:
880 try {
881 FloatToDecimal.appendTo(f, this);
882 } catch (IOException ignored) {
883 assert false;
884 }
might be better as:
880 try {
881 FloatToDecimal.appendTo(f, this);
882 } catch (IOException cause) {
883 throw new RuntimeException(cause);
884 }
Comments appreciated.

Brian



The reason is that FloatToDecimal.appendTo() takes an Appendable as
parameter. Appendable's append() methods throw a checked IOException.
This cannot happen with AbstractStringBuilder (the this passed as
argument), so the code catches the IOException just to please the
compiler. The assert is there just in case, but control shall never pass
there. Whatever conventions are in place to emphasize that control flow
cannot reach some point, I'll adopt them. Let me know.

The same holds for DoubleToDecimal.appendTo().

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-04-19 Thread Jason Mehrens

Maybe rename the catch variable from 'cause' or 'ignored' to 'unreachable' and 
then wrap java.io.IOException in java.io.IOError?


From: core-libs-dev  on behalf of 
Raffaello Giulietti 
Sent: Thursday, April 18, 2019 3:37 PM
To: Brian Burkhalter; core-libs-dev
Subject: Re: [PATCH] 4511638: Double.toString(double) sometimes produces 
incorrect results

Hi,

On 18.04.19 21:29, Brian Burkhalter wrote:
>
>> On Apr 18, 2019, at 11:44 AM, Raffaello Giulietti
>> > <mailto:raffaello.giulie...@gmail.com>> wrote:
>>
>> here's another revision of the patch. Its purpose is to overcome the
>> test failure observed in [1]. To this end, the patch adds
>>
>> FloatToDecimal.appendTo(float, Appendable) and
>> DoubleToDecimal.appendTo(double, Appendable)
>>
>> static methods to help AbstractStringBuilder in using the corrected
>> algorithm implemented in
>>
>> FloatToDecimal.toString(float) and
>> DoubleToDecimal.toString(double), respectively.
>>
>> The implementation has been jmh tested to make sure there are no
>> performance regressions.
>
> Thanks, Raffaello.
>
>> As there are now only less than two months left before Rampdown 1 for
>> OpenJDK 13, I beg anybody interested in reviewing this patch to
>> contact me for any question or clarification. Also, you might want to
>> take a look at the CSR [2].
>
> +1 on both counts.
>
>> As usual, Brian will make the patch available as webrev in the coming
>> hours.
>
> Please see
>
> http://cr.openjdk.java.net/~bpb/4511638/webrev.03/
>
> I wonder whether in the new AbstractStringBuilder.append() changes the
> constructs:
> 880 try {
> 881 FloatToDecimal.appendTo(f, this);
> 882 } catch (IOException ignored) {
> 883 assert false;
> 884 }
> might be better as:
> 880 try {
> 881 FloatToDecimal.appendTo(f, this);
> 882 } catch (IOException cause) {
> 883 throw new RuntimeException(cause);
> 884 }
> Comments appreciated.
>
> Brian


The reason is that FloatToDecimal.appendTo() takes an Appendable as
parameter. Appendable's append() methods throw a checked IOException.
This cannot happen with AbstractStringBuilder (the this passed as
argument), so the code catches the IOException just to please the
compiler. The assert is there just in case, but control shall never pass
there. Whatever conventions are in place to emphasize that control flow
cannot reach some point, I'll adopt them. Let me know.

The same holds for DoubleToDecimal.appendTo().

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-04-18 Thread Raffaello Giulietti


Hi,

On 18.04.19 21:29, Brian Burkhalter wrote:


On Apr 18, 2019, at 11:44 AM, Raffaello Giulietti 
> wrote:


here's another revision of the patch. Its purpose is to overcome the 
test failure observed in [1]. To this end, the patch adds


FloatToDecimal.appendTo(float, Appendable) and
DoubleToDecimal.appendTo(double, Appendable)

static methods to help AbstractStringBuilder in using the corrected 
algorithm implemented in


FloatToDecimal.toString(float) and
DoubleToDecimal.toString(double), respectively.

The implementation has been jmh tested to make sure there are no 
performance regressions.


Thanks, Raffaello.

As there are now only less than two months left before Rampdown 1 for 
OpenJDK 13, I beg anybody interested in reviewing this patch to 
contact me for any question or clarification. Also, you might want to 
take a look at the CSR [2].


+1 on both counts.

As usual, Brian will make the patch available as webrev in the coming 
hours.


Please see

http://cr.openjdk.java.net/~bpb/4511638/webrev.03/

I wonder whether in the new AbstractStringBuilder.append() changes the 
constructs:

880 try {
881 FloatToDecimal.appendTo(f, this);
882 } catch (IOException ignored) {
883 assert false;
884 }
might be better as:
880 try {
881 FloatToDecimal.appendTo(f, this);
882 } catch (IOException cause) {
883 throw new RuntimeException(cause);
884 }
Comments appreciated.

Brian



The reason is that FloatToDecimal.appendTo() takes an Appendable as 
parameter. Appendable's append() methods throw a checked IOException. 
This cannot happen with AbstractStringBuilder (the this passed as 
argument), so the code catches the IOException just to please the 
compiler. The assert is there just in case, but control shall never pass 
there. Whatever conventions are in place to emphasize that control flow 
cannot reach some point, I'll adopt them. Let me know.


The same holds for DoubleToDecimal.appendTo().

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-04-18 Thread Brian Burkhalter



> On Apr 18, 2019, at 11:44 AM, Raffaello Giulietti 
>  wrote:
> 
> here's another revision of the patch. Its purpose is to overcome the test 
> failure observed in [1]. To this end, the patch adds
> 
> FloatToDecimal.appendTo(float, Appendable) and
> DoubleToDecimal.appendTo(double, Appendable)
> 
> static methods to help AbstractStringBuilder in using the corrected algorithm 
> implemented in
> 
> FloatToDecimal.toString(float) and
> DoubleToDecimal.toString(double), respectively.
> 
> The implementation has been jmh tested to make sure there are no performance 
> regressions.

Thanks, Raffaello.

> As there are now only less than two months left before Rampdown 1 for OpenJDK 
> 13, I beg anybody interested in reviewing this patch to contact me for any 
> question or clarification. Also, you might want to take a look at the CSR [2].

+1 on both counts.

> As usual, Brian will make the patch available as webrev in the coming hours.

Please see

http://cr.openjdk.java.net/~bpb/4511638/webrev.03/

I wonder whether in the new AbstractStringBuilder.append() changes the 
constructs:
 880 try {
 881 FloatToDecimal.appendTo(f, this);
 882 } catch (IOException ignored) {
 883 assert false;
 884 }
might be better as:
 880 try {
 881 FloatToDecimal.appendTo(f, this);
 882 } catch (IOException cause) {
 883 throw new RuntimeException(cause);
 884 }
Comments appreciated.

Brian

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-04-03 Thread Brian Burkhalter

Hi Raffaello, et. al.

> On Apr 3, 2019, at 7:43 AM, Raffaello Giulietti 
>  wrote:
> 
> in this patch there are additional white-box tests against critical package 
> private entities in the jdk.internal.math classes. In particular the class 
> jdk.internal.math.MathUtils is thoroughly tested.
> 
> Also, the tag @key randomness is now used along jdk.test.lib.RandomFactory.

Good.

> To run the tests
> make test TEST="jtreg:test/jdk/jdk/internal/math/ToDecimal”

As a note to anyone else running these, I configured my OpenJDK build as

bash ./configure --with-jtreg=$JTREG_HOME

where $JTREG_HOME is the location of the unpacked jtreg archive. The tests 
passed, as expected:

==
Test summary
==
   TEST  TOTAL  PASS  FAIL ERROR   
   jtreg:test/jdk/jdk/internal/math/ToDecimal3 3 0 0   
==
TEST SUCCESS

> Brian Burkhalter, my sponsor, will soon publish the patch in webrev form.

Done [1].

> Of course, if anybody would like to step forward as a reviewer I'll be glad 
> to help.

Any reviews would of course be most welcome, both of the patch itself, and of 
the CSR [2], provided for the latter one has a JBS login.

Thanks,

Brian

[1] http://cr.openjdk.java.net/~bpb/4511638/webrev.02/
[2] https://bugs.openjdk.java.net/browse/JDK-8202555

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-14 Thread Brian Burkhalter

The CSR for this issue is available for review at 
https://bugs.openjdk.java.net/browse/JDK-8202555 
. If you have a JBS user name 
you can add yourself as reviewer by editing the issue directly, assuming you 
concur with the content.

Brian

> On Mar 8, 2019, at 8:56 AM, raffaello.giulie...@gmail.com wrote:
> 
> On 2019-03-08 14:35, Andrew Haley wrote:
>> Hi,
>> 
>> On 3/7/19 7:16 PM, Raffaello Giulietti wrote:
>> 
>>> a couple of weeks ago I tried to refactor the code assuming the 
>>> existence of unsignedMultiplyHigh() (either as some future intrinsic or 
>>> as a Java method) and a wider representations of g with either 127 or 
>>> 128 bits:
>>> g = g1 2^64 + g0
>>> 
>>> with either
>>> 2^63 <= g1 < 2^64 (128 bits)
>>> 
>>> or
>>> 2^62 <= g1 < 2^63 (127 bits)
>>> 
>>> Unfortunately, the resulting code of rop() isn't any simpler. That's 
>>> because then an intermediate sum can overflow the 64 bits of a long. As 
>>> a consequence, there's need for more elaborate logic to determine
>>> the carry and other slightly more complicated computations to assemble
>>> the final result. All in all, the resulting code has more operations and 
>>> looks longer.
>> 
>> Ah, I see. I agree, we still don't quite have the full set of operations
>> that we need in Java, in particular a nice way of doing an add with carry.
>> 
> 
> Yes.
> 
> 
>> Thank you for the explanation.
>> 
> 
> You're welcome.
> 
> 
>>> In the meantime I got rid of the last division. There's no division at 
>>> all in the whole algorithm.
>> 
>> Excellent. This is looking very good indeed.
>> 
>

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-08 Thread raffaello . giulietti

On 2019-03-08 14:35, Andrew Haley wrote:
> Hi,
> 
> On 3/7/19 7:16 PM, Raffaello Giulietti wrote:
> 
>> a couple of weeks ago I tried to refactor the code assuming the 
>> existence of unsignedMultiplyHigh() (either as some future intrinsic or 
>> as a Java method) and a wider representations of g with either 127 or 
>> 128 bits:
>>  g = g1 2^64 + g0
>>
>> with either
>>  2^63 <= g1 < 2^64 (128 bits)
>>
>> or
>>  2^62 <= g1 < 2^63 (127 bits)
>>
>> Unfortunately, the resulting code of rop() isn't any simpler. That's 
>> because then an intermediate sum can overflow the 64 bits of a long. As 
>> a consequence, there's need for more elaborate logic to determine
>> the carry and other slightly more complicated computations to assemble
>> the final result. All in all, the resulting code has more operations and 
>> looks longer.
> 
> Ah, I see. I agree, we still don't quite have the full set of operations
> that we need in Java, in particular a nice way of doing an add with carry.
> 

Yes.


> Thank you for the explanation.
> 

You're welcome.


>> In the meantime I got rid of the last division. There's no division at 
>> all in the whole algorithm.
> 
> Excellent. This is looking very good indeed.
>

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-08 Thread Andrew Haley

Hi,

On 3/7/19 7:16 PM, Raffaello Giulietti wrote:

> a couple of weeks ago I tried to refactor the code assuming the 
> existence of unsignedMultiplyHigh() (either as some future intrinsic or 
> as a Java method) and a wider representations of g with either 127 or 
> 128 bits:
>  g = g1 2^64 + g0
> 
> with either
>  2^63 <= g1 < 2^64 (128 bits)
> 
> or
>  2^62 <= g1 < 2^63 (127 bits)
> 
> Unfortunately, the resulting code of rop() isn't any simpler. That's 
> because then an intermediate sum can overflow the 64 bits of a long. As 
> a consequence, there's need for more elaborate logic to determine
> the carry and other slightly more complicated computations to assemble
> the final result. All in all, the resulting code has more operations and 
> looks longer.

Ah, I see. I agree, we still don't quite have the full set of operations
that we need in Java, in particular a nice way of doing an add with carry.

Thank you for the explanation.

> In the meantime I got rid of the last division. There's no division at 
> all in the whole algorithm.

Excellent. This is looking very good indeed.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-08 Thread Brian Burkhalter



> On Mar 7, 2019, at 6:01 PM, Andrew Haley  wrote:
> 
> On 3/7/19 10:10 AM, Brian Burkhalter wrote:
>> 
>>> On Mar 7, 2019, at 10:04 AM, Andrew Haley >> > wrote:
>>> 
>>> I still believe you'd be better off defining an unsigned multiplyHigh than
>>> all that messing about with shifts and masks in rop().
>> 
>> There is in fact an open issue for unsigned multiplyHigh:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8188044 
>> 
> 
> Do you want to do it? If not I will.

Your contribution would be most welcome.

Thanks,

Brian

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-07 Thread Raffaello Giulietti


On 2019-03-07 11:10, Brian Burkhalter wrote:


On Mar 7, 2019, at 10:04 AM, Andrew Haley > wrote:


On 3/6/19 7:31 PM,raffaello.giulie...@gmail.com 
wrote:

the latest version of the patch, replacing the one found at [1].
In the next days, my sponsor Brian Burkhalter will publish it as a 
webrev.


I still believe you'd be better off defining an unsigned multiplyHigh than
all that messing about with shifts and masks in rop().


There is in fact an open issue for unsigned multiplyHigh:

https://bugs.openjdk.java.net/browse/JDK-8188044

Brian




Hi Andrew,

a couple of weeks ago I tried to refactor the code assuming the 
existence of unsignedMultiplyHigh() (either as some future intrinsic or 
as a Java method) and a wider representations of g with either 127 or 
128 bits:

g = g1 2^64 + g0

with either
2^63 <= g1 < 2^64 (128 bits)

or
2^62 <= g1 < 2^63 (127 bits)

Unfortunately, the resulting code of rop() isn't any simpler. That's 
because then an intermediate sum can overflow the 64 bits of a long. As 
a consequence, there's need for more elaborate logic to determine

the carry and other slightly more complicated computations to assemble
the final result. All in all, the resulting code has more operations and 
looks longer.


I tried with four variants. In addition to the mults, which are needed 
anyway, the current code has 3 shifts, 3 adds, 2 bitwise logicals. As 
mentioned, I couldn't come up with a solution that would help reducing 
this count.


I would be glad to hear of better solutions and to write down a 
mathematical proof for the document.




In the meantime I got rid of the last division. There's no division at 
all in the whole algorithm.




Greetings
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-07 Thread Andrew Haley

On 3/7/19 10:10 AM, Brian Burkhalter wrote:
> 
>> On Mar 7, 2019, at 10:04 AM, Andrew Haley  wrote:
>>
>> I still believe you'd be better off defining an unsigned multiplyHigh than
>> all that messing about with shifts and masks in rop().
> 
> There is in fact an open issue for unsigned multiplyHigh:
> 
> https://bugs.openjdk.java.net/browse/JDK-8188044

Do you want to do it? If not I will.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-07 Thread Brian Burkhalter



> On Mar 7, 2019, at 10:04 AM, Andrew Haley  wrote:
> 
> On 3/6/19 7:31 PM, raffaello.giulie...@gmail.com 
>  wrote:
>> the latest version of the patch, replacing the one found at [1].
>> In the next days, my sponsor Brian Burkhalter will publish it as a webrev.
> 
> I still believe you'd be better off defining an unsigned multiplyHigh than
> all that messing about with shifts and masks in rop().

There is in fact an open issue for unsigned multiplyHigh:

https://bugs.openjdk.java.net/browse/JDK-8188044

Brian

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-07 Thread Andrew Haley

On 3/6/19 7:31 PM, raffaello.giulie...@gmail.com wrote:
> the latest version of the patch, replacing the one found at [1].
> In the next days, my sponsor Brian Burkhalter will publish it as a webrev.

I still believe you'd be better off defining an unsigned multiplyHigh than
all that messing about with shifts and masks in rop().

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2019-03-07 Thread Brian Burkhalter

The corresponding webrev may be viewed at [1].

Thanks,

Brian

[1] http://cr.openjdk.java.net/~bpb/4511638/webrev.01/ 


> On Mar 6, 2019, at 7:31 PM, raffaello.giulie...@gmail.com 
>  wrote:
> 
> the latest version of the patch, replacing the one found at [1].
> In the next days, my sponsor Brian Burkhalter will publish it as a webrev.

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-24 Thread Brian Burkhalter

Hello,

I think I neglected to post the link here before, but the Compatibility and 
Specification Request (CSR) corresponding to this issue is at [2]. It has a 
specdiff attached to it which would be helpful in comparing the changes to the 
javadoc specification of {Double,Float}.toString(). Any comments or reviews 
from interested parties would be appreciated.

Thanks,

Brian

[1] https://wiki.openjdk.java.net/display/csr
[2] https://bugs.openjdk.java.net/browse/JDK-8202555

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-04 Thread raffaello . giulietti

On 2018-10-04 15:28, Ulf Adams wrote:
> On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti
> mailto:raffaello.giulie...@gmail.com>>
> wrote:
> 
> Hi Ulf,
> 
> 
> On 2018-09-27 16:40, Ulf Adams wrote:
> > Hi Raffaello,
> >
> > I am the author of a recent publication on double to string
> conversion
> > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk
> > for several years, and am very much looking forward to
> improvements in
> > correctness and performance in this area.
> >
> 
> What a coincidence! I'm happy to hear that the quest for better
> floating->string conversions has not stopped. Tomorrow I'll download
> your paper and have a look at it during the weekend.
> 
> 
> Have you had a chance to take a look?
> 
> (I'm traveling for the next ~10 days and at a conference, so don't
> expect too much from me during that time.)
>  
> 

I had a cursory reading but couldn't dig deeper for now. If nothing
unexpected happens, I should be able to study your paper during the weekend.



> 
> 
> 
> > I have done some testing against my Java implementation of the Ryu
> > algorithm described in the linked paper. Interestingly, I've found
> a few
> > cases where they output different results. In particular:
> > 1.0E-323 is printed as 9.9E-324
> > 1.0E-322 is printed as 9.9E-323
> 
> If Ryu also produces 1 digit long outputs, then your results above are
> correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
> for example.
> Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
> the .0 part might confuse a human reader to believe that 2 digits are
> really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
> the double.
> 
> 
> The C version produces 1 digit long outputs, and I was trying to follow
> the Java spec in the Java version, but the code to do so isn't quite
> right. Unfortunately, I haven't yet been able to fix it.
>  
> 
> 
> 2 digits are for backward compatibility with the existing spec which
> requires at least one digit to the right of the decimal point.
> 
> 
> >
> > It's likely that there are more such cases - I only ran a sample of
> > double-precision numbers. Arguably, 9.9 is the correctly rounded
> 2-digit
> > output and Ryu is incorrect here. That's what you get when you have a
> > special case for Java without a correctness proof. :-(
> >
> > In terms of performance, this algorithm performs almost exactly
> the same
> > as my Java implementation of Ryu, although I'd like to point out
> that my
> > C implementation of Ryu is quite a bit faster (though note that it
> > generates different output, in particular, it only outputs a single
> > digit of precision in the above cases, rather than two), and I didn't
> > backport all the performance improvements from the Java version,
> yet. It
> > looks like this is not coincidence - as far as I can see so far, it's
> > algorithmically very similar, although it manages to avoid the
> loop I'm
> > using in Ryu to find the shortest representation.
> >
> > I have a few comments:
> >
> >       *      It rounds to {@code v} according to the usual
> > round-to-closest
> >       *     rule of IEEE 754 floating-point arithmetic.
> > - Since you're spelling out the rounding rules just below, this is
> > duplicated, and by itself, it's unclear since it doesn't specify the
> > specific sub-type (round half even).
> >
> 
> I tried to save as much of the original spec wording as possible.
> Perhaps it isn't worthwhile.
> 
> 
> 
> > - Naming: I'd strongly suggest to use variable names that relate to
> > what's stored, e.g., m for mantissa, e for exponent, etc.
> >
> 
> I currently prefer to be consistent with a forthcoming paper of mine on
> the subject. But thanks for the suggestion.
> 
> 
> May I suggest that the paper also uses names that relate to what they're
> referring to? :-) Not that I've managed to do that very well myself...
> 

I tend to use short "mathematical" names that still evoke their
semantics. Will see if I manage to be consistent.



> 
> 
> 
> > - What's not clear to me is how the algorithm determines how many
> digits
> > to print.
> >
> 
> You'll have to wait for the paper.
> 
> 
> Looking forward to it. I tried to reverse engineer the code, but it's
> far from obvious.
>  

I don't think it is reversible, even for knowledgeable people like you :-(

I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...



> 
> 
> 
> > - Also, it might b

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-04 Thread raffaello . giulietti

On 2018-10-03 11:12, Andrew Dinn wrote:
> On 02/10/18 19:17, Martin Buchholz wrote:
>> Raffaello and Ulf should talk.
>> It seems like Ulf's ryu project is trying to solve the same problem.
>> ryu is also still being worked on, but there is already a published paper
>> and ryu is being adopted by various core libraries.
>> https://github.com/ulfjack/ryu
>> https://dl.acm.org/citation.cfm?id=3192369
>> Ulf, if you haven't already signed an Oracle Contributor Agreement for
>> openjdk, you should do so.
>> (Who knew printing floating point numbers could be so hard?)
> 
> Well, Guy Steele and Jon L White, for starters (quite literally) [1]
> 

In my eyes, the clarity of that paper is almost mythical.
I wish I could follow their teaching myself.





> [1] https://dl.acm.org/citation.cfm?id=93559
> 
> regards,
> 
> 
> Andrew Dinn
> ---
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-04 Thread raffaello . giulietti

Hi,

On 2018-10-02 20:17, Martin Buchholz wrote:
> Raffaello and Ulf should talk.

I work on this project in my spare time, which is limited and not
entirely devoted to programming for the OpenJDK, of course ;-)

Starting this weekend, I'll try to recollect my notes and organize them
in paper or tutorial form.

I cannot speak for Ulf, of course, but my understanding is that his
porting to Java is a spare-time project, too.

> It seems like Ulf's ryu project is trying to solve the same problem.
> ryu is also still being worked on, but there is already a published
> paper and ryu is being adopted by various core libraries.
> https://github.com/ulfjack/ryu
> https://dl.acm.org/citation.cfm?id=3192369
> Ulf, if you haven't already signed an Oracle Contributor Agreement for
> openjdk, you should do so.
> (Who knew printing floating point numbers could be so hard?)
> 

Well, printing floating point numbers is simple if there are no
performance concerns. Abstractly, it is only a matter of calculations
with rational numbers. It becomes harder when efficiency becomes part of
the goal.

The quest started about 30 years ago with a seminal paper of Steele &
White and seemed settled with an implementation by Gay around 1991.
Since then, other interesting approaches have seen light with the goal
of even better performance.

Greetings
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-04 Thread Ulf Adams

On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti <
raffaello.giulie...@gmail.com> wrote:

> Hi Ulf,
>
>
> On 2018-09-27 16:40, Ulf Adams wrote:
> > Hi Raffaello,
> >
> > I am the author of a recent publication on double to string conversion
> > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk
> > for several years, and am very much looking forward to improvements in
> > correctness and performance in this area.
> >
>
> What a coincidence! I'm happy to hear that the quest for better
> floating->string conversions has not stopped. Tomorrow I'll download
> your paper and have a look at it during the weekend.
>

Have you had a chance to take a look?

(I'm traveling for the next ~10 days and at a conference, so don't expect
too much from me during that time.)


>
>
>
> > I have done some testing against my Java implementation of the Ryu
> > algorithm described in the linked paper. Interestingly, I've found a few
> > cases where they output different results. In particular:
> > 1.0E-323 is printed as 9.9E-324
> > 1.0E-322 is printed as 9.9E-323
>
> If Ryu also produces 1 digit long outputs, then your results above are
> correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
> for example.
> Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
> the .0 part might confuse a human reader to believe that 2 digits are
> really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
> the double.
>

The C version produces 1 digit long outputs, and I was trying to follow the
Java spec in the Java version, but the code to do so isn't quite right.
Unfortunately, I haven't yet been able to fix it.


>
> 2 digits are for backward compatibility with the existing spec which
> requires at least one digit to the right of the decimal point.
>
>
> >
> > It's likely that there are more such cases - I only ran a sample of
> > double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
> > output and Ryu is incorrect here. That's what you get when you have a
> > special case for Java without a correctness proof. :-(
> >
> > In terms of performance, this algorithm performs almost exactly the same
> > as my Java implementation of Ryu, although I'd like to point out that my
> > C implementation of Ryu is quite a bit faster (though note that it
> > generates different output, in particular, it only outputs a single
> > digit of precision in the above cases, rather than two), and I didn't
> > backport all the performance improvements from the Java version, yet. It
> > looks like this is not coincidence - as far as I can see so far, it's
> > algorithmically very similar, although it manages to avoid the loop I'm
> > using in Ryu to find the shortest representation.
> >
> > I have a few comments:
> >
> >   *  It rounds to {@code v} according to the usual
> > round-to-closest
> >   * rule of IEEE 754 floating-point arithmetic.
> > - Since you're spelling out the rounding rules just below, this is
> > duplicated, and by itself, it's unclear since it doesn't specify the
> > specific sub-type (round half even).
> >
>
> I tried to save as much of the original spec wording as possible.
> Perhaps it isn't worthwhile.
>
>
>
> > - Naming: I'd strongly suggest to use variable names that relate to
> > what's stored, e.g., m for mantissa, e for exponent, etc.
> >
>
> I currently prefer to be consistent with a forthcoming paper of mine on
> the subject. But thanks for the suggestion.
>

May I suggest that the paper also uses names that relate to what they're
referring to? :-) Not that I've managed to do that very well myself...


>
>
> > - What's not clear to me is how the algorithm determines how many digits
> > to print.
> >
>
> You'll have to wait for the paper.
>

Looking forward to it. I tried to reverse engineer the code, but it's far
from obvious.


>
>
> > - Also, it might be nicer to move the long multiplications to a helper
> > method - at least from a short look, it looks like the computations of
> > vn, vnl, and vnr are identical.
> >
>
> I tried several variants: the current one seems to be the faster with
> the current optimizations of C2. Some day I'll also try with Graal.
>

Sure, but moving it to a method shouldn't affect performance (except if you
need to return multiple values), and, right now, it looks like identical
code.


>
>
>
> > - I looked through the spec, and it looks like all cases are
> > well-defined. Yay!
> >
> > I will need some more time to do a more thorough review of the code and
> > more testing for differences. Unfortunately, I'm also traveling the next
> > two weeks, so this might take a bit of time.
> >
>
> I thank you in advance for your willingness to review the code but my
> understanding is that only the officially appointed reviewers can
> approve OpenJDK contributions, which is of course a good policy.
> Besides, as two Andrews engineers from RedHat correctly observe,
> understanding the rationale of the code wit

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-03 Thread Andrew Dinn

On 02/10/18 19:17, Martin Buchholz wrote:
> Raffaello and Ulf should talk.
> It seems like Ulf's ryu project is trying to solve the same problem.
> ryu is also still being worked on, but there is already a published paper
> and ryu is being adopted by various core libraries.
> https://github.com/ulfjack/ryu
> https://dl.acm.org/citation.cfm?id=3192369
> Ulf, if you haven't already signed an Oracle Contributor Agreement for
> openjdk, you should do so.
> (Who knew printing floating point numbers could be so hard?)

Well, Guy Steele and Jon L White, for starters (quite literally) [1]

[1] https://dl.acm.org/citation.cfm?id=93559

regards,


Andrew Dinn
---
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-10-02 Thread Martin Buchholz

Raffaello and Ulf should talk.
It seems like Ulf's ryu project is trying to solve the same problem.
ryu is also still being worked on, but there is already a published paper
and ryu is being adopted by various core libraries.
https://github.com/ulfjack/ryu
https://dl.acm.org/citation.cfm?id=3192369
Ulf, if you haven't already signed an Oracle Contributor Agreement for
openjdk, you should do so.
(Who knew printing floating point numbers could be so hard?)

On Wed, Sep 26, 2018 at 10:39 AM,  wrote:

> Hi,
>
> the current specification of Double.toString(double) is not precise
> enough to uniquely determine the resulting String. It leaves leeway for
> interpretation, thus potentially allowing implementations to return
> slightly different results, even from one release to the next [1].
>
> In addition, as the bug description points out, the current
> implementation does not even always meet its own specification, leading
> to results that are sometimes longer than needed.
>
> To address and overcome both issues, a new specification in the form of
> Javadoc associated to Double.toString(double) and Float.toString(float)
> is proposed, along with a clean-room re-implementation.
>
> The intent of the specification is to explicitly separate the
> determination of the unique decimal number to represent the floating
> point argument (double/float) from its formatting as a String.
>
> The clean-room implementation enjoys the following properties:
> * No objects at all are instantiated except for the resulting String and
> its backing storage.
> * Only arithmetic on longs/ints is performed.
> * The decimal selection algorithm performs at most one division on longs
> per conversion and is loop-free.
> * The digits extraction algorithm is totally division-free.
> * None of the anomalies described in the bug report have been observed.
> * The new Double.toString(double) speedup with respect to the current
> implementation, according to jmh, is better than 13x when measured over
> bitwise uniformly distributed random samples.
> * In the case of Float.toString(float), *all* results of the 2^32 float
> values have been extensively tested to meet the specification.
> * In the case of Double.toString(double), as it is infeasible to test
> all results, several billions of random doubles have been extensively
> tested.
>
> Since there's a change in the specification, according to my sponsor and
> the formalities it has to undergo a CSR.
>
>
>
> The submitted code contains both the changes to the current
> implementation and extensive jtreg tests.
>
>
>
> While I've struggled to keep the code within the 80 chars/line limit,
> mercurial still generates longer lines. Thus, to avoid possible problems
> with the email systems, the code is submitted both inline and as an
> attachment. Hope at least one copy makes its way without errors.
>
>
> Greetings
> Raffaello
>
>
> [1] https://bugs.openjdk.java.net/browse/JDK-4511638
>
>
>
> # HG changeset patch
> # User lello
> # Date 1537948169 -7200
> #  Wed Sep 26 09:49:29 2018 +0200
> # Node ID 6bd9d2c45440c578b93d2f07e5eaea73928be4d5
> # Parent  5f931e3e7a63b550d832d2624db32033b875c720
> Patches to fix JDK-4511638
>
> diff --git a/src/java.base/share/classes/java/lang/Double.java
> b/src/java.base/share/classes/java/lang/Double.java
> --- a/src/java.base/share/classes/java/lang/Double.java
> +++ b/src/java.base/share/classes/java/lang/Double.java
> @@ -25,6 +25,7 @@
>
>  package java.lang;
>
> +import jdk.internal.math.DoubleToDecimal;
>  import jdk.internal.math.FloatingDecimal;
>  import jdk.internal.math.DoubleConsts;
>  import jdk.internal.HotSpotIntrinsicCandidate;
> @@ -139,69 +140,108 @@
>  public static final Class   TYPE = (Class)
> Class.getPrimitiveClass("double");
>
>  /**
> - * Returns a string representation of the {@code double}
> - * argument. All characters mentioned below are ASCII characters.
> - * 
> - * If the argument is NaN, the result is the string
> - * "{@code NaN}".
> - * Otherwise, the result is a string that represents the sign and
> - * magnitude (absolute value) of the argument. If the sign is
> negative,
> - * the first character of the result is '{@code -}'
> - * ({@code '\u005Cu002D'}); if the sign is positive, no sign character
> - * appears in the result. As for the magnitude m:
> + * Returns a string rendering of the {@code double} argument.
> + *
> + * The characters of the result are all drawn from the ASCII set.
>   * 
> - * If m is infinity, it is represented by the characters
> - * {@code "Infinity"}; thus, positive infinity produces the result
> - * {@code "Infinity"} and negative infinity produces the result
> - * {@code "-Infinity"}.
> + *  Any NaN, whether quiet or signaling, is rendered
> symbolically
> + * as {@code "NaN"}, regardless of the sign bit.
> + *  The infinities +∞ and -∞ are rendered as
> + * {@code "Infinity"}

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-28 Thread Dmitry Nadezhin

I would like to tell about some steps towards formal check of Raffaello's
human proof.
I work on formal verification of some Oracle projects but they are
unrelated to JDK.
In April Brian Burkhalter requested me to review the Raffaello's paper.
His paper is smart and clear but nevertheless I was afraid to miss some
possible error in his it.
So I decided to check his proofs formally. I am familiar with a proof
checker ACL2 [1]. So I used it.
ACL2 is an abbreviation of "Applicative Common Lisp". It is both a language
( a subset of Common Lisp ) and
proof assistant for reasoning about functions in this language. It is
open-source and it is developed in UT at Austin.

This work in progress can be found in this GitHub repository [2].
ACL2 definitions and proofs followed human definitions and proofs as
possible.
Materials of N-th section of the paper are in a file acl2/sectionN.lisp .
Proofs from sections 2-10 were checked by ACL2 completely.
Also cases fullCaseXS and fullCaseXM from the sections 2-11 were checked by
ACL2.
Sorry, I didn't finish formal proof of the entire paper in anticipation of
a new algorithm.

Sometimes in July a paper by Michel Hack [3] prompted us that it is
possible to avoid multi-precision arithmetic.
File acl2/cont-fractions.lisp formalizes some of Michel's ideas in ACL2.
It considers a linear mapping alpha*d from range of naturals 0 <= d <=
max-d to rationals.
It defines an algorithms which for given alpha and "max-d" returns bound
"lo" and "hi"
how non-integer results of this mapping are frar from integer grid.
If approximate computations shows that alpha*d is very close to some
integer m
m - hi < alpha * d < m + lo
then alpha * d is equal to m exactly.
Theorems frac-alpha-d-nonzero-bound-dp-correct and
frac-alpha-d-nonzero-bound-sp-correct explicitly
state bounds lo and hi for float and double Java formats.
This fact is proved using continous fractions.

Raffaello derived from this theorem that a table of powers of 5 with
126-bit precision is enough
to correctly implement DoubleToString conversion.
He designed a new implementation of DoubleToString and FloatToString and
presented it in this core-libs-dev thread.

It was exciting to know that Ulf Adama also developed fast conversion
algorithm.
New Raffaello's and Ulf's algorithms looks similar.
It would be interesting to compare them, to match similarities and to think
on differences.
For example the problem of finding "lo" and "hi" is formulated as computing
minimum and maximum of a modular product in Ilf's paper.
It would be interesting to check if our lo-hi bounds are scaled versions of
Ulf's minimum and maximum.
Ideally it would be interesting to formally verify both of them though I am
not sure if I find time for this.

If Raffaello prepares a new paper with proof of his new algorithm then I
will try to check these new proofs by ACL2 again.
I hope that paper with its formal check will increase confidence.

---

Andrew Dinn said about other problem: "how details of specific coding
operations are not always clearly
identifiable from an abstract mathematical treatment.". Formal verification
is applicable for this also.
It can be done either at Java language level or at classfile level.

Verification at classfile level requires formal model of JVM.
There are at least three models of JVM in ACL2:
i) An M5 model in the distribution of ACL2 books: [4]
ii) Defensive JVM model by CLI: [5]
iii) M6 model by Hamling Liu: [6]

Verification at Java language level can be done using other tools,
especially the KeY tool: [7].
TimSort bug [8][ has been found during to attempt to verify this
alogorithm's implementation by KeY.

Unfortunately I am not sure that I find time to verify implementation
(classfile or Java-file) of the new Raffaello algorithm.

---

[1] http://www.cs.utexas.edu/~moore/acl2/

[2] https://github.com/nadezhin/verify-todec

[3]
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.8351&rep=rep1&type=pdf#page=114

[4] https://github.com/acl2/acl2/tree/master/books/models/jvm/m5

[5] http://www.computationallogic.com/software/djvm/index.html

[6] https://github.com/haliu/M6

[7] https://www.key-project.org/

[8]
http://www.envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

On Fri, Sep 28, 2018 at 1:11 AM Brian Burkhalter <
brian.burkhal...@oracle.com> wrote:

> On Sep 27, 2018, at 9:00 AM, Raffaello Giulietti <
> raffaello.giulie...@gmail.com> wrote:
>
> > On 2018-09-27 17:53, Andrew Dinn wrote:
> >> On 27/09/18 16:37, Raffaello Giulietti wrote:
> >>   . . .
> >>> I thank you in advance for your willingness to review the code but my
> >>> understanding is that only the officially appointed reviewers can
> >>> approve OpenJDK contributions, which is of course a good policy.
> >>> Besides, as two Andrews engineers from RedHat correctly observe,
> >>> understanding the rationale of the code without the planned
> accompanying
> >>> paper is hard.
> >> Oh no, let me stop you right

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Brian Burkhalter

On Sep 27, 2018, at 9:00 AM, Raffaello Giulietti 
 wrote:

> On 2018-09-27 17:53, Andrew Dinn wrote:
>> On 27/09/18 16:37, Raffaello Giulietti wrote:
>>   . . .
>>> I thank you in advance for your willingness to review the code but my
>>> understanding is that only the officially appointed reviewers can
>>> approve OpenJDK contributions, which is of course a good policy.
>>> Besides, as two Andrews engineers from RedHat correctly observe,
>>> understanding the rationale of the code without the planned accompanying
>>> paper is hard.
>> Oh no, let me stop you right there! Anyone competent can offer a review
>> (well incompetent people can too but let's assume they get ignored :-).
>> Indeed, an expert critique is always very welcome and reviewers normally
>> get credited even if they have no official status. The official
>> reviewers are needed for a final say so. No one sensible is going to
>> reject clear and clearly justified advice.
> 
> An excellent policy, indeed!

I was going to reply with essentially the same comments. While an imprimatur 
from at least one sanctioned OpenJDK reviewer is required eventually, comments 
and review from any competent parties are most welcome. This is especially the 
case for a relatively complicated topic such as the one of this thread.

 I'm not a contributor to the Jdk, and this isn't my full-time job. I
 was lurking here because I was going to send a patch for the double to
 string conversion code myself (based on Ryu).

>>> 
>>> All my efforts on this projects are done in my unpaid spare time, too.
>> Which is very much appreciated, thank you.
> 
> Thank all of you for your interest in this issue.

I would like to second that. It is great to see so many comments on this topic 
and to have knowledgable people paying attention.

Brian

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Raffaello Giulietti


On 2018-09-27 17:53, Andrew Dinn wrote:

On 27/09/18 16:37, Raffaello Giulietti wrote:
   . . .

I thank you in advance for your willingness to review the code but my
understanding is that only the officially appointed reviewers can
approve OpenJDK contributions, which is of course a good policy.
Besides, as two Andrews engineers from RedHat correctly observe,
understanding the rationale of the code without the planned accompanying
paper is hard.


Oh no, let me stop you right there! Anyone competent can offer a review
(well incompetent people can too but let's assume they get ignored :-).
Indeed, an expert critique is always very welcome and reviewers normally
get credited even if they have no official status. The official
reviewers are needed for a final say so. No one sensible is going to
reject clear and clearly justified advice.



An excellent policy, indeed!




I'm not a contributor to the Jdk, and this isn't my full-time job. I
was lurking here because I was going to send a patch for the double to
string conversion code myself (based on Ryu).



All my efforts on this projects are done in my unpaid spare time, too.

Which is very much appreciated, thank you.



Thank all of you for your interest in this issue.


Greetings
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Andrew Dinn

On 27/09/18 16:37, Raffaello Giulietti wrote:
  . . .
> I thank you in advance for your willingness to review the code but my
> understanding is that only the officially appointed reviewers can
> approve OpenJDK contributions, which is of course a good policy.
> Besides, as two Andrews engineers from RedHat correctly observe,
> understanding the rationale of the code without the planned accompanying
> paper is hard.

Oh no, let me stop you right there! Anyone competent can offer a review
(well incompetent people can too but let's assume they get ignored :-).
Indeed, an expert critique is always very welcome and reviewers normally
get credited even if they have no official status. The official
reviewers are needed for a final say so. No one sensible is going to
reject clear and clearly justified advice.

>> I'm not a contributor to the Jdk, and this isn't my full-time job. I
>> was lurking here because I was going to send a patch for the double to
>> string conversion code myself (based on Ryu).
>>
> 
> All my efforts on this projects are done in my unpaid spare time, too.
Which is very much appreciated, thank you.

regards,


Andrew Dinn
---
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Raffaello Giulietti

Hi Ulf,

On 2018-09-27 16:40, Ulf Adams wrote:

Hi Raffaello,

I am the author of a recent publication on double to string conversion 
[1] - the Ryu algorithm. I've been aware of the problems with the Jdk 
for several years, and am very much looking forward to improvements in 
correctness and performance in this area.

What a coincidence! I'm happy to hear that the quest for better 
floating->string conversions has not stopped. Tomorrow I'll download 
your paper and have a look at it during the weekend.

I have done some testing against my Java implementation of the Ryu 
algorithm described in the linked paper. Interestingly, I've found a few 
cases where they output different results. In particular:

1.0E-323 is printed as 9.9E-324
1.0E-322 is printed as 9.9E-323

If Ryu also produces 1 digit long outputs, then your results above are 
correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324, 
for example.
Even better, it should output 5E-324, 1E-323 and 1E-322 because adding 
the .0 part might confuse a human reader to believe that 2 digits are 
really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to 
the double.

2 digits are for backward compatibility with the existing spec which 
requires at least one digit to the right of the decimal point.

It's likely that there are more such cases - I only ran a sample of 
double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit 
output and Ryu is incorrect here. That's what you get when you have a 
special case for Java without a correctness proof. :-(

In terms of performance, this algorithm performs almost exactly the same 
as my Java implementation of Ryu, although I'd like to point out that my 
C implementation of Ryu is quite a bit faster (though note that it 
generates different output, in particular, it only outputs a single 
digit of precision in the above cases, rather than two), and I didn't 
backport all the performance improvements from the Java version, yet. It 
looks like this is not coincidence - as far as I can see so far, it's 
algorithmically very similar, although it manages to avoid the loop I'm 
using in Ryu to find the shortest representation.

I have a few comments:

      *      It rounds to {@code v} according to the usual 
round-to-closest

      *     rule of IEEE 754 floating-point arithmetic.
- Since you're spelling out the rounding rules just below, this is 
duplicated, and by itself, it's unclear since it doesn't specify the 
specific sub-type (round half even).

I tried to save as much of the original spec wording as possible. 
Perhaps it isn't worthwhile.

- Naming: I'd strongly suggest to use variable names that relate to 
what's stored, e.g., m for mantissa, e for exponent, etc.

I currently prefer to be consistent with a forthcoming paper of mine on 
the subject. But thanks for the suggestion.

- What's not clear to me is how the algorithm determines how many digits 
to print.

You'll have to wait for the paper.

- Also, it might be nicer to move the long multiplications to a helper 
method - at least from a short look, it looks like the computations of 
vn, vnl, and vnr are identical.

I tried several variants: the current one seems to be the faster with 
the current optimizations of C2. Some day I'll also try with Graal.

- I looked through the spec, and it looks like all cases are 
well-defined. Yay!

I will need some more time to do a more thorough review of the code and 
more testing for differences. Unfortunately, I'm also traveling the next 
two weeks, so this might take a bit of time.

I thank you in advance for your willingness to review the code but my 
understanding is that only the officially appointed reviewers can 
approve OpenJDK contributions, which is of course a good policy. 
Besides, as two Andrews engineers from RedHat correctly observe, 
understanding the rationale of the code without the planned accompanying 
paper is hard.

I'm not a contributor to the Jdk, and this isn't my full-time job. I was 
lurking here because I was going to send a patch for the double to 
string conversion code myself (based on Ryu).

All my efforts on this projects are done in my unpaid spare time, too.

Thanks,

-- Ulf

> [1] https://dl.acm.org/citation.cfm?id=3192369
> [2] https://github.com/google/double-conversion
> [3] https://en.wikipedia.org/wiki/Rounding
>

Thank you
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Raffaello Giulietti


On 2018-09-27 16:55, Andrew Dinn wrote:

Hi Raffaello,

On 27/09/18 15:20, raffaello.giulie...@gmail.com wrote:

Hi Andrew,
On the other side, in April this year I submitted another quite fast and
supposedly correct algorithm on this mailing list and I referred to an
accompanying paper by myself that gives full explanations on that
variant. Except for a couple of persons in private, nobody cared to send
me any observation or comment, neither on the code nor on the paper.


I'm sorry I didn't see that post. I would have been very happy to review
the paper as well as the code. Unfortunately, none of us have time to
catch everything and we certainly don't always see every contribution.



I understand that most people are busy, so I was not really surprised 
not to get feedback on a rather tiny issue in the overall huge codebase.





The present algorithm is superior. I have the theory in notes, in my
head, on napkins, on paper sheets all over my desk and floors. But
rather than spending time on the paper itself, like I did almost in vain
for the April variant, I preferred investing it in coding, for several
reasons:
* Only code executes, not a paper.
* Only code gives results that can be compared against.
* Only code can give indications on performance enhancements.
* Only code is interesting to be submitted to the OpenJDK.
* Having a paper without having tried the ideas in code is half the fun
and half as useful.


I think this only presents one side of the argument here. For code of
anything but the most basic complexity. Assuming that by paper you mean
anything that goes beyond executable statements, including comments,
list discussions and reviews like this one, design notes and documents,
specifications et al



My point is about priorities and the past experience with almost zero 
feedback on the former implementation, not that a paper isn't due or 
useless. On the contrary, I'm the first that would not trust my own code 
without an explanation.





Only a paper tells you what an executing piece of code is actually doing

Only paper tells you what the results produced by that code need to be
compared against to determine correctness, accuracy, etc

Only paper can tell you whether achieved performance is worse than or
better than can be expected (or where in between it lies)

Only paper can explain what OpenJDK is supposed to be doing, why and how
the specific elements of the implementation achieve that what/why i.e.
withouth that audit trail OpenJDK will be dead in the water in no time
at all

Having code without the paper to tell you what ideas it implements is no
fun and n use at all.



Why "no use at all"? That's unfair.

It might not be fun currently, but it is quite useful anyway in its 
present form, having produced, as of today, some 400 billions correct 
results in 1/13 of the time.





I think that last one exemplifies a key asymmetry that always needs to
be borne in mind. If your last contribution did not get any signifcant
review on this or some other list then I think we really messed up.



Except where noted above, I agree with these observations. Currently, 
however, I'm in the lucky position to have both the explanation and the 
code, so they don't apply for myself in this particular case.


I'm thinking on how to present the ideas in some sketchy form, just to 
share the fun and mathematically convince the inclined ones, before the 
paper is ready.




Greetings
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Andrew Dinn

Hi Raffaello,

On 27/09/18 15:20, raffaello.giulie...@gmail.com wrote:
> Hi Andrew,
> On the other side, in April this year I submitted another quite fast and
> supposedly correct algorithm on this mailing list and I referred to an
> accompanying paper by myself that gives full explanations on that
> variant. Except for a couple of persons in private, nobody cared to send
> me any observation or comment, neither on the code nor on the paper.

I'm sorry I didn't see that post. I would have been very happy to review
the paper as well as the code. Unfortunately, none of us have time to
catch everything and we certainly don't always see every contribution.

> The present algorithm is superior. I have the theory in notes, in my
> head, on napkins, on paper sheets all over my desk and floors. But
> rather than spending time on the paper itself, like I did almost in vain
> for the April variant, I preferred investing it in coding, for several
> reasons:
> * Only code executes, not a paper.
> * Only code gives results that can be compared against.
> * Only code can give indications on performance enhancements.
> * Only code is interesting to be submitted to the OpenJDK.
> * Having a paper without having tried the ideas in code is half the fun
> and half as useful.

I think this only presents one side of the argument here. For code of
anything but the most basic complexity. Assuming that by paper you mean
anything that goes beyond executable statements, including comments,
list discussions and reviews like this one, design notes and documents,
specifications et al

Only a paper tells you what an executing piece of code is actually doing

Only paper tells you what the results produced by that code need to be
compared against to determine correctness, accuracy, etc

Only paper can tell you whether achieved performance is worse than or
better than can be expected (or where in between it lies)

Only paper can explain what OpenJDK is supposed to be doing, why and how
the specific elements of the implementation achieve that what/why i.e.
withouth that audit trail OpenJDK will be dead in the water in no time
at all

Having code without the paper to tell you what ideas it implements is no
fun and n use at all.

I think that last one exemplifies a key asymmetry that always needs to
be borne in mind. If your last contribution did not get any signifcant
review on this or some other list then I think we really messed up.

regards,

Andrew Dinn
---
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Ulf Adams

Hi Raffaello,

I am the author of a recent publication on double to string conversion [1]
- the Ryu algorithm. I've been aware of the problems with the Jdk for
several years, and am very much looking forward to improvements in
correctness and performance in this area.

I have done some testing against my Java implementation of the Ryu
algorithm described in the linked paper. Interestingly, I've found a few
cases where they output different results. In particular:
1.0E-323 is printed as 9.9E-324
1.0E-322 is printed as 9.9E-323

It's likely that there are more such cases - I only ran a sample of
double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
output and Ryu is incorrect here. That's what you get when you have a
special case for Java without a correctness proof. :-(

In terms of performance, this algorithm performs almost exactly the same as
my Java implementation of Ryu, although I'd like to point out that my C
implementation of Ryu is quite a bit faster (though note that it generates
different output, in particular, it only outputs a single digit of
precision in the above cases, rather than two), and I didn't backport all
the performance improvements from the Java version, yet. It looks like this
is not coincidence - as far as I can see so far, it's algorithmically very
similar, although it manages to avoid the loop I'm using in Ryu to find the
shortest representation.

I have a few comments:

 *  It rounds to {@code v} according to the usual
round-to-closest
 * rule of IEEE 754 floating-point arithmetic.
- Since you're spelling out the rounding rules just below, this is
duplicated, and by itself, it's unclear since it doesn't specify the
specific sub-type (round half even).

- Naming: I'd strongly suggest to use variable names that relate to what's
stored, e.g., m for mantissa, e for exponent, etc.

- What's not clear to me is how the algorithm determines how many digits to
print.

- Also, it might be nicer to move the long multiplications to a helper
method - at least from a short look, it looks like the computations of vn,
vnl, and vnr are identical.

- I looked through the spec, and it looks like all cases are well-defined.
Yay!

I will need some more time to do a more thorough review of the code and
more testing for differences. Unfortunately, I'm also traveling the next
two weeks, so this might take a bit of time.

I'm not a contributor to the Jdk, and this isn't my full-time job. I was
lurking here because I was going to send a patch for the double to string
conversion code myself (based on Ryu).

Thanks,

-- Ulf

[1] https://dl.acm.org/citation.cfm?id=3192369
[2] https://github.com/google/double-conversion
[3] https://en.wikipedia.org/wiki/Rounding

On Thu, Sep 27, 2018 at 11:03 AM Andrew Haley  wrote:

> On 09/26/2018 06:39 PM, raffaello.giulie...@gmail.com wrote:
>
> > The submitted code contains both the changes to the current
> > implementation and extensive jtreg tests.
> >
> > While I've struggled to keep the code within the 80 chars/line limit,
> > mercurial still generates longer lines. Thus, to avoid possible problems
> > with the email systems, the code is submitted both inline and as an
> > attachment. Hope at least one copy makes its way without errors.
>
> Overall, the commenting is much too light. There are many places
> where I think I know what you're doing but you don't explain it.
>
> Here, for example:
>
> > +
> > +// pow5 = pow51*2^63 + pow50
> > +long pow51 = ceilPow5dHigh(-k);
> > +long pow50 = ceilPow5dLow(-k);
> > +
> > +// p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb
> > +long x0 = pow50 * cb;
> > +long x1 = multiplyHigh(pow50, cb);
> > +long y0 = pow51 * cb;
> > +long y1 = multiplyHigh(pow51, cb);
> > +long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63);
> > +long p0 = x0 & MASK_63;
> > +long p1 = z & MASK_63;
> > +long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63);
> > +long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha;
> > +if ((p1 & mask) != 0 || p0 >= threshold) {
> > +vn |= 1;
> > +}
>
> ... etc. I think I can figure out what you're doing, but you could
> explain it.
>
> If you write the comments now while the code is still fresh in your
> mind it'll be easy.
>
> > +private static final long[] ceilPow5d = {
> > +/* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL,
> > +/* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL,
> > +/* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L,
> > +/* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L,
>
> What exactly is this table anyway?  How was it generated?  Please say.
>
> There are many more places in the code. What you've done is nice, but
> it could be exemplary.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. 
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread raffaello . giulietti

Hi Andrew,

On 2018-09-27 15:28, Andrew Haley wrote:
> On 09/27/2018 01:23 PM, Raffaello Giulietti wrote:
> 
>> In principle I agree with you.
>>
>> However, in this case the maths underlying the algorithm to select the 
>> decimal are too involved to explain in comment form. I'm in the course 
>> of preparing a paper that explains the idea and the details. Then it 
>> should be easier to make sense out of the code.
> 
> In which case, the code must contain pointers to the relevant parts of
> the paper.
> 

Sure.

>> Since to my knowledge the algorithm is novel, it will require some time 
>> for me to translate in paper form.
> 
> Sure, but how do you expect anyone to review your code without the
> necessary explanation? Do you think that people should approve your
> patch without that paper?
> 

I've no idea on how OpenJDK code is reviewed, so I cannot honestly
answer your questions. What I can tell is that if I were a reviewer, I
wouldn't trust rather involved code like mine without an explanation,
exactly as you seem inclined to do.

On the other side, in April this year I submitted another quite fast and
supposedly correct algorithm on this mailing list and I referred to an
accompanying paper by myself that gives full explanations on that
variant. Except for a couple of persons in private, nobody cared to send
me any observation or comment, neither on the code nor on the paper.

The present algorithm is superior. I have the theory in notes, in my
head, on napkins, on paper sheets all over my desk and floors. But
rather than spending time on the paper itself, like I did almost in vain
for the April variant, I preferred investing it in coding, for several
reasons:
* Only code executes, not a paper.
* Only code gives results that can be compared against.
* Only code can give indications on performance enhancements.
* Only code is interesting to be submitted to the OpenJDK.
* Having a paper without having tried the ideas in code is half the fun
and half as useful.

It's only a matter of priorities and of limited time, my spare time.
That said, now that I have some feedback and that the code has been
exercised at least some 400 billions times on doubles without errors and
on all 2^32 floats without errors, I now know that there is some
interest and that it is worth pushing energies in the paper.

>> There are some succinct comments here that explain the expected
>> results.  I'm not the kind of programmer that comments every line
>> since here the mechanics is simple enough to follow in Java
>> directly. A good explanation would either be mathematical, which
>> requires better typography than US-ASCII, or some explanatory
>> drawings.
> 
> Sure, these can be offline somewhere. The difficulty is always the
> mapping from the mathematics onto the Java code, and this is what must
> be explained in the comments.
> 

... or in the paper.

Greetings
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread raffaello . giulietti

Hi Andrew,


On 2018-09-27 14:57, Andrew Dinn wrote:
> On 27/09/18 13:23, Raffaello Giulietti wrote:
>   . . .
>> The comments of the accessor methods that make use of this private table
>> implicitly explain its semantics as well. I will add a comment to the
>> field that refers to the comments in the methods.
>>
>>
>> How the table was generated and thoroughly verified is currently not
>> part of my contribution to OpenJDK, not because it is something secret
>> or complex but because I think it is irrelevant here.
> 
> I suspect it would be far from irrelevant to someone faced with
> debugging and fixing any future problem found in the code. Of course,
> you may well still be around to fix it but we don't know that for sure.
> 
>> Besides, where would the generator/verifier code be placed in the
>> codebase? It would be completely detached from, and unrelated to,
>> everything else. But maybe there is already some mechanism in place for
>> similar "bootstrapping" code in the OpenJDK. Then I would like to know
>> to consider adding the generator there.
> 
> We don't absolutely need generator/verifier code (although the latter
> might be helpful). The problem we face is what to do if a bug is found.
> Could it simply be a table entry that is wrong or is there a detail of
> the algorithm that has been mis-encoded? How would we tell?
> > If we have an explanation of /how/ the provided values were derived and
> /why/ so derived then that would allow any such bugs to be resolved much
> more easily. Omitting that background risks turning this into a one-shot
> code drop rather than a very welcome contribution.
> 

As mentioned in my previous post, the explanation of *what* the table
represents is implicitly and rigorously explained in the comments of the
methods that make use of it. That should make it easy to verify that the
table is sound.

As of the *how*, the generator is based on BigInteger arithmetic and
both generates and verifies the results. What I can do is to add the
code in the MathUtils class itself. It would never be run except for
diagnosis. But then one has to trust BigInteger.

The *why* will be part of the paper.



>>> There are many more places in the code. What you've done is nice, but
>>> it could be exemplary.
>>>
>>
>> As said, this will be part of a separate paper. Hope this helps for the
>> moment.
> Sure, a reference to a published doc would be great - assuming it makes
> it clear how the quite code is derived from the maths/algorithm it
> details. I'd really much prefer to have that doc before accepting the
> code just in order to be sure that there is no gap between theory and
> execution that some judicious commenting might close. Having recently
> reviewed some math code for log, trig and power functions I am well
> aware how details of specific coding operations are not always clearly
> identifiable from an abstract mathematical treatment. Even when they can
> be derived a few comments in the code often help avoid any resort to a
> pencil and thick pad of graph paper.
> 
> When you do post a link to the paper I'll be willing to check it and
> hope that I will, if needed, be able to ask you for advice to help
> clarify any such gap.
> 

The paper will explain the theory and the details.

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Andrew Haley

On 09/27/2018 01:23 PM, Raffaello Giulietti wrote:

> In principle I agree with you.
> 
> However, in this case the maths underlying the algorithm to select the 
> decimal are too involved to explain in comment form. I'm in the course 
> of preparing a paper that explains the idea and the details. Then it 
> should be easier to make sense out of the code.

In which case, the code must contain pointers to the relevant parts of
the paper.

> Since to my knowledge the algorithm is novel, it will require some time 
> for me to translate in paper form.

Sure, but how do you expect anyone to review your code without the
necessary explanation? Do you think that people should approve your
patch without that paper?

> There are some succinct comments here that explain the expected
> results.  I'm not the kind of programmer that comments every line
> since here the mechanics is simple enough to follow in Java
> directly. A good explanation would either be mathematical, which
> requires better typography than US-ASCII, or some explanatory
> drawings.

Sure, these can be offline somewhere. The difficulty is always the
mapping from the mathematics onto the Java code, and this is what must
be explained in the comments.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Andrew Dinn

On 27/09/18 13:23, Raffaello Giulietti wrote:
  . . .
> The comments of the accessor methods that make use of this private table
> implicitly explain its semantics as well. I will add a comment to the
> field that refers to the comments in the methods.
> 
> 
> How the table was generated and thoroughly verified is currently not
> part of my contribution to OpenJDK, not because it is something secret
> or complex but because I think it is irrelevant here.

I suspect it would be far from irrelevant to someone faced with
debugging and fixing any future problem found in the code. Of course,
you may well still be around to fix it but we don't know that for sure.

> Besides, where would the generator/verifier code be placed in the
> codebase? It would be completely detached from, and unrelated to,
> everything else. But maybe there is already some mechanism in place for
> similar "bootstrapping" code in the OpenJDK. Then I would like to know
> to consider adding the generator there.

We don't absolutely need generator/verifier code (although the latter
might be helpful). The problem we face is what to do if a bug is found.
Could it simply be a table entry that is wrong or is there a detail of
the algorithm that has been mis-encoded? How would we tell?

If we have an explanation of /how/ the provided values were derived and
/why/ so derived then that would allow any such bugs to be resolved much
more easily. Omitting that background risks turning this into a one-shot
code drop rather than a very welcome contribution.

>> There are many more places in the code. What you've done is nice, but
>> it could be exemplary.
>>
> 
> As said, this will be part of a separate paper. Hope this helps for the
> moment.
Sure, a reference to a published doc would be great - assuming it makes
it clear how the quite code is derived from the maths/algorithm it
details. I'd really much prefer to have that doc before accepting the
code just in order to be sure that there is no gap between theory and
execution that some judicious commenting might close. Having recently
reviewed some math code for log, trig and power functions I am well
aware how details of specific coding operations are not always clearly
identifiable from an abstract mathematical treatment. Even when they can
be derived a few comments in the code often help avoid any resort to a
pencil and thick pad of graph paper.

When you do post a link to the paper I'll be willing to check it and
hope that I will, if needed, be able to ask you for advice to help
clarify any such gap.

regards,

Andrew Dinn
---
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Raffaello Giulietti

Hi Andrew,

In principle I agree with you.

However, in this case the maths underlying the algorithm to select the 
decimal are too involved to explain in comment form. I'm in the course 
of preparing a paper that explains the idea and the details. Then it 
should be easier to make sense out of the code.

Since to my knowledge the algorithm is novel, it will require some time 
for me to translate in paper form.

Other observations are interspersed with yours below.

On 2018-09-27 11:03, Andrew Haley wrote:

On 09/26/2018 06:39 PM, raffaello.giulie...@gmail.com wrote:

The submitted code contains both the changes to the current
implementation and extensive jtreg tests.

While I've struggled to keep the code within the 80 chars/line limit,
mercurial still generates longer lines. Thus, to avoid possible problems
with the email systems, the code is submitted both inline and as an
attachment. Hope at least one copy makes its way without errors.

Overall, the commenting is much too light. There are many places
where I think I know what you're doing but you don't explain it.

Here, for example:

+
+// pow5 = pow51*2^63 + pow50
+long pow51 = ceilPow5dHigh(-k);
+long pow50 = ceilPow5dLow(-k);
+
+// p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb
+long x0 = pow50 * cb;
+long x1 = multiplyHigh(pow50, cb);
+long y0 = pow51 * cb;
+long y1 = multiplyHigh(pow51, cb);
+long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63);
+long p0 = x0 & MASK_63;
+long p1 = z & MASK_63;
+long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63);
+long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha;
+if ((p1 & mask) != 0 || p0 >= threshold) {
+vn |= 1;
+}

... etc. I think I can figure out what you're doing, but you could
explain it.

> If you write the comments now while the code is still fresh in your
> mind it'll be easy.
>

There are some succinct comments here that explain the expected results. 
I'm not the kind of programmer that comments every line since here the 
mechanics is simple enough to follow in Java directly. A good 
explanation would either be mathematical, which requires better 
typography than US-ASCII, or some explanatory drawings.

What the semantics of vn, vnl and vnr are will be explained in the 
future paper mentioned above.

+private static final long[] ceilPow5d = {
+/* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL,
+/* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL,
+/* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L,
+/* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L,

What exactly is this table anyway?  How was it generated?  Please say.

The comments of the accessor methods that make use of this private table 
implicitly explain its semantics as well. I will add a comment to the 
field that refers to the comments in the methods.

How the table was generated and thoroughly verified is currently not 
part of my contribution to OpenJDK, not because it is something secret 
or complex but because I think it is irrelevant here.

Besides, where would the generator/verifier code be placed in the 
codebase? It would be completely detached from, and unrelated to, 
everything else. But maybe there is already some mechanism in place for 
similar "bootstrapping" code in the OpenJDK. Then I would like to know 
to consider adding the generator there.

There are many more places in the code. What you've done is nice, but
it could be exemplary.

As said, this will be part of a separate paper. Hope this helps for the 
moment.

Thanks for your interest
Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Andrew Haley

On 09/26/2018 06:39 PM, raffaello.giulie...@gmail.com wrote:

> The submitted code contains both the changes to the current
> implementation and extensive jtreg tests.
>
> While I've struggled to keep the code within the 80 chars/line limit,
> mercurial still generates longer lines. Thus, to avoid possible problems
> with the email systems, the code is submitted both inline and as an
> attachment. Hope at least one copy makes its way without errors.

Overall, the commenting is much too light. There are many places
where I think I know what you're doing but you don't explain it.

Here, for example:

> +
> +// pow5 = pow51*2^63 + pow50
> +long pow51 = ceilPow5dHigh(-k);
> +long pow50 = ceilPow5dLow(-k);
> +
> +// p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb
> +long x0 = pow50 * cb;
> +long x1 = multiplyHigh(pow50, cb);
> +long y0 = pow51 * cb;
> +long y1 = multiplyHigh(pow51, cb);
> +long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63);
> +long p0 = x0 & MASK_63;
> +long p1 = z & MASK_63;
> +long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63);
> +long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha;
> +if ((p1 & mask) != 0 || p0 >= threshold) {
> +vn |= 1;
> +}

... etc. I think I can figure out what you're doing, but you could
explain it.

If you write the comments now while the code is still fresh in your
mind it'll be easy.

> +private static final long[] ceilPow5d = {
> +/* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL,
> +/* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL,
> +/* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L,
> +/* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L,

What exactly is this table anyway?  How was it generated?  Please say.

There are many more places in the code. What you've done is nice, but
it could be exemplary.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-27 Thread Raffaello Giulietti


Hi Brian,

On 2018-09-27 01:53, Brian Burkhalter wrote:

There was a compilation error on Linux in one of the tests:

test/jdk/java/lang/Floating/DoubleToDecString.java:133: error: unmappable 
character (0x93) for encoding US-ASCII
 Paxson V, "A Program for Testing IEEE Decimal\ufffd\ufffd\ufffdBinary 
Conversion"

This was only in one of the comments where it looks like some errant character 
leaked in. I updated the webrev (with accompanying patch) in place [1] after 
verifying that the change to the test fixes the problem.



I can confirm that in my original source there is a 'EN DASH' (U+2013) 
Unicode character, which visually looks similar to 'HYPHEN-MINUS' 
(U+002D). I use UTF-8 on all my source files, so it didn't stand out as 
something strange in the IDE.


This is certainly due to some copy&paste operation from a source found 
on the internet. U+2013 is the correct variant, both semantically and 
typographically, but it surely causes problems with tools that expect 
US-ASCII. Frankly, there should be no such restrictions in tools, as 
Java supports Unicode source code since day 0, but that's another story.


For this project, I will switch to US-ASCII in my IDE.




Also, there were a couple of failures observed in test [2] at lines 172 and 173. If at 
line 172 "foo1.17549435E-38” is changed to "foo1.1754944E-38” then the 
evaluation at that line succeeds but then the one at line 173 fails. Making a similar 
change on this line does not help. I suspect that this is due to a difference between the 
new code and that in jdk.internal.math.FloatingDecimal which is used by 
java.lang.AbstractStringBuilder and java.lang.invoke.StringConcatFactory but I’ve not 
actually investigated this as yet. I had actually wondered whether some or all of the 
FloatingDecimal code could be superseded by the updated logic.



"1.1754944E-38" is shorter than "1.17549435E-38" and can still recover 
Float.MIN_NORMAL. That's why the new code returns the shorter variant.


Why this works when the replacement is made in line 172 but not in line 
173 is really counter-intuitive. The only superficial difference is that 
FLOAT_MIN_NORM_1 is declared final while FLOAT_MIN_NORM_2 is not, 
although both are initialized with the same value.


The black-box conclusion is that it seems that string concatenation 
performed at compile time and at runtime manages to produce different 
results when given the same values. This is of course a problem outside 
the scope of the Double/Float conversions. As you point out, a more 
white-box analysis might reveal the culprit in the deadly embrace 
between string building and FloatingDecimal.


Unfortunately, I won't have time to investigate this interesting issue 
further before the weekend.



Greetings
Raffaello





Brian

[1] http://cr.openjdk.java.net/~bpb/4511638/webrev.00/
[2] java/lang/String/concat/ImplicitStringConcatBoundaries.java

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-26 Thread Brian Burkhalter

There was a compilation error on Linux in one of the tests:

test/jdk/java/lang/Floating/DoubleToDecString.java:133: error: unmappable 
character (0x93) for encoding US-ASCII
Paxson V, "A Program for Testing IEEE Decimal\ufffd\ufffd\ufffdBinary 
Conversion"

This was only in one of the comments where it looks like some errant character 
leaked in. I updated the webrev (with accompanying patch) in place [1] after 
verifying that the change to the test fixes the problem. 

Also, there were a couple of failures observed in test [2] at lines 172 and 
173. If at line 172 "foo1.17549435E-38” is changed to "foo1.1754944E-38” then 
the evaluation at that line succeeds but then the one at line 173 fails. Making 
a similar change on this line does not help. I suspect that this is due to a 
difference between the new code and that in jdk.internal.math.FloatingDecimal 
which is used by java.lang.AbstractStringBuilder and 
java.lang.invoke.StringConcatFactory but I’ve not actually investigated this as 
yet. I had actually wondered whether some or all of the FloatingDecimal code 
could be superseded by the updated logic.

Brian

[1] http://cr.openjdk.java.net/~bpb/4511638/webrev.00/
[2] java/lang/String/concat/ImplicitStringConcatBoundaries.java

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-26 Thread Brian Burkhalter

On Sep 26, 2018, at 11:24 AM, Brian Burkhalter  
wrote:

>> Since there's a change in the specification, according to my sponsor and
>> the formalities it has to undergo a CSR.
> 
> I will update the CSR.

I updated the CSR [1] with the new verbiage for Double.toString() and 
Float.toString() and attached a specdiff versus the current JDK 12 repository. 
Please let me know of any errors.

Thanks,

Brian

[1] https://bugs.openjdk.java.net/browse/JDK-8202555

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

2018-09-26 Thread Brian Burkhalter

Hi Raffaello,

Here is the webrev version of the patch:

http://cr.openjdk.java.net/~bpb/4511638/webrev.00/

On Sep 26, 2018, at 10:39 AM, raffaello.giulie...@gmail.com wrote:

> To address and overcome both issues, a new specification in the form of
> Javadoc associated to Double.toString(double) and Float.toString(float)
> is proposed, along with a clean-room re-implementation.

Thank you for the contribution.

> The intent of the specification is to explicitly separate the
> determination of the unique decimal number to represent the floating
> point argument (double/float) from its formatting as a String.
> 
> The clean-room implementation enjoys the following properties:
> * […]
> * The new Double.toString(double) speedup with respect to the current
> implementation, according to jmh, is better than 13x when measured over
> bitwise uniformly distributed random samples.

I have corroborated this performance improvement on my MacBook Pro.

> * In the case of Double.toString(double), as it is infeasible to test
> all results, several billions of random doubles have been extensively
> tested.

I likewise ran a “round trip” test similar to

Random.nextLong() -> Double.longBitsToDouble -> d0
d0 -> Double.toString -> Double.valueOf -> d1
Compare d0 vs. d1

which ran for 40 hours (366 billion round trips) on my MacBook Pro without 
error.

> Since there's a change in the specification, according to my sponsor and
> the formalities it has to undergo a CSR.

I will update the CSR.

> The submitted code contains both the changes to the current
> implementation and extensive jtreg tests.
> 
> While I've struggled to keep the code within the 80 chars/line limit,
> mercurial still generates longer lines. Thus, to avoid possible problems
> with the email systems, the code is submitted both inline and as an
> attachment. Hope at least one copy makes its way without errors.

Sounds good.

Thanks,

Brian

43 matches

Mail list logo