On Mon, 4 Aug 2025 23:57:28 GMT, Naoto Sato <[email protected]> wrote:
>> Enabling lenient minus sign matching when parsing numbers. In some locales,
>> e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212),
>> which is not the "Hyphen Minus" (U+002D) that users type in from keyboard.
>> Thus the parsing of user input numbers may fail. This change utilizes CLDR's
>> `parseLenient` element for minus signs and loosely matches them with the
>> hyphen-minus so that user input numbers can parse. As this is a behavioral
>> change, a corresponding CSR has been drafted.
>
> Naoto Sato has updated the pull request incrementally with one additional
> commit since the last revision:
>
> refrects review comments
Latest changes look good to me. I think ignoring supplementary/normalization is
fine and it would have been excessive otherwise. I left some more minor
comments.
src/java.base/share/classes/java/text/CompactNumberFormat.java line 219:
> 217: * </pre></blockquote>
> 218: *
> 219: * @implNote The implementation follows the LDML specification to enable
> loose
Suggestion:
* @implNote The JDK Reference Implementation follows the LDML specification to
enable loose
src/java.base/share/classes/java/text/DecimalFormat.java line 421:
> 419: * returns a numerically greater value.
> 420: *
> 421: * @implNote The implementation follows the LDML specification to enable
> loose
Suggestion:
* @implNote The JDK Reference Implementation follows the LDML specification to
enable loose
src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 719:
> 717:
> 718: /**
> 719: * {@return the lenient minus signs} Multiple lenient minus signs
Do we have an idea of when a given locale would not have access to the lenient
minus signs provided by parseLenient element? If the vast majority of times it
will, then it is fine. Otherwise IMO, `getLenientMinusSigns()` should probably
call out the fact that it may not always be a concatenation of multiple minus
_signs_, and it may be a single minus _sign_ (as assigned by `minusSignText`).
Since that detail is only apparent by digging around the code that assigns
`lenientMinusSigns`.
src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 720:
> 718: /**
> 719: * {@return the lenient minus signs} Multiple lenient minus signs
> 720: * are concatenated to form the returned string.
The surrounding package private methods use since tags.
Suggestion:
* are concatenated to form the returned string.
* @since 26
src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 852:
> 850:
> 851: // Lenient minus signs
> 852: lenientMinusSigns = numberElements.length < 14 ? minusSignText :
> numberElements[13];
BTW, if I remove this check and always assign to `numberElements[13]`, I do not
observe any failures in the java_text/Format suite. It would be nice to have an
idea of why this check is needed. (I understand it is following the same length
checks of monetarySeparator and monetaryGroupingSeparator.)
-------------
PR Review: https://git.openjdk.org/jdk/pull/26580#pullrequestreview-3086137609
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2252827372
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2254807654
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2254847469
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2254792967
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2254846860