Re: RFR: JDK-8263261 Extend String::translateEscapes to support unicode escapes [v12]

2024-01-26 Thread Jim Laskey
On Fri, 26 Jan 2024 16:54:14 GMT, Roger Riggs  wrote:

>> Jim Laskey has updated the pull request with a new target base due to a 
>> merge or a rebase. The incremental webrev excludes the unrelated changes 
>> brought in by the merge/rebase. The pull request contains 12 additional 
>> commits since the last revision:
>> 
>>  - Merge remote-tracking branch 'upstream/master' into 8263261
>>  - Update unicode to Unicode
>>  - Requested changes
>>  - Update String.java
>>  - Requested changes
>>  - Update Copyright
>>  - Update copyright year of test
>>  - Add JLS Unicode Escapes reference
>>  - Update comment
>>  - Update copyright year
>>  - ... and 2 more: https://git.openjdk.org/jdk/compare/af9bfd62...040bda82
>
> src/java.base/share/classes/java/lang/String.java line 4229:
> 
>> 4227:  * {@code \u005Cu...u}
>> 4228:  * Unicode escape
>> 4229:  * single UTF-16 code unit equivalent
> 
> The `...` makes it less clear what is being shown.  It might be clearer to 
> include the  in the resulting value and drop the multiple `u` case.

Changed

> src/java.base/share/classes/java/lang/String.java line 4245:
> 
>> 4243:  * escape sequences and Unicode escapes are translated as 
>> encountered in one pass and
>> 4244:  * not done as an Unicode escapes pass followed 
>> by an escape sequences
>> 4245:  * pass.
> 
> I would move the description of the compiler behavior to the end and remove 
> "also". For example, 
> Suggestion:
> 
>  * @implNote As a convenience for use with constructed
>  * strings, this method translates Unicode escapes. For example, this
>  * method could be used when ASCII encoded text files need to maintain 
> Unicode
>  * content. The translation is done in a single pass and is 
> non-recursive. That is,
>  * escape sequences and Unicode escapes are translated as encountered in 
> one pass and
>  * not done as an Unicode escapes pass followed by an 
> escape sequences
>  * pass. By comparison, the compiler translates all Unicode escapes 
> before string
>  * literals are translated.

Changed

> test/jdk/java/lang/String/TranslateEscapes.java line 97:
> 
>> 95: verifyUnicodeEscape("\\u2022", "\u2022");
>> 96: verifyUnicodeEscape("\\ud83c\\udf09", "\ud83c\udf09");
>> 97: verifyUnicodeEscape("\\u2022", "\u2022");
> 
> Include the code from the example as a test case too.

None present. Was a mis-paste.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467926349
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467926483
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467929023


Re: RFR: JDK-8263261 Extend String::translateEscapes to support unicode escapes [v12]

2024-01-26 Thread Roger Riggs
On Fri, 26 Jan 2024 15:06:50 GMT, Jim Laskey  wrote:

>> Currently String::translateEscapes does not support unicode escapes, 
>> reported as a IllegalArgumentException("Invalid escape sequence: ..."). 
>> String::translateEscapes should translate unicode escape sequences to 
>> provide full coverage,
>
> Jim Laskey has updated the pull request with a new target base due to a merge 
> or a rebase. The incremental webrev excludes the unrelated changes brought in 
> by the merge/rebase. The pull request contains 12 additional commits since 
> the last revision:
> 
>  - Merge remote-tracking branch 'upstream/master' into 8263261
>  - Update unicode to Unicode
>  - Requested changes
>  - Update String.java
>  - Requested changes
>  - Update Copyright
>  - Update copyright year of test
>  - Add JLS Unicode Escapes reference
>  - Update comment
>  - Update copyright year
>  - ... and 2 more: https://git.openjdk.org/jdk/compare/b94b04ff...040bda82

src/java.base/share/classes/java/lang/String.java line 4229:

> 4227:  * {@code \u005Cu...u}
> 4228:  * Unicode escape
> 4229:  * single UTF-16 code unit equivalent

The `...` makes it less clear what is being shown.  It might be clearer to 
include the  in the resulting value and drop the multiple `u` case.

src/java.base/share/classes/java/lang/String.java line 4245:

> 4243:  * escape sequences and Unicode escapes are translated as 
> encountered in one pass and
> 4244:  * not done as an Unicode escapes pass followed by 
> an escape sequences
> 4245:  * pass.

I would move the description of the compiler behavior to the end and remove 
"also". For example, 
Suggestion:

 * @implNote As a convenience for use with constructed
 * strings, this method translates Unicode escapes. For example, this
 * method could be used when ASCII encoded text files need to maintain 
Unicode
 * content. The translation is done in a single pass and is non-recursive. 
That is,
 * escape sequences and Unicode escapes are translated as encountered in 
one pass and
 * not done as an Unicode escapes pass followed by an 
escape sequences
 * pass. By comparison, the compiler translates all Unicode escapes before 
string
 * literals are translated.

test/jdk/java/lang/String/TranslateEscapes.java line 97:

> 95: verifyUnicodeEscape("\\u2022", "\u2022");
> 96: verifyUnicodeEscape("\\ud83c\\udf09", "\ud83c\udf09");
> 97: verifyUnicodeEscape("\\u2022", "\u2022");

Include the code from the example as a test case too.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467892757
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467895901
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467900516


Re: RFR: JDK-8263261 Extend String::translateEscapes to support unicode escapes [v12]

2024-01-26 Thread Jim Laskey
> Currently String::translateEscapes does not support unicode escapes, reported 
> as a IllegalArgumentException("Invalid escape sequence: ..."). 
> String::translateEscapes should translate unicode escape sequences to provide 
> full coverage,

Jim Laskey has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains 12 additional commits since the 
last revision:

 - Merge remote-tracking branch 'upstream/master' into 8263261
 - Update unicode to Unicode
 - Requested changes
 - Update String.java
 - Requested changes
 - Update Copyright
 - Update copyright year of test
 - Add JLS Unicode Escapes reference
 - Update comment
 - Update copyright year
 - ... and 2 more: https://git.openjdk.org/jdk/compare/eea3c2d9...040bda82

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/17491/files
  - new: https://git.openjdk.org/jdk/pull/17491/files/b57a551d..040bda82

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=17491=11
 - incr: https://webrevs.openjdk.org/?repo=jdk=17491=10-11

  Stats: 8261 lines in 340 files changed: 5165 ins; 1908 del; 1188 mod
  Patch: https://git.openjdk.org/jdk/pull/17491.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17491/head:pull/17491

PR: https://git.openjdk.org/jdk/pull/17491