> Please consider this fix to ensure that going from `MessageFormat` to pattern 
> string via `toPattern()` and then back via `new MessageFormat()` results in a 
> format that is equivalent to the original.
> 
> The quoting and escaping rules for `MessageFormat` pattern strings are really 
> tricky. I admit not completely understanding them. At a high level, they work 
> like this: The normal way one would "nest" strings containing special 
> characters is with straightforward recursive escaping like with the `bash` 
> command line. For example, if you want to echo `a "quoted string" example` 
> then you enter `echo "a "quoted string" example"`. With this scheme it's 
> always the "outer" layer's job to (un)escape special characters as needed. 
> That is, the echo command never sees the backslash characters.
> 
> In contrast, with `MessageFormat` and friends, nested subformat pattern 
> strings are always provided "pre-escaped". So to build an "outer" string 
> (e.g., for `ChoiceFormat`) the "inner" subformat pattern strings are more or 
> less just concatenated, and then only the `ChoiceFormat` option separator 
> characters (e.g., `<`, `#`, `|`, etc.) are escaped.
> 
> The "pre-escape" escaping algorithm escapes `{` characters, because `{` 
> indicates the beginning of a format argument. However, it doesn't escape `}` 
> characters. This is OK because the format string parser treats any "extra" 
> closing braces (where "extra" means not matching an opening brace) as plain 
> characters.
> 
> So far, so good... at least, until a format string containing an extra 
> closing brace is nested inside a larger format string, where the extra 
> closing brace, which was previously "extra", can now suddenly match an 
> opening brace in the outer pattern containing it, thus truncating it by 
> "stealing" the match from some subsequent closing brace.
> 
> An example is the `MessageFormat` string `"{0,choice,0.0#option A: 
> {1}|1.0#option B: {1}'}'}"`. Note the second option format string has a 
> trailing closing brace in plain text. If you create a `MessageFormat` with 
> this string, you see a trailing `}` only with the second option.
> 
> However, if you then invoke `toPattern()`, the result is 
> `"{0,choice,0.0#option A: {1}|1.0#option B: {1}}}"`. Oops, now because the 
> "extra" closing brace is no longer quoted, it matches the opening brace at 
> the beginning of the string, and the following closing  brace, which was the 
> previous match, is now just plain text in the outer `MessageFormat` string.
> 
> As a result, invoking `f.format(new Object{} { 0, 5 })` will retur...

Archie Cobbs has updated the pull request incrementally with one additional 
commit since the last revision:

  Fix misspelling in comment.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17416/files
  - new: https://git.openjdk.org/jdk/pull/17416/files/a4d82c67..76e3d6ef

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17416&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17416&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/17416.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17416/head:pull/17416

PR: https://git.openjdk.org/jdk/pull/17416

Reply via email to