On Mon, 1 Jun 2026 21:34:11 GMT, Naoto Sato <[email protected]> wrote:

> Updated `ListFormat` custom pattern parsing to avoid regex matching. The 
> implementation now locates placeholders and delimiters with literal string 
> operations.
> 
> Added coverage for invalid long custom patterns and for regex metacharacters 
> in custom patterns, which should be treated as literals.
> 
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK 
> Interim AI Policy](https://openjdk.org/legal/ai).

src/java.base/share/classes/java/text/ListFormat.java line 114:

> 112:     private static final int THREE = 4;
> 113:     private static final int PATTERN_ARRAY_LENGTH = THREE + 1;
> 114:     private static final int PLACEHOLDER_LENGTH = 3;

Nit: Would not be opposed to `PLACEHOLDER_TOKEN_LENGTH` here. To me, it makes 
it clearer that this is the length of the placeholder token text, not an 
arbitrary placeholder value.

src/java.base/share/classes/java/text/ListFormat.java line 475:

> 473:             }
> 474: 
> 475:             if (startPattern[1] <= endPattern[0]) {

For readability, keeping the `startEnd` and `endStart` vars might be preferred 
to calling `startPattern[1]` and  `endPattern[0]` directly.

src/java.base/share/classes/java/text/ListFormat.java line 581:

> 579:         var sb = new StringBuilder(256).append(patterns[START]);
> 580:         IntStream.range(2, count - 1).forEach(i -> 
> sb.append(middleBetween).append("{").append(i).append("}"));
> 581:         sb.append(patterns[END].replaceFirst("\\{0}", 
> "").replaceFirst("\\{1}", "\\{" + (count - 1) + "\\}"));

Since we now compute and store `endBetween` and `endAfter` in `init()`, can we 
reuse them here instead of using regex to strip out the placeholders? i.e.,


sb.append(endBetween)
    .append("{").append(count - 1).append("}")
    .append(endAfter);

src/java.base/share/classes/java/text/ListFormat.java line 668:

> 666:      * @param pattern pattern string to parse
> 667:      */
> 668:     private static int[] findPlaceholders(String pattern) {

Extra placeholder token behavior looks a little spotty (and was before as well, 
I believe). For example, given a `START` pattern, a duplicate "{1}" placeholder 
gets rejected because of the `placeholderPositions[1] + PLACEHOLDER_LENGTH == 
pattern.length()` check in `init()`. However, a duplicate "{0}" goes right 
through, for example "{0}, {0} {1}". I don't see any test cases or 
specification that hints at what to expect for that. Is this worth 
standardizing or documenting in a separate issue?

test/jdk/java/text/Format/ListFormat/TestListFormat.java line 273:

> 271:             .getMessage();
> 272: 
> 273:         var expected = "%s is incorrect:".formatted(

I think it would be cleaner to parameterize the error messages with the 
corresponding `index` as an `Argument`, rather than calculating it in the test 
case.

test/jdk/java/text/Format/ListFormat/TestListFormat.java line 282:

> 280:                 default -> throw new RuntimeException("invalid index");
> 281:             });
> 282:         assertEquals(expected, msg.substring(0, Math.min(msg.length(), 
> expected.length())));

The `Math.min` tripped me up here. It looks to me that `expected` is always 
going to be shorter than `msg`, since `msg` includes the faulty pattern and 
`expected` is just the base error message. Should we just check that `msg` 
starts with `expected` using `startsWith` instead?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3339087762
PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3339227040
PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3339061666
PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3339186316
PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3338897312
PR Review Comment: https://git.openjdk.org/jdk/pull/31343#discussion_r3338929985

Reply via email to