On Sat, 29 Jun 2024 11:53:31 GMT, Shaojin Wen <d...@openjdk.org> wrote:

> We need a String format solution with good performance. String Template was 
> once expected, but it has been removed. j.u.Formatter is powerful, but its 
> performance is not good enough.
> 
> This PR implements a subset of j.u.Formatter capabilities. The performance is 
> good enough that it is a fastpath for commonly used functions. When the 
> supported functions are exceeded, it will fall back to using j.u.Formatter.
> 
> The performance of this implementation is good enough, the fastpath has low 
> detection cost, There is no noticeable performance degradation when falling 
> back to j.u.Formatter via fastpath.
> 
> Below is a comparison of String.format and concat-based and StringBuilder:
> 
> * benchmark java code
> 
> public class StringFormat {
>     @Benchmark
>     public String stringIntFormat() {
>         return "%s %d".formatted(s, i);
>     }
> 
>     @Benchmark
>     public String stringIntConcat() {
>         return s + " " + i;
>     }
> 
>     @Benchmark
>     public String stringIntStringBuilder() {
>         return new StringBuilder(s).append(" ").append(i).toString();
>     }
> }
> 
> 
> * benchmark number on macbook m1 pro
> 
> Benchmark                            Mode  Cnt   Score   Error  Units
> StringFormat.stringIntConcat         avgt   15   6.541 ? 0.056  ns/op
> StringFormat.stringIntFormat         avgt   15  17.399 ? 0.133  ns/op
> StringFormat.stringIntStringBuilder  avgt   15   8.004 ? 0.063  ns/op
> 
> 
> From the above data, we can see that the implementation of fastpath reduces 
> the performance difference between String.format and StringBuilder from 10 
> times to 2~3 times.
> 
> The implementation of fastpath supports the following four specifiers, which 
> can appear at most twice and support a width of 1 to 9.
> 
> d
> x
> X
> s
> 
> If necessary, we can add a few more.
> 
> 
> Below is a comparison of performance numbers running on a MacBook M1, showing 
> a significant performance improvement.
> 
> -Benchmark                          Mode  Cnt    Score    Error  Units 
> (baseline)
> -StringFormat.complexFormat         avgt   15  895.954 ? 52.541  ns/op
> -StringFormat.decimalFormat         avgt   15  277.420 ? 18.254  ns/op
> -StringFormat.stringFormat          avgt   15   66.787 ?  2.715  ns/op
> -StringFormat.stringIntFormat       avgt   15   81.046 ?  1.879  ns/op
> -StringFormat.widthStringFormat     avgt   15   38.897 ?  0.114  ns/op
> -StringFormat.widthStringIntFormat  avgt   15  109.841 ?  1.028  ns/op
> 
> +Benchmark                          Mode  Cnt    Score    Error  Units 
> (current f925339e93fdf7a281462554ce5d73139bd0f0cd)
> +StringFormat.complexF...

src/java.base/share/classes/java/lang/StringFormat.java line 65:

> 63:         int max = format.length();
> 64:         int off = format.indexOf('%');
> 65:         if (off == -1 || off + 1 == max) {

If there is no `%` in the format string, should this just return the format 
string as result?

Suggestion:

        int off = format.indexOf('%');
        if (off == -1) {
            // no formatting to be done
            return format;
        }

        int max = format.length();
        if (off + 1 == max) {

src/java.base/share/classes/java/lang/StringFormat.java line 82:

> 80:                 conv = format.charAt(off + 2);
> 81:             }
> 82:         }

is it worth handling `width > 9`?

Suggestion:

        for (int i = 2; conv >= '1' && conv <= '9' && off + i < max; i++) {
            width = (width * 10) + (conv - '0');
            if (off + i < max) {
                conv = format.charAt(off + i);
            }
        }

src/java.base/share/classes/java/lang/StringFormat.java line 90:

> 88:         if (conv == STRING) {
> 89:             arg = String.valueOf(arg);
> 90:         }

should these conditions be combined?

Suggestion:

        if (conv == STRING) {
            if (isBigInt(arg)) {
                conv = DECIMAL_INTEGER;
            } else {
                arg = String.valueOf(arg);
            }
        }

src/java.base/share/classes/java/lang/StringFormat.java line 102:

> 100:         int max = format.length();
> 101:         int off0 = format.indexOf('%');
> 102:         if (off0 == -1 || off0 + 1 == max) {

similar to above, if `%` is not found, return `format`

Suggestion:

        int off0 = format.indexOf('%');
        if (off0 == -1) {
            // no formatting to be done
            return format;
        }

        int max = format.length();
        if (off0 + 1 == max) {

src/java.base/share/classes/java/lang/StringFormat.java line 110:

> 108:             width0 = conv0 - '0';
> 109:             if (off0 + 2 < max) {
> 110:                 conv0 = format.charAt(off0 + 2);

is it worth handling `width0 > 9`?

Suggestion:

        for (int i = 2; conv0 >= '1' && conv0 <= '9' && off0 + i < max; i++) {
            width0 = (width0 * 10) + (conv0 - '0');
            if (off0 + i < max) {
                conv0 = format.charAt(off0 + i);

src/java.base/share/classes/java/lang/StringFormat.java line 123:

> 121:             width1 = conv1 - '0';
> 122:             if (off1 + 2 < max) {
> 123:                 conv1 = format.charAt(off1 + 2);

is it worth handling `width1 > 9`?

Suggestion:

        for (int i = 2; conv1 >= '1' && conv1 <= '9' && off1 + i < max; i++) {
            width1 = (width1 * 10) + (conv1 - '0');
            if (off1 + i < max) {
                conv1 = format.charAt(off1 + i);

src/java.base/share/classes/java/lang/StringFormat.java line 151:

> 149:             coder |= str.coder();
> 150:             arg1 = String.valueOf(str);
> 151:         }

similar to above, should these conditions be consolidated?

Suggestion:

        String str;
        byte coder = format.coder();
        if (conv0 == STRING) {
            if (isBigInt(arg0)) {
                conv0 = DECIMAL_INTEGER;
            } else {
                str = String.valueOf(arg0);
                coder |= str.coder();
                arg0 = str;
            }
        }

        if (conv1 == STRING) {
            if (isBigInt(arg1)) {
                conv1 = DECIMAL_INTEGER;
            } else {
                str = String.valueOf(arg1);
                coder |= str.coder();
                arg1 = String.valueOf(str);
            }
        }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659880719
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888420
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659881435
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659882380
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888524
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888863
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659882487

Reply via email to