The .append(srepl) -> append(srep.value)

And I would simply remove the "fastpath" for the target.length() = 0 special 
case ...
And maybe pick an appropriate initial size (maybe this.length + 16 * diff) for 
the sb
to further reduce its internal expanding ...

Personally this code is straightforward and I would be happy with the 800+ vs 
1000+
score diff.

    public String replace(CharSequence target, CharSequence replacement) {
        String starget = target.toString();
        int targLen = starget.length();
        String srepl = replacement.toString();

        int j = indexOf(starget);
        // special case: nothing to replace
        if (j<  0) {
            return this;
        }

        final char[] value = this.value;
        StringBuilder sb = new StringBuilder();
        int i = 0;
        do {
            sb.append(value, i, j - i)
                .append(srepl.value);
            i = j + targLen;
        } while ((j = indexOf(starget, i))>  0);

        return sb.append(this, i, length()).toString();
    }

-Sherman


On 05/27/2015 10:06 AM, Ivan Gerasimov wrote:

On 27.05.2015 18:44, Xueming Shen wrote:
You might want to use directly sb.append.(char[], off, len), instead of 
append(str, off, len)/append(str).

Ah, yes sure!  It would improve things.
I updated the webrev in place:
http://cr.openjdk.java.net/~igerasim/8058779/03/webrev/

However the general picture is almost the same:
performance:
Baseline:      MyBenchmark.test  thrpt   40  257'051.948 ± 4537.484  ops/s
StringBuilder: MyBenchmark.test  thrpt   40  808'033.808 ± 22152.166  ops/s
Arrays:        MyBenchmark.test  thrpt   40  1'049'235.602 ± 15501.803  ops/s

Memory efficiency is still less then for array-version:
$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build 
1.9.0-internal-igerasim_2015_05_27_14_47-b00)
Java HotSpot(TM) 64-Bit Server VM (build 
1.9.0-internal-igerasim_2015_05_27_14_47-b00, mixed mode)

25) 100663296

real    0m2.429s
user    0m2.112s
sys    0m0.792s

Sincerely yours,
Ivan

Btw, is the target.lenght==0 the popular use case/scenario? Just wonder why do 
we need a special case
here for it.

thanks,
-Sherman

On 5/27/15 8:29 AM, Ivan Gerasimov wrote:
For completeness, here's another webrev, which uses StringBuilder:
http://cr.openjdk.java.net/~igerasim/8058779/03/webrev/

Its performance is somewhere in between the current implementation and the 
array-based implementation:
MyBenchmark.test  thrpt   40  796'059.192 ± 12455.970  ops/s

Memory efficiency is less then the one of array-based implementation:
$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build 
1.9.0-internal-igerasim_2015_05_27_14_47-b00)
Java HotSpot(TM) 64-Bit Server VM (build 
1.9.0-internal-igerasim_2015_05_27_14_47-b00, mixed mode)

25) 100663296

real    0m2.585s
user    0m1.875s
sys    0m0.887s

Sincerely yours,
Ivan

On 27.05.2015 2:38, Xueming Shen wrote:
Ivan,

It might be worth trying String.index + StringBuilder, instead of 
writing/handling everything yourself.
Yes, it inevitably adds an arraycopy at the end to convert the StrinbBuilder to 
String, but it might have
better balance between performance and code complexity. The regex is probably a 
little heavy for
literal string replacement, but StringBuilder should not be that bad ...

-Sherman

On 5/26/15 4:11 PM, Ivan Gerasimov wrote:
I updated the webrev:
http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/

In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE with 
Integer.MAX_VALUE, which seems to be more accurate here.

And  I want to add that this proposed implementation is not only faster, but 
also more memory efficient.
The following simple stress-test shows that the proposed version is able to 
handle twice larger strings, comparing to the current implementation.

----------------------------------------
public class C {
    public static void main(String[] args) throws Throwable {
        String s = "string";
        for (int i = 1; i < Integer.MAX_VALUE; ++i) {
            try {
                s = s.replace("string", "stringstring");
            } catch (OutOfMemoryError o) {
                System.out.println(i + ") " + s.length());
                break;
            }
        }
    }
}
----------------------------------------

$ time ~/java9/jdk/bin/java -showversion -Xmx1g C
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)

25) 100663296

real    0m4.525s
user    0m4.402s
sys    0m1.189s

$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build 
1.9.0-internal-igerasim_2015_05_23_19_25-b00)
Java HotSpot(TM) 64-Bit Server VM (build 
1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)

26) 201326592

real    0m2.139s
user    0m1.960s
sys    0m0.461s

Sincerely yours,
Ivan

On 24.05.2015 23:17, Ivan Gerasimov wrote:
Hello everybody!

I know many people here like it when the performance is getting better.
It was suggested to make the literal variant of String.replace() faster.
Currently, this method is implemented as a few calls to regexp API, so that the 
whole implementation takes only two lines of code.

I've created two versions of the fix.
In the first one, we scan the string and store indices of the found substrings 
in an array.
Then, we allocate the precisely sized char array and fill it it.
The case with the empty target has to be handled separately.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/

The second variant is much less verbose, however it's less efficient too.
Here the StringJoiner is used as an intermediate storage.

WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/


Here are the micro-benchmark results (in a string of ~300 chars do ~15 
replacements).
0) Baseline
MyBenchmark.test  thrpt   40  257'051.948 ± 4537.484 ops/s

1) Heavy-duty +308%
MyBenchmark.test  thrpt   40  1'049'235.602 ± 15501.803 ops/s

2) StringJoiner +190%
MyBenchmark.test  thrpt   40  746'000.629 ± 15387.036 ops/s


Personally, I like my first variant better, even though it adds almost 300 
lines of code.
But I'd like to hear what people think of it.

Sincerely yours,
Ivan












Reply via email to