I still think it would be valuable to land a patch for replaceAll to avoid temporary StringBuilders, is there anyone who wants to help me land it?
Isaac On Sun, Apr 29, 2018 at 10:24 PM, Isaac Levy <isaac.r.l...@gmail.com> wrote: > Your patch looks good to me, and will get the majority of performance > benefits without any API issues. > > Isaac > > > On Tue, Apr 24, 2018 at 9:38 PM, Xueming Shen <xueming.s...@oracle.com> wrote: >> for String based replaceAll/First() it might be worth doing something like >> >> http://cr.openjdk.java.net/~sherman/regex_removesb/webrev/ >> >> >> On 4/24/18, 10:47 AM, Isaac Levy wrote: >>> >>> Hi Sherman, >>> >>> Thanks for clarifying. Looks like exceptions are caused by invalid >>> format string. I wouldn't expect most programs to be catching this and >>> preserving their buffer, but dunno. >>> >>> How much does it affect perf? Well it depends on use case, a jmh of >>> replaceAll with a length 200 string of digits and regex "(\w)" shows >>> about 30% speedup. >>> >>> [info] Benchmark Mode Cnt Score Error Units >>> [info] origM avgt 10 11.669 ± 0.211 us/op >>> [info] newM avgt 10 8.926 ± 0.095 us/op >>> >>> Isaac >>> >>> >>> On Tue, Apr 24, 2018 at 12:53 PM, Xueming Shen<xueming.s...@oracle.com> >>> wrote: >>>> >>>> Hi Isaac, >>>> >>>> I actually meant to say "we are not supposed to output the partial text >>>> into >>>> the output buffer in case of an exception". It has nothing to do with the >>>> changeset you cited. This has been the behavior since day one/JDK1.4, >>>> though it is not specified explicitly in the API doc. The newly added >>>> StringBuilder variant simply follows this behavior. If it's really >>>> desired >>>> it >>>> is kinda doable to save that StringBuilder without the "incompatible" >>>> behavior >>>> change but just wonder if it is really worth the effort. >>>> >>>> Thanks, >>>> Sherman >>>> >>>> >>>> On 4/24/18, 9:11 AM, Isaac Levy wrote: >>>>> >>>>> (moving this to a separate discussion) >>>>> >>>>> >>>>> --- a/src/java.base/share/classes/java/util/regex/Matcher.java >>>>> +++ b/src/java.base/share/classes/java/util/regex/Matcher.java >>>>> @@ -993,13 +993,11 @@ >>>>> public Matcher appendReplacement(StringBuilder sb, String >>>>> replacement) { >>>>> // If no match, return error >>>>> if (first< 0) >>>>> throw new IllegalStateException("No match available"); >>>>> - StringBuilder result = new StringBuilder(); >>>>> - appendExpandedReplacement(replacement, result); >>>>> // Append the intervening text >>>>> sb.append(text, lastAppendPosition, first); >>>>> // Append the match substitution >>>>> + appendExpandedReplacement(replacement, sb); >>>>> - sb.append(result); >>>>> >>>>> >>>>> >>>>> On Mon, Apr 23, 2018 at 5:05 PM Xueming Shen<xueming.s...@oracle.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> I would assume in case of an exception thrown from >>>>>> appendExpandedReplacement() we don't >>>>>> want "text" to be pushed into the "sb". >>>>>> >>>>>> -sherman >>>>> >>>>> >>>>> Perhaps. Though the behavior under exception is undefined and this >>>>> function is probably primarily used though the replaceAll API, which >>>>> wouldn’t return the intermediate sb under the case of exception. >>>>> >>>>> My reading of the blame was the temp StringBuilder was an artifact of >>>>> the api previously using StringBuffer externally. See >>>>> http://hg.openjdk.java.net/jdk9/dev/jdk/rev/763c564451b3 >>>> >>>> >>