Hi, Sharing in case you haven't seen it... Aleksey Shipilëv has a talk about String optimizations, which discusses these questions and specifically the Boyer-Moore algorithm.
http://shipilev.net/talks/joker-Oct2014-string-catechism.pdf Page 85 talks about the intrinsification of indexOf and compares performance of the builtin vs intrinsified versions by using: -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=::_indexOf There might also be a video recording somewhere. Jonathan On Mon, Jan 5, 2015 at 2:31 PM, Martin Buchholz <marti...@google.com> wrote: > Evidence that hotspot tries to intrinsify String.indexOf: > > do_intrinsic(_indexOf, > java_lang_String, indexOf_name, string_int_signature, > F_R) \ > do_name( indexOf_name, "indexOf") > \ > > So work would have to be done at the hotspot intrinsics level (not easy!) > > Also, the problem is that Boyer-Moore is a fundamental improvement in > string searching, but its overhead is high enough that it's unlikely to > help with typical input strings found in the Real World. I think we would > want to split into two implementations and only do Boyer-Moore if it looks > profitable. Similarly for j.u.r.Pattern's regex compiler. > > I still think it's sufficiently difficult that effort is best applied > elsewhere. > > > On Mon, Jan 5, 2015 at 10:59 AM, Zoltan Sziladi <kisszi...@gmail.com> > wrote: > > > Hi, > > > > This discussion was a long time ago, I was just reading through it to > > check again what was the last state of the discussion about the > > String.indexOf. > > There is one part which I still do not understand, hopefully someone > could > > shed some light on it. A few emails ago Martin mentioned > > > > "Hotspot seems to have some intrinsification of String.indexOf, which > > confuses me. > > Hotspot seems the right place to provide more optimizations for this, > > since there has been a fair amount of work creating high-performance > > low-level implementations of this idea in C." > > > > Then Ivan asked what that actually meant, whether hotspot actually > > replaced the jdk implementation with a low level optimized C > > implementation, but I never saw an answer to that. > > > > Can someone please explain this? If we somehow found an algorithm that > > beat the naive implementation in the average case, would it be possible > to > > just implement it in the JDK? Also, is there a test set which we could > > consider conclusive enough to actually change the implementation based on > > results from that? (For example if I create an implementation that beats > > the naive algorithm in those testcases, then we could consider it faster > in > > average case) > > > > Thanks! > > > > Zoltan > > > > On Fri, Apr 18, 2014 at 9:43 AM, Martin Buchholz <marti...@google.com> > > wrote: > > > >> I remain skeptical that modifying the implementation of static package > >> private String.indexOf is the right approach. > >> > >> If we can produce high-performance implementations of these, perhaps > >> making > >> them public in Arrays.java is the right way. > >> > >> 1766 if (targetCount == 1) {1767 return (i > >> <= max) ? i - sourceOffset : -1; > >> > >> > >> If you're going to special case targetCount == 1, you shouldn't have a > >> test > >> for it in the main loop, since you slow down the general case. Instead, > >> you can sequester the special cases like this: > >> > >> if (targetCount <= 1) { > >> if (targetCount == 0) ... > >> else ... > >> } > >> > >> // now assume targetCount >= 2 > >> > >> > >> On Fri, Apr 18, 2014 at 1:32 AM, Ivan Gerasimov > >> <ivan.gerasi...@oracle.com>wrote: > >> > >> > > >> > On 16.04.2014 2:53, Martin Buchholz wrote: > >> > > >> > Hi Ivan, > >> > > >> > There's already an indexOf(int, int) that allows a user to explicitly > >> > search for a char (or character). > >> > > >> > Sure. > >> > What I meant was not to introduce a special function for searching a > >> > single character, but to restrict the general case loop to search for > a > >> > substring of at least 2 characters. > >> > Having this condition hold, we can enter the loop only when two > starting > >> > characters match, and this can save us a few nanoseconds in many > cases. > >> > > >> > > >> > Hotspot seems to have some intrinsification of String.indexOf, which > >> > confuses me. > >> > > >> > > >> > Does it mean that that the java implementation of indexOf is never > >> > compiled? > >> > When hotspot replaces the jdk implementation with its own one? > >> > Is it ever worth to try to optimize the java implementation? > >> > > >> > > >> > Hotspot seems the right place to provide more optimizations for this, > >> > since there has been a fair amount of work creating high-performance > >> > low-level implementations of this idea in C. > >> > > >> > The hotspot's intrinsic is already optimized for searching substrings > >> of > >> > different length. > >> > > >> > Sincerely yours, > >> > Ivan > >> > > >> > > >> > > >> > On Tue, Apr 15, 2014 at 10:13 AM, Ivan Gerasimov < > >> > ivan.gerasi...@oracle.com> wrote: > >> > > >> >> Hi everyone! > >> >> > >> >> > >> >> On 04.04.2014 21:13, Martin Buchholz wrote: > >> >> > >> >> Summary: > >> >> > >> >> Many people (myself included) have looked at this problem. It's > >> unlikely > >> >> that String.indexOf will change. It's hard to beat the naive > >> >> implementation in the typical case. > >> >> > >> >> But we can try to speed up this naive implementation a little bit. > >> >> > >> >> We can separate the special case: When the substring's length == 1. > >> >> This special case can be done fast, and in the general case we can > now > >> >> assume substring's length is at least 2. > >> >> > >> >> Here's the webrev with the implementation of this idea: > >> >> http://cr.openjdk.java.net/~igerasim/indexof/0/webrev/ > >> >> > >> >> I've done some benchmarking with JMH and found no performance > >> degradation > >> >> on my machine. > >> >> Of course, more testcases should be created and they should be tried > on > >> >> different machines to be treated as reliable. > >> >> > >> >> Benchmark Mode Thr Cnt Sec > Mean > >> >> Mean error Units > >> >> o.b.IndexOfBench.benchIndexOf_1_A avgt 1 20 5 > >> >> 704.739 9.104 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_1_B avgt 1 20 5 * > >> >> 573.879* 9.820 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_2_A avgt 1 20 5 > >> >> 668.455 9.882 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_2_B avgt 1 20 5 > >> >> *476.062* 6.063 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_3_A avgt 1 20 5 > >> >> 155.227 1.796 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_3_B avgt 1 20 5 * > >> >> 152.850 * 1.512 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_4_A avgt 1 20 5 > >> >> 656.183 5.904 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_4_B avgt 1 20 5 > >> >> *515.178* 9.107 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_5_A avgt 1 20 5 > >> >> 140.609 7.305 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_5_B avgt 1 20 5 > >> >> *129.603* 1.654 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_6_A avgt 1 20 5 > >> >> 127.713 1.497 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_6_B avgt 1 20 5 > >> >> *122.177* 1.186 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_7_A avgt 1 20 5 > >> >> 430.148 4.875 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_7_B avgt 1 20 5 * > >> >> 387.338* 10.904 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_8_A avgt 1 20 5 > >> >> 2064.563 28.885 nsec/op > >> >> o.b.IndexOfBench.benchIndexOf_8_B avgt 1 20 5 > >> >> *1858.669* 24.343 nsec/op > >> > >> >> > >> >> Benchmarks ending with A use the current indexOf implementation, > with B > >> >> use the modified version. > >> >> These numbers show from 1.4% up to 28% performance increase. > >> >> > >> >> The full listing is below. > >> >> > >> >> Suggestions about what else to test are welcome! > >> >> > >> >> Sincerely yours, > >> >> Ivan > >> >> > >> >> --------------------- > >> >> > >> >> /** > >> >> * Copyright (c) 2014, Oracle and/or its affiliates. All rights > >> reserved. > >> >> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > >> >> * > >> >> * This code is free software; you can redistribute it and/or modify > it > >> >> * under the terms of the GNU General Public License version 2 only, > as > >> >> * published by the Free Software Foundation. Oracle designates this > >> >> * particular file as subject to the "Classpath" exception as > provided > >> >> * by Oracle in the LICENSE file that accompanied this code. > >> >> * > >> >> * This code is distributed in the hope that it will be useful, but > >> >> WITHOUT > >> >> * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY > >> or > >> >> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public > >> License > >> >> * version 2 for more details (a copy is included in the LICENSE file > >> that > >> >> * accompanied this code). > >> >> * > >> >> * You should have received a copy of the GNU General Public License > >> >> version > >> >> * 2 along with this work; if not, write to the Free Software > >> Foundation, > >> >> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. > >> >> * > >> >> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA > 94065 > >> USA > >> >> * or visit www.oracle.com if you need additional information or > have > >> any > >> >> * questions. > >> >> */ > >> >> package org.benches; > >> >> > >> >> import org.openjdk.jmh.annotations.*; > >> >> import org.openjdk.jmh.logic.BlackHole; > >> >> import java.util.concurrent.TimeUnit; > >> >> > >> >> @BenchmarkMode(Mode.AverageTime) > >> >> @OutputTimeUnit(TimeUnit.NANOSECONDS) > >> >> @State(Scope.Benchmark) > >> >> public class IndexOfBench { > >> >> > >> >> > >> >> // > >> >> ||| > >> >> final static char[] source1 = > >> >> > >> > "abababcabcacabcabcabcabcaccbcabcacbcabcacbcbcabcbcbacbcabcbabacbcbacbcabcabcabcabcabcabcabcacbacbacbacabcabcacbacbcabcbcbcaabdacbacabcabacbacabca".toCharArray(); > >> >> final static char[] source2 = > >> >> > >> > "acfacafacfacfacfacafcafcacadcacdacaccacacdacacfcafcafcfacdacadcadcadcdacfacfacdacadcacdcfacfacdacdacdcfacdacdacdacgshgshasdabdahghjgwdshacavcavsc".toCharArray(); > >> >> final static char[] source3 = > >> >> > >> > "tyrtytfytytuytfytuytggfghgdytyftytfdytdshfgjhdfytsfuythgsfhgjhfghtuysdfthgfsdhgystfjhgsfguysthgfgjhgdfjhgsjdghfythgsdfjhggfabduikjhfjhkjhfkjhgkjh".toCharArray(); > >> >> final static char[] target1 = "abd".toCharArray(); > >> >> > >> >> final static char[] source4 = > >> >> > >> > "ahhhahahahahhahahahahhahahahhhahahahahahahahahahahhahahahahahahahahallallalalalalalkakakakakakakakakkakmamamamabammamamamamamaakaklalalaoalalalao".toCharArray(); > >> >> final static char[] source5 = > >> >> > >> > "hgjkhjhjkdghkjhdfkjhgkjhdkjdhgkjdfhgkjdhfgkjdfhgkjhdfkjghkdjghkdjfhgkjhkdjhgkjdfhjkghkdjfhgkjdfhgkjdfhgkjdfhkgabhfkjghdkfjhgkjdfhgkjdfhgkjdfhgkhh".toCharArray(); > >> >> final static char[] target2 = "ab".toCharArray(); > >> >> > >> >> final static char[] source6 = > >> >> > >> > "lskgjsklfjgskldfjgklsfjdlgkjsdflkgjskldfgjsdklfgjsl;kdfgjskldfjglksdfjglksfjglksdfjgklsfdjgslkdfjglksjdflkgsjfalksjdflkfgjsdklfjglskdfjglksdfjghh".toCharArray(); > >> >> final static char[] target3 = "a".toCharArray(); > >> >> > >> >> final static char[] source7 = > >> >> > >> > "lskgajabfagskldfjgklsabclgkjsdflkabsabcdgjsdklfabclbkdfgjskabfjglksdfjabcdfjglabcfjgklsfabgslkdfjglksjdabcdsjfabcdedflabcjsdklfjglskdfabcksdfjghh".toCharArray(); > >> >> final static char[] target4 = "abcde".toCharArray(); > >> >> > >> >> final static char[] source8 = > >> >> > >> > "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".toCharArray(); > >> >> final static char[] target5 = "aaaab".toCharArray(); > >> >> > >> >> @GenerateMicroBenchmark > >> >> public void benchIndexOf_1_A(BlackHole bh) { > >> >> bh.consume(indexOfA(source1, 0, source1.length, target1, 0, > >> >> target1.length, 0)); > >> >> } > >> >> > >> >> @GenerateMicroBenchmark > >> >> public void benchIndexOf_1_B(BlackHole bh) { > >> >> bh.consume(indexOfB(source1, 0, source1.length, target1, 0, > >> >> target1.length, 0)); > >> >> } > >> >> > >> >> @GenerateMicroBenchmark > >> >> public void benchIndexOf_2_A(BlackHole bh) { > >> >> bh.consume(indexOfA(source2, 0, source2.length, target1, 0, > >> >> target1.length, 0)); > >> >> } > >> >> > >> >> @GenerateMicroBenchmark > >> >> public void benchIndexOf_2_B(BlackHole bh) { > >> >> bh.consume(indexOfB(source2, 0, source2.length, target1, 0, > >> >> target1.length, 0)); > >> >> } > >> >> > >> >> @GenerateMicroBenchm > >> >> > >> > > >> > > >> > > > > >