Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, > http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.05/ src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java 669 final boolean sized; 670 final boolean exact; The fields can be declared private. test/java/lang/String/concat/ImplicitStringConcat.java 119 static StringBuffer sb = new StringBuffer(); 120 static { 121 sb.append('a'); 122 } This can be rewritten as: static StringBuffer sb = new StringBuffer().append('a'); or even as: static StringBuffer sb = new StringBuffer("a"); The result should be the same. But it is easier to read. Otherwise looks good. And I think you need now a real reviewer. :-) Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, > Yes, thanks for more polishing, I have uploaded the updates to: > http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.06/ Maybe one more: 1342 Class[] cls = new Class[cnt]; 1343 for (int i = 0; i < cnt; i++) { 1344 cls[i] = int.class; 1345 } The for-loop can be replaced by: Arrays.fill(cls, int.class); Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Maurizio, Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.04/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.05/ On 11/27/2015 10:13 PM, Maurizio Cimadamore wrote: > Looks great - the only minor quibble is that now StringConcat looks like > a regular javac context class; it even has an instance method - it's > therefore best to follow usual initialization pattern for javac components: That makes sense. I had a concern that Gen and StringConcat are circularly dependent on each other, but this seems to be handled by exposing the underconstructed Gen early. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Thanks again, I'll upload the webrevs after further langtools cleanup. On 11/28/2015 12:05 PM, Andrej Golovnin wrote: >>> http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.04/ > > src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java > > 356 ARG, > > The comma is not needed. Yes. > 548 // Mock the recipe to reuse the concat generator code > 549 StringBuilder sb = new StringBuilder(); > 550 for (int c = 0; c < concatType.parameterCount(); c++) { > 551 sb.append(TAG_ARG); > 552 } > 553 recipe = sb.toString(); > > This can be rewritten as: > > char[] value = new char[concatType.parameterCount()]; > Arrays.fill(value, TAG_ARG); > recipe = new String(value); Yes. > And when you add to the JavaLangAccess interface a new method > newStringUnsafe(byte[] value, byte coder), then you can even use the > package private constructor String(byte[], byte) to avoid copying of > bytes. But I'm not sure if it is worthwhile. Actually, java.lang.StringConcatHelper gains access to that constructor. But this is important only hot path, which is linkage method is hopefully not the part of. > 558 int cCount = 0; > 559 int oCount = 0; > 560 for (int i = 0; i < recipe.length(); i++) { > 561 char c = recipe.charAt(i); > 562 if (c == TAG_CONST) cCount++; > 563 if (c == TAG_ARG) oCount++; > 564 } > > This loop is only needed, when the recipe is already known. When you > move the lines 558 and 559 before the line > > 547 if (generateRecipe) { > > then you can initialize oCount directly after the line 547: > > oCount = concatType.parameterCount(); > > and move the for-loop (lines 560-564) into the else clause of the "if > (generateRecipe)" statement. Yes, okay. > Btw. I would rename the variables cCount and oCount to constCount and > argCount. > > 601 Key key = new Key(mt, rec); > 602 MethodHandle mh = CACHE_ENABLE ? CACHE.get(key) : null; > > If you rewrite this lines as: > > 601 Key key = null; > 602 MethodHandle mh = CACHE_ENABLE ? CACHE.get(key = new > Key(mt, rec)) : null; > > then you can avoid creation of Key objects when caching is disabled. I rearranged the code there, so that Key is never needed when caching is disabled. > 669 SIZED_EXACT(true, true), > > The comma is not needed. Yes. > 672 boolean sized, exact; > > sized and exact should be final. Yes. > 1619 private static Class STRING_HELPER; > 1620 > 1621 static { > 1622 try { > 1623 STRING_HELPER = > Class.forName("java.lang.StringConcatHelper"); > 1624 } catch (ClassNotFoundException e) { > 1625 throw new AssertionError(e); > 1626 } > > Why STRING_HELPER is not final? No reason, overlook. Fixed. > I have a small question to the Recipe class. Is it really better to > create elementsRev instead of using the old-school for-loop and > List.get(int) to loop over the list of RecipeElements in the reverse > order? If there is any reason, then could you please document it to > avoid stupid questions in the future? :-) I just think that a for-each loop over a collection is simpler there. > src/java.base/share/classes/java/lang/invoke/StringConcatException.java > > 31 private static final long serialVersionUID = 292L + 9L; > > I see that the pattern for serialVersionUID like this is already used > in four other classes (BootstrapMethodError, MethodType, > WrongMethodTypeException, LambdaConversioinException). Does it have > some special meaning? I'm just curious. I think this means: JSR 292 (invokedynamic) exception, introduces in JDK 9. StringConcatException follows suit. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, >> http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.04/ src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java 356 ARG, The comma is not needed. 548 // Mock the recipe to reuse the concat generator code 549 StringBuilder sb = new StringBuilder(); 550 for (int c = 0; c < concatType.parameterCount(); c++) { 551 sb.append(TAG_ARG); 552 } 553 recipe = sb.toString(); This can be rewritten as: char[] value = new char[concatType.parameterCount()]; Arrays.fill(value, TAG_ARG); recipe = new String(value); And when you add to the JavaLangAccess interface a new method newStringUnsafe(byte[] value, byte coder), then you can even use the package private constructor String(byte[], byte) to avoid copying of bytes. But I'm not sure if it is worthwhile. 558 int cCount = 0; 559 int oCount = 0; 560 for (int i = 0; i < recipe.length(); i++) { 561 char c = recipe.charAt(i); 562 if (c == TAG_CONST) cCount++; 563 if (c == TAG_ARG) oCount++; 564 } This loop is only needed, when the recipe is already known. When you move the lines 558 and 559 before the line 547 if (generateRecipe) { then you can initialize oCount directly after the line 547: oCount = concatType.parameterCount(); and move the for-loop (lines 560-564) into the else clause of the "if (generateRecipe)" statement. Btw. I would rename the variables cCount and oCount to constCount and argCount. 601 Key key = new Key(mt, rec); 602 MethodHandle mh = CACHE_ENABLE ? CACHE.get(key) : null; If you rewrite this lines as: 601 Key key = null; 602 MethodHandle mh = CACHE_ENABLE ? CACHE.get(key = new Key(mt, rec)) : null; then you can avoid creation of Key objects when caching is disabled. 669 SIZED_EXACT(true, true), The comma is not needed. 672 boolean sized, exact; sized and exact should be final. 1619 private static Class STRING_HELPER; 1620 1621 static { 1622 try { 1623 STRING_HELPER = Class.forName("java.lang.StringConcatHelper"); 1624 } catch (ClassNotFoundException e) { 1625 throw new AssertionError(e); 1626 } Why STRING_HELPER is not final? I have a small question to the Recipe class. Is it really better to create elementsRev instead of using the old-school for-loop and List.get(int) to loop over the list of RecipeElements in the reverse order? If there is any reason, then could you please document it to avoid stupid questions in the future? :-) src/java.base/share/classes/java/lang/invoke/StringConcatException.java 31 private static final long serialVersionUID = 292L + 9L; I see that the pattern for serialVersionUID like this is already used in four other classes (BootstrapMethodError, MethodType, WrongMethodTypeException, LambdaConversioinException). Does it have some special meaning? I'm just curious. Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Looks great - the only minor quibble is that now StringConcat looks like a regular javac context class; it even has an instance method - it's therefore best to follow usual initialization pattern for javac components: i.e. protected static final Context.Key concatKey = new Context.Key<>(); public static StringCOncat instance(Context context) { StringConcat instance = context.get(unlambdaKey); if (instance == null) { instance = new StringConcat(context); } return instance; } private StringConcat(Context context) { context.put(concatKey, this); //init all context dependent fields } So, this will make StringConcat a regular javac component that can be instantiated with: StringConcat.instance(context) Which would be done inside Gen. Note that this will mess up with the fact that your helpers extends directly from StringConcat, so you will probably need to separate the internal helper classes from the StringConcat API which will be used form outside. Let me know if you need help in getting this wired up correctly. Regarding the tests - I note that you have a good combo test in the JDK repo; one thing that this type of test doesn't cover is if you accidentally get the right result with a bad bytecode. I'll leave the decision as to whether to cover this to you (since the coverage provided by the JDK test is adequate anyway). MAurizio On 27/11/15 17:34, Aleksey Shipilev wrote: Hi Maurizio, Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.03/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.04/ On 11/27/2015 04:03 AM, Maurizio Cimadamore wrote: Thanks! The patch looks better, and having the code all in one place definitively helps. I think there is still something that you can do to improve the code - i.e. all strategies seem to do: ... Then, all you have to do is to create the right helper given the command line options, and that object will have an API that can be used by Gen to perform string concact effectively, w/o switching and minimizing duplication. What do you think? I agree, these does look better, see the updated webrev. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Andrej, On 11/27/2015 03:29 PM, Andrej Golovnin wrote: >> http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/ > > src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java > > The inner class Key is final. But the inner classes Recipe and > RecipeElement are not final. Why? > > The #equals method of the Recipe class has this line: > > 295 return elements.equals(recipe.elements); > > where elements is a list of RecipeElements. But RecipeElement does not > implement the #equals method. Is it intended or a mistake? > I think, the RecipeElement class must implement the #equals and > #hashCode methods. In other case the cache will not work as expected. Ow! Thanks, these are fixed in the updated webrevs! http://cr.openjdk.java.net/~shade/8085796/ Cheers, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Maurizio, Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.03/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.04/ On 11/27/2015 04:03 AM, Maurizio Cimadamore wrote: > Thanks! The patch looks better, and having the code all in one place > definitively helps. I think there is still something that you can do to > improve the code - i.e. all strategies seem to do: ... > Then, all you have to do is to create the right helper given the command > line options, and that object will have an API that can be used by Gen > to perform string concact effectively, w/o switching and minimizing > duplication. > > What do you think? I agree, these does look better, see the updated webrev. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi, Again, I'll post the updates after addressing another round of Maurizio's comments :) On 11/27/2015 10:47 AM, Andrej Golovnin wrote: >> http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/ > > test/tools/javac/T5024091/T5024091.java > src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/StringConcat.java > > The files do not have the copyright header. Thanks again, added. > And I have one stupid question to com.sun.tools.javac.jvm.StringConcat: > > The constants TAG_ARG and TAG_CONST are defined as Strings. From > performance standpoint of view, would it be not better to use char > instead of String? > > I understand that: > > if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) > > is easier to read than: > > if (a.indexOf(TAG_CONST) != -1 || a.indexOf(TAG_ARG) != -1) > > But when you have a huge project to compile, maybe it can help to > reduce the compile time. Yes, it might makes sense to do this microoptimization. Done. > The line 330 in com.sun.tools.javac.jvm.StringConcat: > > 330 recipe.append("null"); > > Maybe it is better to rewrite it as: > > 330 recipe.append((String) null); > The first one requires to copy byte values from the "null" String. The > second one ends up in the call to AsbtractStringBuilder.appendNull(). Ditto. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Alexey! Not a thorough review, but a few of minor comments: 1) As you touch Long.java anyway, could you change 508 while (i <= Integer.MIN_VALUE) { to 508 while (i < Integer.MIN_VALUE) { , which may save us a half of nanosecond on some inputs? And the same on the line# 563. 2) Integer.java and Long.java In the javadoc, just for a sake of accuracy: 550 * @return index of the most significant digit *or minus sign, if present* 3) StringConcatFactory.java 255 acc = new StringBuilder(); wouldn't it be a bit more efficient to do `acc.setLength(0)`? 4) StringConcatFactory.java 586 try { 587 mh = generate(caller, mt, rec); 588 } catch (Throwable t) { 589 if (t instanceof StringConcatException) { 590 throw (StringConcatException)t; 591 } else { 592 throw new StringConcatException("Generator failed", t); 593 } 594 } it seems to be a bit more straight-forward: try { mh = generate(caller, mt, rec); } catch (StringConcatException sce) { throw sce; } catch (Throwable t) { throw new StringConcatException("Generator failed", t); } Sincerely yours, Ivan On 19.11.2015 23:58, Aleksey Shipilev wrote: Hi, I would like to have a code pre-review for Indify String Concat changes. For the reference, the rationale, design constraints, etc. are outlined in the JEP itself: https://bugs.openjdk.java.net/browse/JDK-8085796 The experimental notes, including the bytecode shapes, benchmark data, and benchmark analysis, interaction with Compact Strings are here: http://cr.openjdk.java.net/~shade/8085796/notes.txt Javadoc (CCC is in progress): http://cr.openjdk.java.net/~shade/8085796/javadocs/StringConcatFactory.html === javac change: Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.00/ A little guide: * We cut in at the same two points current javac does the String concat. But instead of emitting the StringBuilder append chains, we collect all arguments and hand them over to JDK's java.lang.invoke.StringConcatFactory. * The bytecode flavor is guarded by hidden -XDstringConcat option, with three possible values: inline, indy, indyWithConstants. "indyWithConstants" is selected as the default bytecode flavor. * Since a String concat expression may contain more operands than any Java call can manage, we sometimes have to peel the concat in several back-to-back calls, and concat-ting the results. That peeling is one-level, and can accommodate 200*200 = 4 arguments, well beyond what can be achieved for a singular String concat expression (limited either by constant pool size, or by method code size). === JDK change: Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.00/ A little guide: * The entry point is java.lang.invoke.StringConcatFactory. Its methods would be used as invokedynamic bootstrap methods, at least in javac. * There are multiple concatenation strategies supported by SCF. We explored the performance characteristics of all factories, chosen one performing good throughput- and startup-wise. All strategies are fully functional, covered by tests, and kept as the base for the future work and fine tuning. * Most strategies delegate to public StringBuilder API, which makes them quite simple, since they are not forced to deal with actual storage (this got complicated with Compact Strings). * The only strategy that builds the String directly is MH_INLINE_SIZED_EXACT strategy, and thus it is the most complicated of all. It uses private java.lang.StringConcatHelper class to get access to package-private (Integer|Long).(stringSize|getChar*) methods; the access to j.l.SCH is granted by a private lookup. * Some tests assume the particular code shape and/or invokedynamic counts, needed to fix those as well. === Build change (I don't copy build-dev@ to avoid spamming that list with langtools/jdk reviews; this section is FYI) Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.root.00/ * This one is simple: we exempt java.base and interim classes from Indy String Concat to avoid bootstrapping issues. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, > http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/ src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java The inner class Key is final. But the inner classes Recipe and RecipeElement are not final. Why? The #equals method of the Recipe class has this line: 295 return elements.equals(recipe.elements); where elements is a list of RecipeElements. But RecipeElement does not implement the #equals method. Is it intended or a mistake? I think, the RecipeElement class must implement the #equals and #hashCode methods. In other case the cache will not work as expected. Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Ivan, Thanks for the reviews. Again, I will publish the updated webrevs after rewiring for Maurizio's comments. On 11/27/2015 02:33 PM, Ivan Gerasimov wrote: > 1) As you touch Long.java anyway, could you change > 508 while (i <= Integer.MIN_VALUE) { > to > 508 while (i < Integer.MIN_VALUE) { > , which may save us a half of nanosecond on some inputs? > > And the same on the line# 563. No, I think that deserves a separate round of Long/Integer.getChars cleanups, if any. Let's not clobber the Indy String Concat with a general optimization in the class library. > 2) Integer.java and Long.java > In the javadoc, just for a sake of accuracy: > 550 * @return index of the most significant digit *or minus sign, > if present* Yes, this is good. Fixed. > 3) StringConcatFactory.java > 255 acc = new StringBuilder(); > wouldn't it be a bit more efficient to do `acc.setLength(0)`? Yes, this is good too. Fixed. > 4) StringConcatFactory.java > > 586 try { > 587 mh = generate(caller, mt, rec); > 588 } catch (Throwable t) { > 589 if (t instanceof StringConcatException) { > 590 throw (StringConcatException)t; > 591 } else { > 592 throw new StringConcatException("Generator > failed", t); > 593 } > 594 } > > it seems to be a bit more straight-forward: > try { > mh = generate(caller, mt, rec); > } catch (StringConcatException sce) { > throw sce; > } catch (Throwable t) { > throw new StringConcatException("Generator failed", t); > } Ah, right, this is better, fixed. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Thanks! The patch looks better, and having the code all in one place definitively helps. I think there is still something that you can do to improve the code - i.e. all strategies seem to do: * collect arguments (on a JCAssignOp and on a JCBinary) * build indy node Currently you have switches on both collect() and xyzConcat methods; what I was trying to suggest is - instead of doing: switch (cond) { case A: case B } Can we organize the code using a class hierarchy - i.e. abstract class StringConcatHelper { List collectAll(JCAssignOp) { ... } //implemented in terms of collect(JCTree, JCTree, as now) List collectAll(JCBinary) { ... } //implemented in terms of collect(JCTree, JCTree, as now) abstract void emit(List args, JCTree tree, Type type); private List collectAll(JCTree, JCTree) { //as current impl } } and then have a bunch of subclasses: class InlineConcatHelper extends StringConcatHelper { //implements inline emit here } abstract class IndyConcatHelper extends StringConcatHelper { //common logic to all indy-based strategy goes here } class ConstantIndyConcatHelper extends IndyConcatHelper { //constant indy logic here } class PlainIndyConcatHelper extends IndyConcatHelper { //plain indy logic here } Then, all you have to do is to create the right helper given the command line options, and that object will have an API that can be used by Gen to perform string concact effectively, w/o switching and minimizing duplication. What do you think? Maurizio On 27/11/15 00:20, Aleksey Shipilev wrote: Hi Maurizio, Thanks for the reviews! Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/ On 11/26/2015 02:40 PM, Maurizio Cimadamore wrote: types.boxedClass(syms.voidType).type; Note that j.l.Void is always created (see call to synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be) corner cases where j.l.Void is not available on certain platforms (i.e. mobile) so we can't rely on it being always there (as it will lead to crashes). The pattern above is safer, as it relies on javac ability to fake a box type if it's missing from the platform. Excellent, thanks! Didn't know that nuisance with j.l.Void availability. Removed in favor of the suggested pattern. * The code for initializing options in Gen is what we would normally model with an enum. I.e. you need something like: enum StringConcatMode { INLINE, INDY_PLAIN, INDY_CONSTANTS; } And then put some logic inside the enum class to parse the option and to return the right enum constant. The code will get better as you can model two constants (allowIndyStringConcat and indyStringConcatConstants) with a single enum value. Yes, indeed, done. * From a design point of view, I see the logic for generating the string concat indy calls as split into two components - one that collects arguments, iterates through the peels and the merges them together; this part is independent from the particular indy strategy used. There's also a second part - the one that generates the list of static/dynamic args (given the flattened args to the string concat) and then generates the final indy call. This latter part is variable, i.e. you get different logic for different indy strategies - which means the code ends up with many if/else blocks. A possible suggestioin is to make the code more object oriented - keep the first part as is, but have an helper class to do the job of mapping an argument into a static arg/recipe and then generate the final indy call. In other wordas, what you need is a builder pattern - you need an object that you can append args to and, when finished, you can call a method to generate the resulting indy call. There will be a builder for each indy strategy. This is just a suggestion on how I would organize the code - it's not a blocking comment - but I see the current strategy not scaling very well if e.g. you start adding more strategies. Yes, I stepped further and created a separate helper class that does all the String concat. Moved all the logic, including the current StringBuilder-based one there, this lets to un-clutter Gen itself. Please see the updated webrevs. I would need to polish langtools changes a bit more, but tell me if the general direction is what you like. * I need some confirmation here // Concat the String representation of the constant, except + // for the case it contains special tags, which requires us + // to expose it as detached constant. + String a = arg.type.stringValue(); + if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) { + recipe.append(TAG_CONST); + staticArgs.add(a); + } else { + recipe.append(a); + } My understanding is that your strategy builds a 'recipe' which is basically the result of the concat. Since some of the arguments are dynamic, you need to put special markers in the recipe, to tell the BSM to fetch the
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Maurizio, Thanks for the reviews! Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/ On 11/26/2015 02:40 PM, Maurizio Cimadamore wrote: > types.boxedClass(syms.voidType).type; > > Note that j.l.Void is always created (see call to > synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be) > corner cases where j.l.Void is not available on certain platforms (i.e. > mobile) so we can't rely on it being always there (as it will lead to > crashes). The pattern above is safer, as it relies on javac ability to > fake a box type if it's missing from the platform. Excellent, thanks! Didn't know that nuisance with j.l.Void availability. Removed in favor of the suggested pattern. > * The code for initializing options in Gen is what we would normally > model with an enum. I.e. you need something like: > > enum StringConcatMode { >INLINE, >INDY_PLAIN, >INDY_CONSTANTS; > } > > And then put some logic inside the enum class to parse the option and to > return the right enum constant. The code will get better as you can > model two constants (allowIndyStringConcat and > indyStringConcatConstants) with a single enum value. Yes, indeed, done. > * From a design point of view, I see the logic for generating the string > concat indy calls as split into two components - one that collects > arguments, iterates through the peels and the merges them together; this > part is independent from the particular indy strategy used. There's also > a second part - the one that generates the list of static/dynamic args > (given the flattened args to the string concat) and then generates the > final indy call. This latter part is variable, i.e. you get different > logic for different indy strategies - which means the code ends up with > many if/else blocks. A possible suggestioin is to make the code more > object oriented - keep the first part as is, but have an helper class to > do the job of mapping an argument into a static arg/recipe and then > generate the final indy call. In other wordas, what you need is a > builder pattern - you need an object that you can append args to and, > when finished, you can call a method to generate the resulting indy > call. There will be a builder for each indy strategy. This is just a > suggestion on how I would organize the code - it's not a blocking > comment - but I see the current strategy not scaling very well if e.g. > you start adding more strategies. Yes, I stepped further and created a separate helper class that does all the String concat. Moved all the logic, including the current StringBuilder-based one there, this lets to un-clutter Gen itself. Please see the updated webrevs. I would need to polish langtools changes a bit more, but tell me if the general direction is what you like. > * I need some confirmation here > > // Concat the String representation of the constant, except > + // for the case it contains special tags, which requires us > + // to expose it as detached constant. > + String a = arg.type.stringValue(); > + if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) { > + recipe.append(TAG_CONST); > + staticArgs.add(a); > + } else { > + recipe.append(a); > + } > > > My understanding is that your strategy builds a 'recipe' which is > basically the result of the concat. Since some of the arguments are > dynamic, you need to put special markers in the recipe, to tell the BSM > to fetch the dynamic argument and stick it in there. Correct? This in > turn creates a problem, as it's possible for a concat _constant_ > argument to contain the special values themselves - if that's the case, > you use static arguments essentially as an escaping mechanism, right? Yes, that is a correct understanding. The credit for the idea actually goes to John Rose. This scheme has a nice implication that *most* recipes would have only the recipe string + dynamic arguments, and no other static arguments. Which saves space in constant pool. Also, some compilers may find it easier to avoid building a complicated recipe, and rather pass the String constants untouched. There is a guidance in StringConcatFactory for that: @apiNote Code generators have three distinct ways to process a constant string operand S in a string concatenation expression. First, S can be materialized as a reference (using ldc) and passed as an ordinary argument (recipe '\1'). Or, S can be stored in the constant pool and passed as a constant (recipe '\2') . Finally, if S contains neither of the recipe tag characters ('\1', '\2') then S can be interpolated into the recipe itself, causing its characters to be inserted into the result. > * I would expect more of a combinatorial test where you check i.e. > string concat of different sizes and args - i.e. the dimension of the > combinatorial spaces are: > > - size - how many args to the string concat? This as non obvious > consequences in terms of
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, > http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.02/ src/java.base/share/classes/java/lang/Integer.java src/java.base/share/classes/java/lang/Long.java I miss JavaDocs for the returned value of the methods #getChars(int, int, byte[]) and #getCharsUTF16(int, int, byte[]). And it would be nice to have JavaDocs for the parameters too. src/java.base/share/classes/java/lang/StringConcatHelper.java src/java.base/share/classes/java/lang/invoke/StringConcatException.java test/java/lang/String/concat/ImplicitStringConcatShapesTestGen.java test/java/lang/String/concat/update-tests.sh The files do not have a copyright header. Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Alex, some comments on the langtools code below: * please remove the new voidClassType constant from Symtab - and use this idiom instead: types.boxedClass(syms.voidType).type; Note that j.l.Void is always created (see call to synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be) corner cases where j.l.Void is not available on certain platforms (i.e. mobile) so we can't rely on it being always there (as it will lead to crashes). The pattern above is safer, as it relies on javac ability to fake a box type if it's missing from the platform. * The code for initializing options in Gen is what we would normally model with an enum. I.e. you need something like: enum StringConcatMode { INLINE, INDY_PLAIN, INDY_CONSTANTS; } And then put some logic inside the enum class to parse the option and to return the right enum constant. The code will get better as you can model two constants (allowIndyStringConcat and indyStringConcatConstants) with a single enum value. * From a design point of view, I see the logic for generating the string concat indy calls as split into two components - one that collects arguments, iterates through the peels and the merges them together; this part is independent from the particular indy strategy used. There's also a second part - the one that generates the list of static/dynamic args (given the flattened args to the string concat) and then generates the final indy call. This latter part is variable, i.e. you get different logic for different indy strategies - which means the code ends up with many if/else blocks. A possible suggestioin is to make the code more object oriented - keep the first part as is, but have an helper class to do the job of mapping an argument into a static arg/recipe and then generate the final indy call. In other wordas, what you need is a builder pattern - you need an object that you can append args to and, when finished, you can call a method to generate the resulting indy call. There will be a builder for each indy strategy. This is just a suggestion on how I would organize the code - it's not a blocking comment - but I see the current strategy not scaling very well if e.g. you start adding more strategies. * I need some confirmation here // Concat the String representation of the constant, except + // for the case it contains special tags, which requires us + // to expose it as detached constant. + String a = arg.type.stringValue(); + if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) { + recipe.append(TAG_CONST); + staticArgs.add(a); + } else { + recipe.append(a); + } My understanding is that your strategy builds a 'recipe' which is basically the result of the concat. Since some of the arguments are dynamic, you need to put special markers in the recipe, to tell the BSM to fetch the dynamic argument and stick it in there. Correct? This in turn creates a problem, as it's possible for a concat _constant_ argument to contain the special values themselves - if that's the case, you use static arguments essentially as an escaping mechanism, right? * I would expect more of a combinatorial test where you check i.e. string concat of different sizes and args - i.e. the dimension of the combinatorial spaces are: - size - how many args to the string concat? This as non obvious consequences in terms of peeling - arg types: constants, dynamic, special constants (i.e. TAG_CONST, TAG_ARG) - strategy used: vanilla, indy_constats, indy_plain - codegen context: assignop vs. binary - target option (-target 8 vs -target 9) There are several combinatorial tests that generate a source and check it on the fly - one example (close to what's needed here) is: http://hg.openjdk.java.net/jdk9/dev/langtools/file/176472b94f2e/test/tools/javac/lambda/bytecode/TestLambdaBytecode.java Maurizio On 19/11/15 20:58, Aleksey Shipilev wrote: Hi, I would like to have a code pre-review for Indify String Concat changes. For the reference, the rationale, design constraints, etc. are outlined in the JEP itself: https://bugs.openjdk.java.net/browse/JDK-8085796 The experimental notes, including the bytecode shapes, benchmark data, and benchmark analysis, interaction with Compact Strings are here: http://cr.openjdk.java.net/~shade/8085796/notes.txt Javadoc (CCC is in progress): http://cr.openjdk.java.net/~shade/8085796/javadocs/StringConcatFactory.html === javac change: Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.00/ A little guide: * We cut in at the same two points current javac does the String concat. But instead of emitting the StringBuilder append chains, we collect all arguments and hand them over to JDK's java.lang.invoke.StringConcatFactory. * The bytecode flavor is guarded by hidden -XDstringConcat option, with three possible values: inline, indy, indyWithConstants. "indyWithConstants" is selected as the default bytecode flavor. * Since a String concat
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Andrej, Thanks for the review, I'll send the updated webrevs once I address Maurizio's review comments. On 11/26/2015 11:42 AM, Andrej Golovnin wrote: >> http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.02/ > > src/java.base/share/classes/java/lang/Integer.java > src/java.base/share/classes/java/lang/Long.java > > I miss JavaDocs for the returned value of the methods #getChars(int, > int, byte[]) and #getCharsUTF16(int, int, byte[]). And it would be > nice to have JavaDocs for the parameters too. Added, thanks! > src/java.base/share/classes/java/lang/StringConcatHelper.java > src/java.base/share/classes/java/lang/invoke/StringConcatException.java > test/java/lang/String/concat/ImplicitStringConcatShapesTestGen.java > test/java/lang/String/concat/update-tests.sh > > The files do not have a copyright header. Oops, good catch, added. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
On 26/11/15 11:40, Maurizio Cimadamore wrote: * The code for initializing options in Gen is what we would normally model with an enum. I.e. you need something like: enum StringConcatMode { INLINE, INDY_PLAIN, INDY_CONSTANTS; } And then put some logic inside the enum class to parse the option and to return the right enum constant. The code will get better as you can model two constants (allowIndyStringConcat and indyStringConcatConstants) with a single enum value. Forgot - we have examples of this pattern: http://hg.openjdk.java.net/jdk9/jdk9/langtools/file/8356d7a909a2/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Analyzer.java#l125 Maurizio
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, > http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/ test/tools/javac/T5024091/T5024091.java src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/StringConcat.java The files do not have the copyright header. And I have one stupid question to com.sun.tools.javac.jvm.StringConcat: The constants TAG_ARG and TAG_CONST are defined as Strings. From performance standpoint of view, would it be not better to use char instead of String? I understand that: if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) is easier to read than: if (a.indexOf(TAG_CONST) != -1 || a.indexOf(TAG_ARG) != -1) But when you have a huge project to compile, maybe it can help to reduce the compile time. The line 330 in com.sun.tools.javac.jvm.StringConcat: 330 recipe.append("null"); Maybe it is better to rewrite it as: 330 recipe.append((String) null); The first one requires to copy byte values from the "null" String. The second one ends up in the call to AsbtractStringBuilder.appendNull(). Best regards, Andrej Golovnin
Re: Code (Pre-)Review for JEP 280: Indify String Concat
On 11/25/2015 03:07 AM, Claes Redestad wrote: > Some variants might be non-existent in a majority of programs, others > might not be needed early on, which could help reduce the footprint a bit. Thanks, indeed, it is savvy to make the MH resolution lazier. See the update: http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.02/ http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.01/ http://cr.openjdk.java.net/~shade/8085796/webrev.root.00/ More reviews please :) Especially the langtools part. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Paul, Thanks for the review! Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.01/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.01/ More reviews, please :) Comments below: On 11/24/2015 05:16 PM, Paul Sandoz wrote: >> On 19 Nov 2015, at 21:58, Aleksey Shipilev>> wrote: >> Webrev: >> http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.00/ > — > > 134 if ("inline".equals(concat)) { > 135 allowIndyStringConcat = false; > 136 indyStringConcatConstants = false; > 137 } else if ("indy".equals(concat)) { > 138 allowIndyStringConcat = true; > 139 indyStringConcatConstants = false; > 140 } else if ("indyWithConstants".equals(concat)) { > 141 allowIndyStringConcat = true; > 142 indyStringConcatConstants = true; > 143 } else { > 144 Assert.error("Unknown stringConcat: " + concat); > 145 throw new IllegalStateException("Unknown > stringConcat: " + concat); > 146 } > > If you are in anyway inclined you could use a switch on concat. Yes, I was trying not to use (ahem) "new" language features in a language compiler. But it seems javac uses String switches everywhere. Fixed. > I have a marginal preference for an enum StringConcatStrategy with StringBuilder, Indy, IndyWithConstants values. But i will defer to other javac reviewers. I think string values are fine, also because -XD options seem to be Stringly-typed anyhow. >> Webrev: >> http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.00/ >> > My inclination would be to consolidate all the sys props under one doPriv > block. Yes, done. > 200 private static final Map CACHE; > > ConcurrentMap? To make it clearer that this is operated on concurrently. Yes. Fixed. > 287 elements = new ArrayList<>(el); > 288 Collections.reverse(el); > 289 elementsRev = new ArrayList<>(el); > > elementsRev = e1? Yes. Fixed. > It’s mildly surprising that there is no supported way to create a reverse > view of a ListIterator. > 408 if (DEBUG) { > 409 System.out.println("StringConcatFactory " + STRATEGY + " is > here for " + argTypes); > 410 } > > No recursion! :-) Yeah, this code is exempted from Indified String Concat to avoid circularity ;) > > 530 if (caller == null) { > 531 throw new StringConcatException("Lookup is null"); > 532 } > 533 > 534 if (name == null) { > 535 throw new StringConcatException("Name is null"); > 536 } > 537 > 538 if (argTypes == null) { > 539 throw new StringConcatException("Argument types are null"); > 540 } > > It’s odd to not see NPEs being used instead. Yes, should have used Objects.requireNonNull. The NPE exceptions would not ever fire if you invoke BSM through invokedynamic anyhow. Fixed. > Can the generated class extend from an existing compiled class or > reuse static method calls of another class? then it might be possible > to move ASM generation of debug code into java source code. In theory, it can, but I would like to avoid that. The debugging bytecode is not really complicated to warrant moving to a separate Java source file (which should probably required to be public to begin with) and blow up the implementation complexity. MethodHandle-based strategies already do something similar to what you are proposing, but they can use look up private methods. > Here is a wild thought, i dunno if it makes any difference. > U.defineAnonymousClass allows one to patch the constant pool, so the > string constants could be patched in. See! You can experiment with wild ideas like these after the infrastructure lands in JDK. > You seem to be hedging your bets about what static strategy to > choose. Do you intend to whittle down the strategies over time? Same > argument applies to caching. Some time before the JDK 9 release it > would be good to reduce the scope. My rationale for keeping all strategies untouched is based on the premise that most improvements in strategies would probably be some form of existing strategies, and having all strategies handy and tested simplifies improvements. Also, we may have to do a real-life experiments for choosing a better default strategy. > Do you anticipate it might be possible to reduce the scope of the > existing JIT optimisations? Yes, we don't need to extend OptimizeStringConcat heavily anymore, and once all the StringBuilder::append-shaped bytecode phases out of existence, we can drop the compiler optimization on the floor. Thanks, -Aleksey
Re: Code (Pre-)Review for JEP 280: Indify String Concat
Hi Aleksey, in http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.01/src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java.html there are a number of internal strategy classes that will set up a number of MethodHandles in bulk once initialized: 1561 private static final MethodHandle NEW_STRING, NEW_STRING_CHECKED; 1562 private static final MethodHandle NEW_ARRAY; 1563 private static final MapPREPENDERS = new HashMap<>(); 1564 private static final Map LENGTH_MIXERS = new HashMap<>(); 1565 private static final Map CODER_MIXERS = new HashMap<>(); 1580 for (Class ptype : new Class[]{boolean.class, byte.class, char.class, short.class, int.class, long.class, String.class}) { 1581 PREPENDERS.put(ptype, lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "prepend", int.class, int.class, byte[].class, byte.class, ptype)); 1582 LENGTH_MIXERS.put(ptype, lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "mixLen", int.class, int.class, ptype)); 1583 CODER_MIXERS.put(ptype, lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "mixCoder", byte.class, byte.class, ptype)); 1584 } 1585 1586 NEW_STRING = lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "newString", String.class, byte[].class, byte.class); 1587 NEW_STRING_CHECKED = lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "newStringChecked", String.class, int.class, byte[].class, byte.class); 1588 NEW_ARRAY = lookupStatic(Lookup.IMPL_LOOKUP, MethodHandleInlineCopyStrategy.class, "newArray", byte[].class, int.class, byte.class); I'm wondering if these could be made more lazily initialized in manners similar to how MethodHandleImpl and BoundMethodHandle were recently updated, e.g.: private static final ConcurrentMap PREPENDERS = new ConcurrentHashMap<>(); private static MethodHandle getPrepender(Class ptype) { return PREPENDERS.computeIfAbsent(ptype, new Function () { @Override public MethodHandle apply(Class ptype) { // will throw AssertionError if ptype not in {boolean.class, byte.class, char.class, short.class, int.class, long.class, String.class} lookupStatic(Lookup.IMPL_LOOKUP, stringHelper, "prepend", int.class, int.class, byte[].class, byte.class, ptype); } }); } Some variants might be non-existent in a majority of programs, others might not be needed early on, which could help reduce the footprint a bit. Thanks! /Claes On 2015-11-24 23:30, Aleksey Shipilev wrote: Hi Paul, Thanks for the review! Updated webrevs: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.01/ http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.01/ More reviews, please :) Comments below: On 11/24/2015 05:16 PM, Paul Sandoz wrote: On 19 Nov 2015, at 21:58, Aleksey Shipilev wrote: Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.00/ — 134 if ("inline".equals(concat)) { 135 allowIndyStringConcat = false; 136 indyStringConcatConstants = false; 137 } else if ("indy".equals(concat)) { 138 allowIndyStringConcat = true; 139 indyStringConcatConstants = false; 140 } else if ("indyWithConstants".equals(concat)) { 141 allowIndyStringConcat = true; 142 indyStringConcatConstants = true; 143 } else { 144 Assert.error("Unknown stringConcat: " + concat); 145 throw new IllegalStateException("Unknown stringConcat: " + concat); 146 } If you are in anyway inclined you could use a switch on concat. Yes, I was trying not to use (ahem) "new" language features in a language compiler. But it seems javac uses String switches everywhere. Fixed. I have a marginal preference for an enum StringConcatStrategy with StringBuilder, Indy, IndyWithConstants values. But i will defer to other javac reviewers. I think string values are fine, also because -XD options seem to be Stringly-typed anyhow. Webrev: http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.00/ My inclination would be to consolidate all the sys props under one doPriv block. Yes, done. 200 private static final Map CACHE; ConcurrentMap? To make it clearer that this is operated on concurrently. Yes. Fixed. 287 elements = new ArrayList<>(el); 288 Collections.reverse(el); 289 elementsRev = new ArrayList<>(el); elementsRev = e1? Yes. Fixed. It’s mildly surprising that there is no supported way to create a reverse view of a ListIterator. 408 if (DEBUG) { 409
Re: Code (Pre-)Review for JEP 280: Indify String Concat
> On 19 Nov 2015, at 21:58, Aleksey Shipilev> wrote: > > Hi, > > I would like to have a code pre-review for Indify String Concat changes. > For the reference, the rationale, design constraints, etc. are outlined > in the JEP itself: > https://bugs.openjdk.java.net/browse/JDK-8085796 > > The experimental notes, including the bytecode shapes, benchmark data, > and benchmark analysis, interaction with Compact Strings are here: > http://cr.openjdk.java.net/~shade/8085796/notes.txt > > Javadoc (CCC is in progress): > > http://cr.openjdk.java.net/~shade/8085796/javadocs/StringConcatFactory.html > > > === javac change: > > Webrev: > http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.00/ > I am not a langtools expert but the code made sense to me (as i have dabbled in hacking javac to generate indy) Gen — 134 if ("inline".equals(concat)) { 135 allowIndyStringConcat = false; 136 indyStringConcatConstants = false; 137 } else if ("indy".equals(concat)) { 138 allowIndyStringConcat = true; 139 indyStringConcatConstants = false; 140 } else if ("indyWithConstants".equals(concat)) { 141 allowIndyStringConcat = true; 142 indyStringConcatConstants = true; 143 } else { 144 Assert.error("Unknown stringConcat: " + concat); 145 throw new IllegalStateException("Unknown stringConcat: " + concat); 146 } If you are in anyway inclined you could use a switch on concat. I have a marginal preference for an enum StringConcatStrategy with StringBuilder, Indy, IndyWithConstants values. But i will defer to other javac reviewers. > A little guide: > > * We cut in at the same two points current javac does the String > concat. But instead of emitting the StringBuilder append chains, we > collect all arguments and hand them over to JDK's > java.lang.invoke.StringConcatFactory. > > * The bytecode flavor is guarded by hidden -XDstringConcat option, with > three possible values: inline, indy, indyWithConstants. > "indyWithConstants" is selected as the default bytecode flavor. > > * Since a String concat expression may contain more operands than any > Java call can manage, we sometimes have to peel the concat in several > back-to-back calls, and concat-ting the results. That peeling is > one-level, and can accommodate 200*200 = 4 arguments, well beyond > what can be achieved for a singular String concat expression (limited > either by constant pool size, or by method code size). > > > === JDK change: > > Webrev: > http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.00/ > StringConcatFactory — 187 final String key = "java.lang.invoke.stringConcat.debug"; 188 String str = AccessController.doPrivileged( 189 new GetPropertyAction(key), null, 190 new PropertyPermission(key, "read")); 191 DEBUG = Boolean.valueOf(str); See also Boolean.getBoolean My inclination would be to consolidate all the sys props under one doPriv block. 200 private static final Map CACHE; ConcurrentMap? To make it clearer that this is operated on concurrently. 287 elements = new ArrayList<>(el); 288 Collections.reverse(el); 289 elementsRev = new ArrayList<>(el); elementsRev = e1? It’s mildly surprising that there is no supported way to create a reverse view of a ListIterator. 408 if (DEBUG) { 409 System.out.println("StringConcatFactory " + STRATEGY + " is here for " + argTypes); 410 } No recursion! :-) 530 if (caller == null) { 531 throw new StringConcatException("Lookup is null"); 532 } 533 534 if (name == null) { 535 throw new StringConcatException("Name is null"); 536 } 537 538 if (argTypes == null) { 539 throw new StringConcatException("Argument types are null"); 540 } It’s odd to not see NPEs being used instead. Can the generated class extend from an existing compiled class or reuse static method calls of another class? then it might be possible to move ASM generation of debug code into java source code. Here is a wild thought, i dunno if it makes any difference. U.defineAnonymousClass allows one to patch the constant pool, so the string constants could be patched in. You seem to be hedging your bets about what static strategy to choose. Do you intend to whittle down the strategies over time? Same argument applies to caching. Some time before the JDK 9 release it would be good to reduce the scope. Do you anticipate it might be possible to reduce the scope of the existing JIT optimisations? Paul. > A little guide: > > * The entry point is