Re: Code (Pre-)Review for JEP 280: Indify String Concat

Maurizio Cimadamore Thu, 26 Nov 2015 17:05:04 -0800

Thanks! The patch looks better, and having the code all in one placedefinitively helps. I think there is still something that you can do toimprove the code - i.e. all strategies seem to do:


* collect arguments (on a JCAssignOp and on a JCBinary)
* build indy node

Currently you have switches on both collect() and xyzConcat methods;what I was trying to suggest is - instead of doing:


<common code>
switch (cond) {
   case A:
      <do strategy x>
   case B
      <do strategy y>
}

Can we organize the code using a class hierarchy - i.e.

abstract class StringConcatHelper {

List<JCTree> collectAll(JCAssignOp) { ... } //implemented in termsof collect(JCTree, JCTree, as now)List<JCTree> collectAll(JCBinary) { ... } //implemented in terms ofcollect(JCTree, JCTree, as now)

    abstract void emit(List<JCTree> args, JCTree tree, Type type);

    private List<JCTree> collectAll(JCTree, JCTree) { //as current impl }
}

and then have a bunch of subclasses:

class InlineConcatHelper extends StringConcatHelper {
   //implements inline emit here
}

abstract class IndyConcatHelper extends StringConcatHelper {
  //common logic to all indy-based strategy goes here
}

class ConstantIndyConcatHelper extends IndyConcatHelper {
  //constant indy logic here
}

class PlainIndyConcatHelper extends IndyConcatHelper {
  //plain indy logic here
}

Then, all you have to do is to create the right helper given the commandline options, and that object will have an API that can be used by Gento perform string concact effectively, w/o switching and minimizingduplication.


What do you think?

Maurizio

On 27/11/15 00:20, Aleksey Shipilev wrote:

Hi Maurizio,

Thanks for the reviews!

Updated webrevs:
   http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/
   http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/


On 11/26/2015 02:40 PM, Maurizio Cimadamore wrote:

types.boxedClass(syms.voidType).type;

Note that j.l.Void is always created (see call to
synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be)
corner cases where j.l.Void is not available on certain platforms (i.e.
mobile) so we can't rely on it being always there (as it will lead to
crashes). The pattern above is safer, as it relies on javac ability to
fake a box type if it's missing from the platform.

Excellent, thanks! Didn't know that nuisance with j.l.Void availability.
Removed in favor of the suggested pattern.

* The code for initializing options in Gen is what we would normally
model with an enum. I.e. you need something like:

enum StringConcatMode {
    INLINE,
    INDY_PLAIN,
    INDY_CONSTANTS;
}

And then put some logic inside the enum class to parse the option and to
return the right enum constant. The code will get better as you can
model two constants (allowIndyStringConcat and
indyStringConcatConstants) with a single enum value.

Yes, indeed, done.

* From a design point of view, I see the logic for generating the string
concat indy calls as split into two components - one that collects
arguments, iterates through the peels and the merges them together; this
part is independent from the particular indy strategy used. There's also
a second part - the one that generates the list of static/dynamic args
(given the flattened args to the string concat) and then generates the
final indy call. This latter part is variable, i.e. you get different
logic for different indy strategies - which means the code ends up with
many if/else blocks. A possible suggestioin is to make the code more
object oriented - keep the first part as is, but have an helper class to
do the job of mapping an argument into a static arg/recipe and then
generate the final indy call. In other wordas, what you need is a
builder pattern - you need an object that you can append args to and,
when finished, you can call a method to generate the resulting indy
call. There will be a builder for each indy strategy. This is just a
suggestion on how I would organize the code - it's not a blocking
comment - but I see the current strategy not scaling very well if e.g.
you start adding more strategies.

Yes, I stepped further and created a separate helper class that does all
the String concat. Moved all the logic, including the current
StringBuilder-based one there, this lets to un-clutter Gen itself.

Please see the updated webrevs. I would need to polish langtools changes
a bit more, but tell me if the general direction is what you like.

* I need some confirmation here

// Concat the String representation of the constant, except
+ // for the case it contains special tags, which requires us
+ // to expose it as detached constant.
+ String a = arg.type.stringValue();
+ if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) {
+ recipe.append(TAG_CONST);
+ staticArgs.add(a);
+ } else {
+ recipe.append(a);
+ }


My understanding is that your strategy builds a 'recipe' which is
basically the result of the concat. Since some of the arguments are
dynamic, you need to put special markers in the recipe, to tell the BSM
to fetch the dynamic argument and stick it in there. Correct? This in
turn creates a problem, as it's possible for a concat _constant_
argument to contain the special values themselves - if that's the case,
you use static arguments essentially as an escaping mechanism, right?

Yes, that is a correct understanding. The credit for the idea actually
goes to John Rose. This scheme has a nice implication that *most*
recipes would have only the recipe string + dynamic arguments, and no
other static arguments. Which saves space in constant pool. Also, some
compilers may find it easier to avoid building a complicated recipe, and
rather pass the String constants untouched.

There is a guidance in StringConcatFactory for that:

  @apiNote Code generators have three distinct ways to process a constant
  string operand S in a string concatenation expression.  First, S can be
  materialized as a reference (using ldc) and passed as an ordinary
  argument (recipe '\1'). Or, S can be stored in the constant pool and
  passed as a constant (recipe '\2') . Finally, if S contains neither of
  the recipe tag characters ('\1', '\2') then S can be interpolated into
  the recipe itself, causing its characters to be inserted into the
  result.

* I would expect more of a combinatorial test where you check i.e.
string concat of different sizes and args - i.e. the dimension of the
combinatorial spaces are:

- size - how many args to the string concat? This as non obvious
consequences in terms of peeling
- arg types: constants, dynamic, special constants (i.e. TAG_CONST, TAG_ARG)
- strategy used: vanilla, indy_constats, indy_plain
- codegen context: assignop vs. binary
- target option (-target 8 vs -target 9)

In fact, we have these tests in jdk/test/java/lang/String concat, see
the jdk webrev. The rationale for keeping these tests in jdk: you would
want to test the JDK concat code along with those combinatorial tests,
and verify the *actual* concatenation result is sane. Simply generating
the bytecode and trying to verify it seems to be a half-way approach to
the problem (also amenable to bugs in your own bytecode verifier!)


Thanks,
-Aleksey

Re: Code (Pre-)Review for JEP 280: Indify String Concat

Reply via email to