Re: Code (Pre-)Review for JEP 280: Indify String Concat

Aleksey Shipilev Thu, 26 Nov 2015 16:21:22 -0800

Hi Maurizio,

Thanks for the reviews!

Updated webrevs:
  http://cr.openjdk.java.net/~shade/8085796/webrev.langtools.02/
  http://cr.openjdk.java.net/~shade/8085796/webrev.jdk.03/

On 11/26/2015 02:40 PM, Maurizio Cimadamore wrote:
> types.boxedClass(syms.voidType).type;
> 
> Note that j.l.Void is always created (see call to
> synthetizeBoxTypeIfMissing(voidType) - but there are (or used to be)
> corner cases where j.l.Void is not available on certain platforms (i.e.
> mobile) so we can't rely on it being always there (as it will lead to
> crashes). The pattern above is safer, as it relies on javac ability to
> fake a box type if it's missing from the platform.

Excellent, thanks! Didn't know that nuisance with j.l.Void availability.
Removed in favor of the suggested pattern.

> * The code for initializing options in Gen is what we would normally
> model with an enum. I.e. you need something like:
> 
> enum StringConcatMode {
>    INLINE,
>    INDY_PLAIN,
>    INDY_CONSTANTS;
> }
> 
> And then put some logic inside the enum class to parse the option and to
> return the right enum constant. The code will get better as you can
> model two constants (allowIndyStringConcat and
> indyStringConcatConstants) with a single enum value.

Yes, indeed, done.

> * From a design point of view, I see the logic for generating the string
> concat indy calls as split into two components - one that collects
> arguments, iterates through the peels and the merges them together; this
> part is independent from the particular indy strategy used. There's also
> a second part - the one that generates the list of static/dynamic args
> (given the flattened args to the string concat) and then generates the
> final indy call. This latter part is variable, i.e. you get different
> logic for different indy strategies - which means the code ends up with
> many if/else blocks. A possible suggestioin is to make the code more
> object oriented - keep the first part as is, but have an helper class to
> do the job of mapping an argument into a static arg/recipe and then
> generate the final indy call. In other wordas, what you need is a
> builder pattern - you need an object that you can append args to and,
> when finished, you can call a method to generate the resulting indy
> call. There will be a builder for each indy strategy. This is just a
> suggestion on how I would organize the code - it's not a blocking
> comment - but I see the current strategy not scaling very well if e.g.
> you start adding more strategies.

Yes, I stepped further and created a separate helper class that does all
the String concat. Moved all the logic, including the current
StringBuilder-based one there, this lets to un-clutter Gen itself.

Please see the updated webrevs. I would need to polish langtools changes
a bit more, but tell me if the general direction is what you like.

> * I need some confirmation here
> 
> // Concat the String representation of the constant, except
> + // for the case it contains special tags, which requires us
> + // to expose it as detached constant.
> + String a = arg.type.stringValue();
> + if (a.contains(TAG_CONST) || a.contains(TAG_ARG)) {
> + recipe.append(TAG_CONST);
> + staticArgs.add(a);
> + } else {
> + recipe.append(a);
> + }
> 
> 
> My understanding is that your strategy builds a 'recipe' which is
> basically the result of the concat. Since some of the arguments are
> dynamic, you need to put special markers in the recipe, to tell the BSM
> to fetch the dynamic argument and stick it in there. Correct? This in
> turn creates a problem, as it's possible for a concat _constant_
> argument to contain the special values themselves - if that's the case,
> you use static arguments essentially as an escaping mechanism, right?

Yes, that is a correct understanding. The credit for the idea actually
goes to John Rose. This scheme has a nice implication that *most*
recipes would have only the recipe string + dynamic arguments, and no
other static arguments. Which saves space in constant pool. Also, some
compilers may find it easier to avoid building a complicated recipe, and
rather pass the String constants untouched.

There is a guidance in StringConcatFactory for that:

 @apiNote Code generators have three distinct ways to process a constant
 string operand S in a string concatenation expression.  First, S can be
 materialized as a reference (using ldc) and passed as an ordinary
 argument (recipe '\1'). Or, S can be stored in the constant pool and
 passed as a constant (recipe '\2') . Finally, if S contains neither of
 the recipe tag characters ('\1', '\2') then S can be interpolated into
 the recipe itself, causing its characters to be inserted into the
 result.

> * I would expect more of a combinatorial test where you check i.e.
> string concat of different sizes and args - i.e. the dimension of the
> combinatorial spaces are:
> 
> - size - how many args to the string concat? This as non obvious
> consequences in terms of peeling
> - arg types: constants, dynamic, special constants (i.e. TAG_CONST, TAG_ARG)
> - strategy used: vanilla, indy_constats, indy_plain
> - codegen context: assignop vs. binary
> - target option (-target 8 vs -target 9)

In fact, we have these tests in jdk/test/java/lang/String concat, see
the jdk webrev. The rationale for keeping these tests in jdk: you would
want to test the JDK concat code along with those combinatorial tests,
and verify the *actual* concatenation result is sane. Simply generating
the bytecode and trying to verify it seems to be a half-way approach to
the problem (also amenable to bugs in your own bytecode verifier!)

Thanks,
-Aleksey

Re: Code (Pre-)Review for JEP 280: Indify String Concat

Reply via email to