Please review JDK-8042118 at
<http://cr.openjdk.java.net/~attila/8042118/webrev.00>
Note that this is a pretty large changeset, because the change is
architecturally big. Namely, it separates types from symbols, and as you can
imagine that ripples through the codebase into pretty deep places; it also
required introducing static type calculation for local variables, etc.
The overall benefit (in addition to cleaner architecture for some of the
compiler stages) is a noticeable performance improvement we measure on
benchmarks, stemming from the fact that the compiler can now treat a single
local variable as having different types in different parts of the function,
and optimize accordingly.
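To make that concrete (a minimal sketch of mine, not code from the webrev): in a
function like the one below, “x” can now be typed as int in its first half and as
Object in its second half, with code generated for each region accordingly:
function example() {
    var x = 1;          // here x is only used as an int
    x = x + 2;          // plain arithmetic on the int representation
    print(x);
    x = "done";         // from this point on x is used as an Object (a String)
    print(x);
}
example();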
I actually wrote up fairly detailed change notes for this; they are below:
AccessNode.property is now a String, not an IdentNode
Symbol class no longer contains a Type. Expressions have types directly now,
either explicit or calculated from their subexpressions.
Only identifiers (IdentNode) have symbols.
Corollary: there are no longer temporary symbols; Symbol.IS_TEMP is gone, as is
TemporarySymbols class.
Attr is separated into three phases: AssignSymbols, OptimisticTypesCalculator
and LocalVariableTypesCalculator.
RangeAnalyzer class is gone. We don’t really need range analysis now that we
have optimistic and local variable type calculation.
FinalizeTypes class is gone. Fixups that it did are no longer needed.
DISCARD node is gone. The knowledge about whether it’s safe to discard a
computation is moved into CodeGenerator, which now has a loadAndDiscard() method.
Some side-effect free discarded computations are now completely eliminated by
the code generator.
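For instance (an illustrative snippet of mine):
function g() {
    var a = 1;
    a * 2;     // the value is discarded and, with a known to be a primitive, the
               // multiplication is side-effect free, so the code generator can
               // skip emitting anything for this statement
    return a;
}
print(g());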
BreakNode and ContinueNode now have a common superclass, JumpStatement.
There’s a new class named LocalVariableConversion that is basically a linked
list of (symbol, fromType, toType) triplets specifying symbol type widenings
that need to be performed on entry to a control flow join point.
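For example (my illustration, not from the webrev): in the snippet below the two
branches leave “x” with different types, so the branch that produced the int value
gets an (x, int, Object) conversion on its edge into the join point after the
if/else:
var x;
if (Math.random() < 0.5) {
    x = 1;         // x is int on this edge; it carries the widening (x, int, Object)
                   // into the join point below
} else {
    x = "hello";   // x is already Object on this edge
}
print(x);          // after the join, x is read as Object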
There’s a new interface named JoinPredecessor. It’s implemented by AST nodes
that need to carry a LocalVariableConversion, as they can either originate an
edge into a control flow join point, or even fully contain one.
JoinPredecessor only provides a getter and setter for a
LocalVariableConversion; the semantics of control flow for a particular AST
node are shared between the LocalVariableTypesCalculator that creates the
conversion, and the CodeGenerator that uses it.
BranchOptimizer can generate code that inserts local variable conversions on
control flow join edges, since short-circuiting logical AND and OR operators
contain control flow joins.
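E.g. (an example of mine):
function h(a) {
    var x = 0;
    a && (x = "set");   // the right-hand side may or may not run, so the end of
                        // the && expression is a control flow join; on the
                        // short-circuit edge x may still be int and need widening
                        // to Object there
    print(x);
}
h(true);
h(false);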
Sometimes, an arbitrary expression needs to act as a JoinPredecessor (e.g. the
test of a for or while loop precedes the join point that is the target of the
continue statement). JoinPredecessorExpression is introduced as a wrapper for
an arbitrary Expression that decorates it with JoinPredecessor functionality.
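For example (a sketch of mine): in the loop below the test is wrapped in a
JoinPredecessorExpression so it can carry the local variable conversions for “i”
where the loop’s control flow edges join:
var i = 0;
while (i < 10) {
    if (i % 2) {
        i = i + 1.5;   // i becomes double on this path
        continue;
    }
    i = i + 1;
}
print(i);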
A single local variable can use multiple JVM method local variable slots now.
If it is assigned both an int and later an Object, it’ll have two slots: one
for the int value, and one for the Object value. In the most extreme case, it
can end up using 6 slots: 1 for int, 2 for long, 2 for double, and 1 for Object.
Consequently, if the type calculator can prove that a variable is never read
as a particular type, it will not allocate a slot for that type. A variable that
is never read will end up taking up 0 slots and thus effectively get eliminated.
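A sketch of mine to illustrate:
function k() {
    var unused = Math.random();   // the call still executes, but since unused is
                                  // never read it gets no slot at all
    var n = 1;                    // n is only ever read as an int, so a single
                                  // int slot suffices
    print(n);
    n = "done";                   // never read as an Object after this, so no
                                  // Object slot needs to be allocated for n
}
k();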
On entry to generated methods, it is sometimes now required to copy values from
incoming parameter slots if one or more parameters will later also be assigned a
different type (storage for a single symbol is contiguous). E.g. in the snippet
below:
function f(a, b) { a = 1.0; print(a); a = ""; print(a); return b;}
f(1, 2);
“b” needs to be moved from incoming slot 3 for f(callee, this, int a, int b) to
slot 6, as “a” needs 3 additional slots for double and Object values.
As a consequence, inferParameters() is no longer needed; method signatures
always preserve the parameter types on entry, as the symbol for a parameter
(just like any other symbol) is no longer tied to a single type.
codegen.Label.Stack is now a misnomer; we’ll end up renaming it to Label.Frame,
as it now also carries information about local variables: what their types
are, which ones belong to the same symbol, and which ones are temporary
variables. This information is used to ensure that types are consistent at join
points as well as for other codegen purposes.
Both LocalVariableTypesCalculator and CodeGenerator calculate static
reachability precisely. We should not emit dead code that ASM replaces with NOP,
NOP, ... ATHROW anymore (except in rest-of methods, where it’s there by design).
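E.g. (my example):
function dead() {
    return 1;
    print("never reached");   // statically unreachable, so no bytecode is emitted
                              // for it, and there is nothing for ASM to pad out
}
print(dead());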
Type.BOOLEAN is automatically treated as Type.INT for the purposes of arithmetic
operations, as both are represented by ints on the stack and would be converted
to int anyway.
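For example (illustrative):
function inc(flag) {
    var t = flag === true;   // t is a boolean local
    return t + 1;            // booleans are ints on the JVM stack, so this can be
                             // done as plain int addition
}
print(inc(true));    // 2
print(inc(false));   // 1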
Explicit optimistic flags in expressions exist no more. Rather, every
expression that can be optimistic will have an assigned type. It’s the job of
the CodeGenerator to figure out the appropriate intersection of an expression’s
type, its coercing type (e.g. if it’s used as a subexpression in a bitwise
operation it’ll be coerced to int anyway), and its most pessimistic type (if
its type can’t be more pessimistic, it isn’t optimistic) and decide whether to
emit an optimistic INVOKEDYNAMIC call site or not.
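As an illustration (my example):
function bits(a, b) {
    // a + b is used as an operand of |, so it is coerced to int anyway; per the
    // note above, the code generator intersects that coercion type with the
    // expression's own type and its most pessimistic type to decide whether an
    // optimistic INVOKEDYNAMIC call site is warranted for the addition
    return (a + b) | 0;
}
print(bits(1, 2));   // 3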
Since Expression.isOptimistic() determines if the expression has a narrower
type than its most pessimistic type, FunctionNode now uses a different method,
FunctionNode.canBeDeoptimized(), to determine if it contains optimistic program
points.
BinaryNode has probably the most complex type calculation; it provides methods
for calculating its type, as well as the widest required type of its operands.
It also caches the result of the last type calculation, as otherwise we’d get
some nasty quadratic behavior when evaluating the types of deep node trees.
void return values (of POJO methods) are now converted to Undefined, instead of
to null. As a future step, we’ll emit dyn:call for discarded expressions with
void return type, and also declare bytecode methods for functions that don’t
return an explicit value as void.
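E.g. (illustration, running on Nashorn with Java interop):
var list = new java.util.ArrayList();
list.add("x");
var r = list.clear();   // ArrayList.clear() returns void, so r is now undefined
print(r);               // prints "undefined"; previously this would have been null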
Finally, the code generator changes. I’m dedicating a separate section to these.
CodeGenerator doesn’t have any enterXxx() methods for expressions anymore, only
for statements. This is to make the code more robust; an expression never
occurs naked in JavaScript: it’s always inside a statement. Instead, there’s
loadExpression() (formerly load()) that has a visitor that delegates to various
private loadXxx() methods that contain the code equivalent to the old
enterXxx() methods.
There’s a LoadScopeVar that encapsulates the loading of a scoped variable; it
has a subclass named LoadFastScopeVar.
CodeGenerator internally loads expressions using TypeBounds. This is a pair of
Type objects - a lower bound and an upper bound. E.g. when loading “a * b” if
we know that “a” is int and “b” is long, then the type bound for the operands
is [long, double] (“a” must be loaded at least as a long because “b” is long,
and they should never be wider than double, ‘cause that’s what the
multiplication coerces them to). However, if the “*” operator is optimistic and
typed as long (can’t be int if it has a long operand), then we’ll attempt to
load the operands as [int, double] and [long, double] respectively, giving “a”
the chance to still be loaded as an optimistic int and only be widened after the
load. I think this all works as expected; binary operands widen the lower bounds
of their peer operands and potentially of operations (putting widening pressure
from the bottom up), while operations narrow the upper bounds of their operands
(putting narrowing pressure from the top down). We’ll obviously need to stare at
the final code output of our benchmarks to catch some corner cases that might’ve
fallen through the cracks…
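The “a * b” example above, written out (under the stated assumption that “a” has
been inferred as int and “b” as long):
function mul(a, b) {
    // non-optimistic case: both operands get the bounds [long, double]; a is
    // widened to at least long to match b, and neither goes past double, since
    // that is what * coerces them to anyway
    // optimistic case with * typed as long: a is loaded with bounds [int, double]
    // (it may still load as an optimistic int and get widened after the load),
    // while b keeps [long, double]
    return a * b;
}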
Test-first loops (all of them except do-while) are now emitted with their test
first, then their body, then an unconditional jump back to the test. I realize
this seems less efficient than emitting the test at the end, but unfortunately
now that we need to perform type calculations there’s no easy way to do them
correctly if we first emit the body, and only afterwards emit the test. I might
be able to overcome this later. I spoke with Rickard Bäckman, and he doesn’t
think HotSpot cares one way or the other about structuring of the loop.
Normally, join points that widen a local variable value type kill the narrower
value. Thus, at most one type of the value is live for a symbol at any time. An
exception to this are local variables assigned in try blocks if they need to be
widened before they can be used in catch blocks. In that case, code in the try
block will perform the widening as part of the assignment (as we can’t know when
we will transfer control to the catch block), but it will keep using the
narrower value. The narrower value is only killed (unless reassigned) at the
end of the try block. E.g.:
var i;
try {
    thisMightThrow();
    i = 1;       // stored into the int slot, and also boxed into the Object slot
    print(i);    // uses the int slot
    // the int slot dies here
} catch(e) {
    print(i);    // must use i's Object slot, as i can be Undefined
}
print(i);        // also uses i's Object slot
That’s also the reason why IdentNode implements JoinPredecessor; it needs to be
able to carry a conversion on the edge from within try block to catch block
when it’s used as the target of an assignment. Since we can’t know when such
a control flow transfer will take place, we conservatively move the conversion
to the assignment.
OptimisticOperation assumed new responsibilities. It is now created holding the
Expression and TypeBounds, and it is its sole centralized responsibility to
figure out whether the operation will indeed be optimistic or not, based on the
interaction between the expression’s type, its most pessimistic type, and the
passed type bounds. It now owns the dynamicGet(), dynamicSet(), dynamicCall(),
replaceCompileTimeProperty(), and convertOptimisticReturnValue() methods that
formerly belonged to the CodeGenerator class.
Thanks,
Attila.