This is to initiate a discussion of (maybe) adding some form of constant
propagation into Nashorn’s compiler pipeline. We currently do constant folding
as a early phase (Lower), but we don’t have constant propagation. Michael Haupt
expressed interest to tackle this, and I thought it’d be best to work out the
design kinks here in the open so others can also chime in if they want.
Scope
=====
We’re talking about intra-function constant propagation over non-scoped local
variables only. That should not be too hard to do. It could be somewhat similar
to how LocalVariableTypesCalculator (LVTC) works, actually. LVTC currently does
def-use tracking of type information from assignment to IdentNode objects and
propagates the type information into use of IdentNode objects, through their
shared Symbol. This is similar, except it could be used to propagate constant
values.
Of course, if we wanted, later we could even extend it to something else, e.g.
range analysis, or really any other invariants. It’s just a matter of how much
work are we comfortable having the compiler do.
Implementation notes
=================
By the way of implementation, we’d need a top-down (meaning, it operates on
enterXxx() methods; possibly multipass for loops, same as LVTC) pass that keeps
a Map<Symbol, LiteralNode> and builds another Map<IdentNode, LiteralNode>.
After it’s done, we’d need another pass, this time bottom-up (operating on
leaveXxx() methods), replacing every IdentNode identified as having constant
value with a LiteralNode instead.
We should actually be even smarter about it, and have the top-down pass do
another round of constant folding, using a generic Map<Expression, LiteralNode>
instead of Map<IdentNode, LiteralNode> and evaluate expressions on the top-down
pass and substitute already known constants opportunistically attempting to
evaluate other expressions into constants. E.g. it could then collapse:
var a = 3; var b = 2; var c = a + b;
into
var c = 5;
(and eliminate a and b if they aren’t used elsewhere)
We’d still need the bottom-up pass to perform the final replacement of folded
constants. As such, the replacement of IdentNode with a LiteralNode would just
become a special case of constant folding after propagation :-)
An important element of the solution is to ensure that if at control flow join
points a Symbol arrives containing different possible constant values on
different branches (or a constant and a non-constant), then its const-ness has
to be erased.
Function literals
============
Also, we should figure out a way to represent a “literal” for a function. We
currently don’t do that. Constant propagation for function objects isn’t
trivial as we can’t just replace every occurrence with a new FunctionNode; on
the other hand it’d be nice to have as invocations of constant-propagated
(proven never to change) functions could be performed without a guard, e.g.
function () {
function f() { … }
…
f(); <— “f” can be proven to always point to the function above.
}
Where to put the pass in the pipeline
============================
This pass should likely be done earlier than LVTC, so that we don’t calculate
local types for variable reads that were replaced with constants. If all reads
of a local variable are replaced with a constant, Nashorn can even completely
eliminate the local variable (it currently eliminates never-read locals).
Attila.