Let me step back a bit.

From a technical perspective I don't think there's any issue. You get condy semantics, which, as you say, is well defined w.r.t. cycles and laziness etc.

From a language perspective I see an issue if we expect, as we said, that most people will just jump in and replace 'static' with 'lazy-static'. That is gonna have all sorts of behavioral incompatibilities.

So, I was looking for a story for: when is it safe to replace an existing 'static' with 'lazy-static'? And the answer seems messy.

Think of it from an IDE perspective: when do I offer the user the refactoring to 'lazy-static'? If the answer ends up being along the lines of "when the initializer is a literal", I think most people won't even bother, and the argument that everyone will opt into 'lazy', thus making <clinit> disappear, won't, IMHO, hold much water.
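
To make the incompatibility concrete, here's a minimal sketch; since there is no 'lazy-static' yet, I'm simulating it with a memoized Supplier (all names are made up):

import java.util.function.Supplier;

class EagerVsLazy {
    // Eager 'static': the side effect runs at class initialization,
    // whether or not CONFIG is ever read.
    static String CONFIG = loadConfig();

    // Simulated 'lazy-static': the side effect runs at first read,
    // possibly much later, possibly never.
    static Supplier<String> LAZY_CONFIG = memoize(EagerVsLazy::loadConfig);

    static String loadConfig() {
        System.out.println("loading config");  // observable side effect
        return "config";
    }

    static <T> Supplier<T> memoize(Supplier<T> s) {
        return new Supplier<T>() {
            private T value;
            private boolean ready;
            public synchronized T get() {
                if (!ready) { value = s.get(); ready = true; }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("main starts");
        // "loading config" was already printed for CONFIG (before main);
        // for LAZY_CONFIG it prints only now, at the first use:
        LAZY_CONFIG.get();
    }
}

An IDE offering the refactoring would have to prove that nobody observes the shift of that side effect - which, in general, it can't.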

Maurizio

On 01/03/2019 02:14, John Rose wrote:
On Feb 28, 2019, at 5:30 PM, Maurizio Cimadamore 
<maurizio.cimadam...@oracle.com> wrote:
Question: in which category of badness does this belong?

class Foo {
    static int I = Foo.J;
    static int J = 2;

    public static void main(String[] args) {
       System.out.println(I); //prints 0
       System.out.println(J); //prints 2
    }
}

The language allows forward references to static fields, assuming you use a 
_qualified_ name (don't ask me why :-)).
I remember convincing myself long ago that this was semi-justified.

But I guess this is similar to having a static initializer call a static method which returns 
the value of a not-yet-initialized static. In any case, it feels like, if we condify 
these, we would change semantics, as suddenly "I" would be initialized to 2 too?
Yes, that seems likely.  A naive account of the semantics
of lazy statics would be that each lazy static is "really"
obtained via a static method which serves as its accessor.
This static method contains something like this:

    public static @Synthetic int get$I() {
       if (!I$ready) {
          I = (…init expr here…);
          I$ready = true;   // without this, every read re-runs the initializer
       }
       return I;
    }

The JMM makes extra demands here, of course, which only
the JVM (or var handles) can properly satisfy.
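
For the curious, here's roughly what satisfying those demands looks like if spelled out in source, as a VarHandle-based double-checked idiom over the ready flag (a sketch only - the init expression is a placeholder, and the real translation would live in the JVM):

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class LazyHolder {
    private static int I;            // the lazy value
    private static boolean I$ready;  // published with release semantics
    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                .findStaticVarHandle(LazyHolder.class, "I$ready", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static int get$I() {
        // acquire read: if we see the flag set, we also see the write to I
        if (!(boolean) READY.getAcquire()) {
            synchronized (LazyHolder.class) {
                if (!(boolean) READY.getAcquire()) {
                    I = 42;          // (…init expr here…)
                    READY.setRelease(true);
                }
            }
        }
        return I;
    }
}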

Also, beware of cycles!

class Foo {
    static int I = Foo.J;
    static int J = I;

    public static void main(String[] args) {
       System.out.println(I); //prints 0
       System.out.println(J); //prints 0
    }
}

I think a condified translation would throw or run into an endless loop?
Condy has a "no cycle" clause in its contract which we
can just reuse.  You get something like a SOE.

The naive semantics can model this by tracking
states more carefully on the I$ready variable.
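
A single-threaded sketch of that state tracking, with a third state so that a re-entrant read fails fast instead of looping (throwing a SOE by hand here just to mirror condy's behavior; the names are invented):

class CycleAwareLazy {
    enum State { UNINITIALIZED, IN_PROGRESS, READY }

    private static int J;
    private static State J$state = State.UNINITIALIZED;

    static int get$J() {
        switch (J$state) {
            case READY:
                return J;
            case IN_PROGRESS:
                // re-entrant read: the init expression depends on itself
                throw new StackOverflowError("cyclic initialization of J");
            default:
                J$state = State.IN_PROGRESS;
                J = computeJ();      // (…init expr here…)
                J$state = State.READY;
                return J;
        }
    }

    private static int computeJ() { return get$J(); }  // deliberate cycle

    public static void main(String[] args) {
        System.out.println(get$J());  // throws instead of spinning forever
    }
}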

To me, the crux of the issue is to draw a crisp line between things that can be 
condified and things that cannot (and have to fall back to <clinit>). But, the 
more I think about it, the less I'm convinced that such a line exists, or at least 
that a 'meaningful' one does.
Every class gets one phase-point which is executed with
mutual exclusion relative to any other access to the class.
Programmers use this phase-point in a zillion ways.
Surely there is no crisp characterization of all its uses.

What we *can* do is tell programmers that, if they
don't need a phase-point (or weren't even conscious
that they were executing one), they can use lazies
and get faster startup.
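
For instance (a made-up registry, but the shape is common), here is a <clinit> used deliberately as a phase-point; turn REGISTERED into a lazy and the registration happens only if, and whenever, somebody first reads the field:

class Plugin {
    // The initializer is here for its side effect: it runs exactly once,
    // under the class-init lock, before any other use of Plugin.
    static final boolean REGISTERED = PluginRegistry.register(new Plugin());
}

class PluginRegistry {
    static boolean register(Plugin p) {
        System.out.println("registered " + p);
        return true;
    }
}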

Since all method calls are potentially cycle-inducing (or 
forward-reference-inducing), lazy statics treatment cannot apply to an 
initializer that has a method call? And, you can have forward references or 
cycles through field access too (as above)... so, that seems to leave just 
simple constants, which doesn't seem to offer a lot of bang for the buck? Am I 
missing something?
Any Turing-capable programming language is "potentially
cycle-inducing" and finding which programs cycle is undecidable.
We live with it.  I don't see how your reasoning applies here in
a special way to lazies.

This reminds me why static variables have the restriction
that one can't refer to another that appears later in the file.  It's not
because there is no possible use for this (hence the Foo.J escape)
but because most code benefits from a gentle static error check
that helps the programmer prove that the uninitialized values
of variables are not being used.

(For local variables, the DU/DA rules perform the same check,
more strictly.)
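
Concretely, the gentle check in action (the commented-out line is rejected by javac; the escape hatch is not):

class Foo {
    // static int I = J;    // error: illegal forward reference
    static int I = Foo.J;   // compiles: the qualified-name escape
    static int J = 2;
}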

Lazy statics will benefit from the same gentle checks.  There
will be some legitimate occasions to use the Foo.J escape
to create a lexical loop that you (the human programmer)
know won't turn into a dynamic loop.  Better yet, if you
don't use the Foo.J escape, then you know your lazies
won't loop.  So I think these features hang together.

Sidebar: as VM machinery, I'd love to see something like DynamicValue - in 
fact I was talking to Sundar the other day about how, if we had it, we could solve 
some thorny issues we have with Panama/jextract bytecode generation. As a 
language construct, though, this feels shaky - but maybe that's just me?
Yes, I want DynamicValue sooner rather than later.

The potential looping doesn't bother me.

The syntax paradigm of stuffing everything into a field initializer
is annoyingly restrictive, but will be good enough to start with.

We can relax it later, I think, about the same time we do the
non-static version of the feature.  What I mean is that some
use cases for lazies have a use-site formulation, where (within
the class at least) each use of a lazy potentially comes with a
proposed value; there need not be a centralized point (the
def-site of the lazy) where the lazy's value is defined.  This is
true less for statics and more for non-statics.

Arguably, a use-site lazy mechanism is a wholly separate
language feature, but I think they should be lumped if possible.

And I think it's possible; that a use-site lazy generalizes
a def-site lazy in the sense that the central def-site
value (if any) is a default to be overridden in context
by the proposed use-site value (if any).
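
To illustrate what I mean, a library-level sketch only (the real feature would be a language construct, and all these names are invented): each use site may propose a value, and the def-site default, if present, is just the fallback.

import java.util.function.Supplier;

final class LazyCell<T> {
    private final Supplier<? extends T> defSiteDefault;  // may be null
    private T value;
    private boolean ready;

    LazyCell(Supplier<? extends T> defSiteDefault) {
        this.defSiteDefault = defSiteDefault;
    }

    // use-site form: this use proposes a value for first initialization
    synchronized T get(Supplier<? extends T> proposed) {
        if (!ready) { value = proposed.get(); ready = true; }
        return value;
    }

    // def-site form: fall back to the centralized default
    synchronized T get() {
        if (!ready) {
            if (defSiteDefault == null)
                throw new IllegalStateException("no value proposed yet");
            value = defSiteDefault.get();
            ready = true;
        }
        return value;
    }
}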

— John

P.S. I suppose there is rare legitimate code you might write where
the lexical dependency of statics has a loop, which the dynamic logic
of the program breaks.

(Note that programs have loops of all sorts, and we trust them
to break the loops dynamically even when we can't prove statically
that they terminate.)

Here's a silly example:

static final int I = (J_FIRST ? computeIFromJ(Foo.J = computeJ())
                              : computeI());
static int J = (J_FIRST ? Foo.J : computeJ());

This is what "static { }" blocks and blank finals are for.
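
Spelled out that way - same logic as the snippet above, moved into a static block over blank finals (computeI and friends are stand-ins):

class Foo {
    static final boolean J_FIRST = true;  // stand-in for the real flag
    static final int I;   // blank finals: no initializer here,
    static final int J;   // but definitely assigned below

    static {
        if (J_FIRST) {
            J = computeJ();
            I = computeIFromJ(J);
        } else {
            I = computeI();
            J = computeJ();
        }
    }

    static int computeI()           { return 1; }
    static int computeJ()           { return 2; }
    static int computeIFromJ(int j) { return j + 1; }
}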

The initial version of the Java language, which specified
the reference checks on statics in static initializers,
also supported the "Foo.J" oddity, and it *didn't* have
blank finals and their associated definite assignment rules.
Those extra rules augment the initialization checks
by allowing a static final to omit its initializer but
requiring an eventual initialization somewhere.
