> On Jan 31, 2018, at 2:50 PM, Brian Goetz <brian.go...@oracle.com> wrote:
> 
> OK, so the goal is clearly to *prohibit* cyclic constants by enforcing a 
> depth constraint on condy-in-condy nesting.  I like this goal much better, 
> and I see how the solution addresses it.
> 
> Still, I have to wonder whether its worth the complexity of capturing it 
> explicitly with a new (stackmap-like) attribute, which must be specified, and 
> gives us more chances for the classfile to be out of sync with itself, for 
> the goal of preventing something that can't usefully happen anyway -- as we 
> discussed previously, trying to actually resolve a cyclic constant will 
> likely result in SOE, so such classfiles are pretty much useless and 
> therefore not likely to spread very far.

This is a reasonable critique. Another critique is that the cost of tracking 
down and parsing an extra attribute will undo a lot of the performance benefit 
of having it, perhaps even causing a net slowdown.

As we've discussed the idea a little more, my inclination (raised in the 
meeting this week) is to give up on trying to add structural restrictions on 
constant pools designed to facilitate cycle detection. It seems like we can, in 
fact, detect cycles without extra guidance, and with minimal overhead in most 
cases. Some details below.

Big picture, the most attractive path forward seems to be:
- In 11, a CONSTANT_Dynamic with a cycle is an "error", reported via SOE at 
resolution time
- When we introduce lazy condy resolution, we move the error detection to class 
load time, with a check that we're comfortable isn't disrupting startup time
- As other potentially-cyclical constants are introduced, we use the same load 
time checks
- We never require extra information from the compiler, or reject classes that 
are hard to prove cycle-free

One nice thing about that last point is that we don't have to worry about 
incompatible support for "hard" classes in different releases.

I'd love to hear about some experiments with this. We're talking about 
performance impact, and all discussion to this point has been hypothetical. 
Would be good to get some concrete feedback.

—Dan

-----

Details:

We started with a rule for format checking (JVMS 4.8) that says it's illegal 
for a CONSTANT_Dynamic (or, in the future, other potentially-cyclical 
constants) to refer, via static arguments, to itself—directly or indirectly. 
But we quickly decided this sounded expensive, and we didn't want a heavy 
validation step to mess with startup time.

Hence, we explored various ways to prohibit otherwise-valid classes that would 
be hard to prove cycle-free. Something like the RecursiveConstants attributes 
seems to be the best option in that space.

However, scrutinizing the original assumption, it seems reasonable that we can 
optimize the check so that it's extremely lightweight when references are 
"backward" (to previous entries in the constant pool), and only performs more 
expensive analysis when references are "forward". Most classes should naturally 
use backward references exclusively, since entries tend to be generated 
bottom-up. (In theory. If there are counter examples produced by any mainstream 
bytecode generators, that would be good to know.)

So a dumb (worst case O(n^2)) but fast (very low constant cost) check could 
look something like:

void checkCP(int i) {
    switch (tag(i)) {
        case CONSTANT_Dynamic:
            ...
            for (int j : condyStaticArgs(i)) {
                if (j >= i && tag(j) == CONSTANT_Dynamic) {
                    assertUnreachable(i, j); // can't reach i from j
                }
            }
            ...
        ...
    }
}

It's already necessary to check that constant pool references are well-typed, 
so everything up to 'assertUnreachable' here is practically free*. The 
'assertUnreachable' call might be somewhat expensive, but it will rarely be 
called in practice (and even then, we don't expect many structures to be 
particularly deep).

(*In the particular case of CONSTANT_Dynamic, there's an indirection through 
BootstrapMethods. If a class heavily shares BootstrapMethods entries, then 
there's a new cost here, iterating over the same static arguments multiple 
times. Some optimization could be done to speed that up. For example, if there 
are no forward references on the first pass, there will be no forward 
references on later passes, when i is a larger number.)

A good worst-case test would be a constant pool without cycles, built entirely 
out of forward references, consisting of lots of deep CONSTANT_Dynamic trees. 
Needless to say, we don't expect to encounter classes like this in the wild. 
But, if we do, the big benefit here is that, while they'll be a little slower 
to load, they'll still work.

Reply via email to