Some initial notes below attempting to flesh out what our two long-term options 
look like.

> On Jun 7, 2017, at 1:53 PM, John Rose <john.r.r...@oracle.com> wrote:

> Comparing these options in detail makes me comfortable with
> declaring that a CONSTANT_Class is *mainly* a file reference,
> and *also* an L-mode type.

Let me highlight this as the source of all these problems. Trying to make a 
single constant pool entry represent two different things is painful. It leads 
to confusion about the model, tortured language explaining basic things like 
what gets "returned" from resolution, attempts to explain away cases that don't 
follow the rules, bugs, etc.

That said, we must live with the legacy of years ago and make the best of it. 
Looking at the two viable strategies:

> 1. Wrap a new CP node (a "mode node") around the file-oriented C_Class node - 
> Q[Class["Foo"]]

Here's the syntax I would use, more or less:

CONSTANT_Class_info {
    u1 tag; // 7
    u2 name_index; // Utf8
}

CONSTANT_PrimitiveType_info {
   u1 tag; // 19
   u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9
                 // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7
}

CONSTANT_ClassType_info {
   u1 tag; // 20
   u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
   u2 class_index; // Class
}

CONSTANT_ArrayType_info {
   u1 tag; // 21
   u2 component_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
}

CONSTANT_SpeciesType_info {
    u1 tag; // 22
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // Class
    u2 enclosing_index; // ClassType or SpeciesType
    u2 typearg_count;
    u2 typeargs[typearg_count]; // PrimitiveType, ClassType, ArrayType, or
                                // SpeciesType
}

CONSTANT_MethodDescriptor_info {
   u1 tag; // 23
   u2 parameter_count;
   u2 parameter_descriptors[parameter_count]; // PrimitiveType, ClassType,
                                              // ArrayType, or SpeciesType
   u2 return_descriptor; // PrimitiveType, ClassType, ArrayType, SpeciesType,
                         // or 0 (void)
}

CONSTANT_FieldDescriptor_info { // is this wrapper useful?
    u1 tag; // 24
    u2 type_index; // PrimitiveType, ClassType, ArrayType, or SpeciesType
}

(I thought about a CONSTANT_Type_info union rather than all these flavors of 
type constants, but it's not great because 1) constant pool entries already 
form a tagged union, so we don't need another union layer, and 2) 
CONSTANT_Class_info can also be used to represent types—once you've got 2 
flavors, might as well have 5+.)
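
The type_code and mode_code comments above each list two numbers per letter: the ASCII value of the descriptor character and a small integer (for primitives, the same values newarray uses for its atype operand). A minimal sketch of a decoder follows. It assumes both encodings would be accepted, which the notes do not actually decide, and the class and method names (TypeCodes, normalize, isPrimitive, isMode) are made up for illustration.

```java
import java.util.Map;

final class TypeCodes {
    // Small-integer encodings from the struct comments, mapped to the
    // descriptor character they stand for. Accepting both forms is an
    // assumption; the notes only list the candidates.
    private static final Map<Integer, Character> ALT = Map.ofEntries(
        Map.entry(4, 'Z'), Map.entry(5, 'C'), Map.entry(6, 'F'),
        Map.entry(7, 'D'), Map.entry(8, 'B'), Map.entry(9, 'S'),
        Map.entry(10, 'I'), Map.entry(11, 'J'),
        Map.entry(12, 'L'), Map.entry(13, 'Q'));

    // Returns the descriptor character for either encoding of a code.
    static char normalize(int code) {
        Character alt = ALT.get(code);
        if (alt != null) {
            return alt;
        }
        if ("ZCBSIJFDLQ".indexOf(code) >= 0) {
            return (char) code;
        }
        throw new IllegalArgumentException("unknown code: " + code);
    }

    // True for a valid PrimitiveType type_code.
    static boolean isPrimitive(int code) {
        return "ZCBSIJFD".indexOf(normalize(code)) >= 0;
    }

    // True for a valid ClassType/SpeciesType mode_code.
    static boolean isMode(int code) {
        return "LQ".indexOf(normalize(code)) >= 0;
    }
}
```

With this, TypeCodes.normalize(81) and TypeCodes.normalize(13) both yield 'Q', so downstream code only ever sees the letter form.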


> 2. Insert a new CP node inside the type-oriented C_Class node - 
> Class[Q["Foo"]] or Class[Q[File["Foo"]]]

Possible syntax for this:

CONSTANT_Class_info {
    u1 tag; // 7
    u2 name_index; // Utf8, PrimitiveDescriptor, ClassDescriptor,
                   // ArrayDescriptor, SpeciesDescriptor
}

CONSTANT_PrimitiveDescriptor_info {
   u1 tag; // 19
   u1 type_code; // 'Z'=90 or 4, 'C'=67 or 5, 'B'=66 or 8, 'S'=83 or 9
                 // 'I'=73 or 10, 'J'=74 or 11, 'F'=70 or 6, 'D'=68 or 7
}

CONSTANT_ClassDescriptor_info {
   u1 tag; // 20
   u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
   u2 class_index; // ClassFile
}

CONSTANT_ClassFile_info {
   u1 tag; // 25
   u2 class_index; // Utf8
}

CONSTANT_ArrayDescriptor_info {
   u1 tag; // 21
   u2 component_index; // PrimitiveDescriptor, ClassDescriptor,
                       // ArrayDescriptor, or SpeciesDescriptor
}

CONSTANT_SpeciesDescriptor_info {
    u1 tag; // 22
    u1 mode_code; // 'L'=76 or 12, 'Q'=81 or 13
    u2 class_index; // ClassFile
    u2 enclosing_index; // ClassDescriptor or SpeciesDescriptor
    u2 typearg_count;
    u2 typeargs[typearg_count]; // PrimitiveDescriptor, ClassDescriptor,
                                // ArrayDescriptor, or SpeciesDescriptor
}

CONSTANT_MethodDescriptor_info {
   u1 tag; // 23
   u2 parameter_count;
   u2 parameter_descriptors[parameter_count]; // PrimitiveDescriptor,
                                              // ClassDescriptor, ArrayDescriptor,
                                              // or SpeciesDescriptor
   u2 return_descriptor; // PrimitiveDescriptor, ClassDescriptor,
                         // ArrayDescriptor, SpeciesDescriptor, or 0 (void)
}

CONSTANT_FieldDescriptor_info { // is this wrapper useful?
    u1 tag; // 24
    u2 type_index; // PrimitiveDescriptor, ClassDescriptor, ArrayDescriptor, or
                   // SpeciesDescriptor
}

--------

Here's an overview of spec changes, assuming one of the sets of syntactic 
changes above. As I look at this, both approaches seem mostly fine. Option (1) 
has messier rules for resolution, because it has to deal with the duality of 
CONSTANT_Class. Option (2) has messier treatment of this_class, in exchange for 
eliminating the duality of CONSTANT_Class.

The rules about where types can appear can be additive (new constants allowed 
in certain places) or negative (certain kinds of CONSTANT_Class disallowed in 
certain places), but either way, you've *mostly* got to touch all of the same 
places.


Syntax

Need to describe where certain kinds of types or class references can appear. 
In option (1), some of this can be enforced to some extent by limiting the 
types of constants allowed in certain places. But, generally, both option (1) 
and option (2) will need informal format or static constraints (4.8, 4.9.1) 
that disallow certain structures that encode certain kinds of types.

Descriptors of fields/methods can be expressed as strings or 
MethodDescriptor/FieldDescriptor structures. CONSTANT_NameAndType, 
CONSTANT_MethodType, LocalVariableTable, and annotations allow descriptor_index 
to point to any of these (prohibiting method or field descriptors as 
appropriate). "The same descriptor" is defined as a recursive comparison of the 
parts. It does not involve resolution or loading. It allows a string descriptor 
to possibly match a structured MethodDescriptor/FieldDescriptor. (This 
definition applies, among other things, to the prohibition of duplicate 
field/method declarations.)
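
A rough sketch of the "same descriptor" check, using the option (1) names. Rather than walking two structures in parallel, it flattens each structured descriptor recursively into the string form and compares strings, which gives the same answer for the forms covered here and lets a string descriptor match a structured one. Assumptions: a Q-mode class type flattens to "QFoo;", species types are left out because the notes give them no string form, and class names are stored inline instead of via a Class index.

```java
import java.util.List;

interface Type { String flat(); }

record PrimitiveType(char code) implements Type {
    public String flat() { return String.valueOf(code); }
}

record ClassType(char mode, String className) implements Type {
    public String flat() { return mode + className + ";"; }
}

record ArrayType(Type component) implements Type {
    public String flat() { return "[" + component.flat(); }
}

// ret == null stands for the "0 (void)" return_descriptor.
record MethodDescriptor(List<Type> params, Type ret) {
    String flat() {
        StringBuilder sb = new StringBuilder("(");
        for (Type p : params) {
            sb.append(p.flat());
        }
        return sb.append(')').append(ret == null ? "V" : ret.flat()).toString();
    }
}

final class Descriptors {
    // Accepts a String, a Type (field descriptor), or a MethodDescriptor.
    static String flatten(Object d) {
        if (d instanceof String s) return s;
        if (d instanceof Type t) return t.flat();
        if (d instanceof MethodDescriptor m) return m.flat();
        throw new IllegalArgumentException("not a descriptor: " + d);
    }

    // No resolution or loading happens here; only the parts are compared.
    static boolean same(Object a, Object b) {
        return flatten(a).equals(flatten(b));
    }
}
```

For example, Descriptors.same("[QFoo;", new ArrayType(new ClassType('Q', "Foo"))) holds, while "LFoo;" and new ClassType('Q', "Foo") are different descriptors.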

A (maybe) comprehensive list of where classes/types can appear:

- Simple class references (CONSTANT_Class with a simple class name for (1), 
CONSTANT_Class representing a class type for (2)):
ClassFile.this_class
InnerClasses
EnclosingMethod

(All we want is the class, but for compatibility a CONSTANT_Class must be 
allowed here, so (2) takes the position that these are encoded as types.)

- Any class type (CONSTANT_Class with a simple class name or 
CONSTANT_ClassType/CONSTANT_SpeciesType for (1), CONSTANT_Class representing a 
class type for (2)):
ClassFile.super_class
Fieldref.class_index
Methodref.class_index
InterfaceMethodref.class_index

- Reference class type (CONSTANT_Class or 
CONSTANT_ClassType/CONSTANT_SpeciesType representing a reference class type for 
(1), CONSTANT_Class representing a reference class type for (2)):
new
Code.exception_table.catch_type
Exceptions.exception_index_table

- Array type (CONSTANT_Class representing an array type or CONSTANT_ArrayType 
for (1), CONSTANT_Class representing an array type for (2)):
multianewarray

- Reference type (CONSTANT_Class, CONSTANT_ArrayType, or 
CONSTANT_ClassType/CONSTANT_SpeciesType representing a reference class type for 
(1), CONSTANT_Class representing a reference type for (2)):
instanceof
checkcast

- Any type (CONSTANT_Class, CONSTANT_ArrayType, CONSTANT_ClassType, 
CONSTANT_SpeciesType, or CONSTANT_PrimitiveType for (1), CONSTANT_Class for 
(2)):
anewarray
ldc
verification_type_info.Object_variable_info
BootstrapMethods.bootstrap_arguments


Verification

- Types and descriptors of all forms can be parsed to verification types 
without any resolution or loading. (Many of the changes in the current value 
classes spec are there to support this.)
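
As an illustration of "no resolution or loading", here is a sketch that turns a string field descriptor into a verification type purely by inspecting characters. The output strings loosely follow the type checker's notation in JVMS 4.10.1 (int, arrayOf(...), class(...)) with the loader omitted; "valueClass(...)" for Q descriptors is a placeholder name of my own, and the structured (non-string) forms are not handled.

```java
final class VerificationTypes {
    static String of(String descriptor) {
        return convert(descriptor, false);
    }

    // Array components keep the narrow integral types; everywhere else
    // boolean, byte, char, and short verify as int.
    private static String convert(String d, boolean asArrayComponent) {
        if (d.isEmpty()) {
            throw new IllegalArgumentException("empty descriptor");
        }
        char c = d.charAt(0);
        if ("ZBCSIFJD".indexOf(c) >= 0 && d.length() != 1) {
            throw new IllegalArgumentException("trailing characters: " + d);
        }
        switch (c) {
            case 'Z': return asArrayComponent ? "boolean" : "int";
            case 'B': return asArrayComponent ? "byte" : "int";
            case 'C': return asArrayComponent ? "char" : "int";
            case 'S': return asArrayComponent ? "short" : "int";
            case 'I': return "int";
            case 'F': return "float";
            case 'J': return "long";
            case 'D': return "double";
            case '[': return "arrayOf(" + convert(d.substring(1), true) + ")";
            case 'L': return "class(" + className(d) + ")";
            case 'Q': return "valueClass(" + className(d) + ")"; // placeholder name
            default:
                throw new IllegalArgumentException("bad descriptor: " + d);
        }
    }

    // Strips the leading mode character and the trailing ';'.
    private static String className(String d) {
        if (d.length() < 3 || !d.endsWith(";")) {
            throw new IllegalArgumentException("bad class descriptor: " + d);
        }
        return d.substring(1, d.length() - 1);
    }
}
```

The point is only that the mode is visible in the descriptor itself, so the verifier never needs to load Foo to know whether QFoo; is a value type.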


Resolution

For (1), a CONSTANT_Class can be "resolved" or "resolved as a type". Plain 
resolution is only allowed where we've asserted that the name is not an array 
type descriptor. It produces a loaded class. In contexts where type structures 
can appear, if a CONSTANT_Class is also allowed, resolving the type implicitly 
means the CONSTANT_Class is "resolved as a type", which will treat it as a 
ClassType with mode 'L'. Resolution of a type produces a java.lang.Class (or 
some equivalent internal representation).

For (2), a CONSTANT_ClassFile is always resolved to a loaded class. A 
CONSTANT_Class is always resolved to a type.

In either case, descriptors are not resolved. (This includes all the 
type-related structures called "descriptors" in (2), though an implementation 
might choose to lazily cache some resolved types with them.)
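
The two resolution paths for option (1) might look roughly like the sketch below. LoadedClass and ResolvedType are stand-ins for a loaded class and for a java.lang.Class-like type mirror; actual loading is faked by wrapping the name, and all names here are hypothetical. A legacy CONSTANT_Class that names an array type is deliberately not handled by resolveAsType in this sketch.

```java
record LoadedClass(String name) {}                  // stand-in for a loaded class
record ResolvedType(char mode, LoadedClass cls) {}  // stand-in for a type mirror

record ConstantClass(String name) {}                              // CONSTANT_Class
record ConstantClassType(char mode, ConstantClass classRef) {}    // CONSTANT_ClassType

final class Option1Resolution {
    // Plain resolution: only legal where the name is known not to be an
    // array type descriptor. Produces a loaded class.
    static LoadedClass resolve(ConstantClass c) {
        if (c.name().startsWith("[")) {
            throw new IllegalStateException(
                "plain resolution of an array type descriptor: " + c.name());
        }
        return new LoadedClass(c.name()); // real code would load the class here
    }

    // Resolution of a type structure: produces a type, not a class.
    static ResolvedType resolveType(ConstantClassType t) {
        return new ResolvedType(t.mode(), resolve(t.classRef()));
    }

    // "Resolved as a type": a bare CONSTANT_Class in a type context is
    // treated as a ClassType with mode 'L'.
    static ResolvedType resolveAsType(ConstantClass c) {
        return resolveType(new ConstantClassType('L', c));
    }
}
```

Under option (2) the split falls on the tag instead of the context: resolve would take a CONSTANT_ClassFile, and every CONSTANT_Class would go through something like resolveType.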


Semantics

- Various cleanups to ensure that, downstream from resolution, we're talking 
about "types" rather than "classes and interfaces". (Again, much of this is 
already in the value classes spec.)


—Dan
