As discussed at the meeting today, here's an outline of what a structural
descriptor (pointer-based rather than string-based) in the constant pool might
look like.
New constants representing types:
CONSTANT_PrimitiveType_info {
    u1 tag;
    u1 kind; // any valid 'atype' as specified for 'newarray', or T_VOID
}
CONSTANT_ArrayType_info {
    u1 tag;
    u2 component_type_index; // a type constant
}
(An alternative encoding of ArrayType would consist of an element type and a
dimensions count. Putting each lower-dimension array type in its own constant
potentially improves sharing, but adds overhead where the component types would
otherwise go unused.)
A "type constant" is one of:
- Class
- ArrayType
- PrimitiveType
For historical reasons, a CONSTANT_Class may also refer to an array type, but
this usage is discouraged.
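To make the shape of a type constant concrete, here is a minimal Java sketch (assuming Java 17 or later) that models the three kinds as a small tree. All names here (TypeConstant, ClassType, ArrayType, PrimitiveType, Kind, descriptor()) are hypothetical and exist only for illustration; they are not part of the proposal. The descriptor() method just shows that the legacy string form can be derived from the tree.

```java
// Hypothetical in-memory model of the three kinds of type constant.
sealed interface TypeConstant {
    // Renders the legacy Utf8 field descriptor for this type.
    String descriptor();
}

// Stands in for CONSTANT_Class; holds the internal (slash-separated) class name.
record ClassType(String internalName) implements TypeConstant {
    public String descriptor() { return "L" + internalName + ";"; }
}

// Stands in for CONSTANT_ArrayType_info; component is itself a type constant.
record ArrayType(TypeConstant component) implements TypeConstant {
    public String descriptor() { return "[" + component.descriptor(); }
}

// Stands in for CONSTANT_PrimitiveType_info; Kind replaces the raw 'kind' byte.
record PrimitiveType(Kind kind) implements TypeConstant {
    enum Kind {
        BOOLEAN('Z'), CHAR('C'), FLOAT('F'), DOUBLE('D'),
        BYTE('B'), SHORT('S'), INT('I'), LONG('J'), VOID('V');

        final char code;
        Kind(char code) { this.code = code; }
    }

    public String descriptor() { return String.valueOf(kind.code); }
}

class TypeConstantDemo {
    public static void main(String[] args) {
        // int[][] is two nested ArrayType constants over one PrimitiveType.
        TypeConstant intArray2 =
            new ArrayType(new ArrayType(new PrimitiveType(PrimitiveType.Kind.INT)));
        System.out.println(intArray2.descriptor()); // prints "[[I"
    }
}
```

Note that the inner ArrayType (int[]) is a constant in its own right, which is the sharing-versus-overhead tradeoff mentioned above.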
New constant and attribute for method descriptors (in the spirit of NameAndType
and BootstrapMethods, these are supplementary substructures of a MethodType):
CONSTANT_ParametersAndReturn_info {
    u1 tag;
    u2 return_type_index; // a type constant
    u2 parameter_types_attr_index; // an entry in TypeLists
}
TypeLists_attribute {
    u2 attribute_name_index; // Utf8 "TypeLists"
    u4 attribute_length;
    u2 num_type_lists;
    {   u2 num_types;
        u2 types[num_types]; // type constants
    } type_lists[num_type_lists];
}
(Two design goals constrain this solution: 1) re-use MethodType as the
preferred representation of method descriptors, and 2) avoid introducing a new
variable-length constant pool entry. Hence the two levels of indirection: one to
provide a non-Utf8 constant for MethodType to reference, and another to offload
the variable-length list to an attribute. We could drop these goals to reduce
the number of indirections.)
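The two hops can be sketched in Java (assuming Java 17 or later) as follows. This is illustrative only: MethodDescriptorLookup, its records, and describe() are hypothetical names, the constant pool is a plain Object array, and type constants are stood in for by their descriptor strings purely to keep the example short.

```java
import java.util.List;

final class MethodDescriptorLookup {
    // Hypothetical decoded forms of the structures above; fields are indices.
    record MethodType(int descriptorIndex) {}
    record ParametersAndReturn(int returnTypeIndex, int parameterTypesAttrIndex) {}

    // Follows MethodType -> ParametersAndReturn -> TypeLists entry and renders
    // the result as a legacy-style method descriptor string.
    static String describe(Object[] constantPool, List<int[]> typeLists, int methodTypeIndex) {
        // First hop: MethodType refers to a ParametersAndReturn constant.
        MethodType mt = (MethodType) constantPool[methodTypeIndex];
        ParametersAndReturn pr = (ParametersAndReturn) constantPool[mt.descriptorIndex()];

        // Second hop: the variable-length parameter list lives in the attribute.
        StringBuilder sb = new StringBuilder("(");
        for (int typeIndex : typeLists.get(pr.parameterTypesAttrIndex())) {
            sb.append(constantPool[typeIndex]); // a type constant (a String here)
        }
        return sb.append(')').append(constantPool[pr.returnTypeIndex()]).toString();
    }
}
```

For example, with type constants at slots 1 to 3, a ParametersAndReturn at slot 4, a MethodType at slot 5, and a single type list {1, 2}, describe(pool, typeLists, 5) walks both indirections to recover the parameters and return type.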
Changes in usage of constants:
- A MethodType can refer to a ParametersAndReturn (preferred) or a Utf8 method
descriptor (legacy)
- The descriptor_index of a field may refer to a type constant (preferred) or a
Utf8 field descriptor (legacy)
- The descriptor_index of a method may refer to a MethodType (preferred) or a
Utf8 method descriptor (legacy)
- The descriptor of a NameAndType may refer to a type constant or MethodType
(preferred), or a Utf8 field/method descriptor (legacy)
- ldc can refer to any type constant
- checkcast/instanceof can refer to a Class or an ArrayType
- anewarray/multianewarray can refer to any type constant (and arguably the
opcode names should be changed; newarray can be viewed as a compact shortcut,
like iconst_0)
Changes in interpreting descriptors:
- A MethodType that refers to a Utf8 treats the descriptor as if it were
expressed with a fresh ParametersAndReturn referencing fresh type constants
- A field or NameAndType that refers to a Utf8 field descriptor treats the
descriptor as if it were expressed with a fresh type constant
- A method or NameAndType that refers to a Utf8 method descriptor treats the
descriptor as if it were expressed with a fresh MethodType referencing fresh
type constants
- Descriptor comparisons (during field/method resolution and method selection)
continue to occur *without resolving* the referenced type constants. They are
simply tree equality tests.
- Class loader constraints are generated for matching Class constants,
identified by recursing through the trees.
(Note that implementations can continue to work with strings, if desired.)
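As a rough illustration of the interpretation rules, the Java sketch below (assuming Java 17 or later) expands a legacy Utf8 field descriptor into a fresh tree, compares trees by structural equality without resolving anything, and collects the Class names that would be candidates for loader constraints. DescriptorTrees and everything in it are hypothetical names for illustration; the tree nodes hold names and codes directly rather than constant pool indices.

```java
import java.util.List;

final class DescriptorTrees {
    // Hypothetical tree form of a type constant.
    sealed interface Type {}
    record ClassT(String name) implements Type {}
    record ArrayT(Type component) implements Type {}
    record PrimT(char code) implements Type {}

    // Expands a legacy Utf8 field descriptor into a fresh tree.
    static Type parseField(String d) {
        int[] pos = {0};
        Type t = parse(d, pos);
        if (pos[0] != d.length()) {
            throw new IllegalArgumentException("trailing characters: " + d);
        }
        return t;
    }

    private static Type parse(String d, int[] pos) {
        if (pos[0] >= d.length()) {
            throw new IllegalArgumentException("truncated descriptor: " + d);
        }
        char c = d.charAt(pos[0]++);
        switch (c) {
            case '[':
                return new ArrayT(parse(d, pos));
            case 'L': {
                int end = d.indexOf(';', pos[0]);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class name: " + d);
                }
                String name = d.substring(pos[0], end);
                pos[0] = end + 1;
                return new ClassT(name);
            }
            default:
                if ("BCDFIJSZ".indexOf(c) < 0) {
                    throw new IllegalArgumentException("bad descriptor: " + d);
                }
                return new PrimT(c);
        }
    }

    // Recurses through the tree, collecting the names of Class leaves.
    static void collectClasses(Type t, List<String> out) {
        if (t instanceof ClassT c) {
            out.add(c.name());
        } else if (t instanceof ArrayT a) {
            collectClasses(a.component(), out);
        }
    }
}
```

Because records compare component-wise, parseField("[[Ljava/lang/String;") is equal to a hand-built ArrayT(ArrayT(ClassT("java/lang/String"))), which is the tree equality test described above: names are compared, but no class is loaded or resolved.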