As discussed at the meeting today, here's an outline of what a structural 
descriptor (pointer-based rather than string-based) in the constant pool might 
look like.

New constants representing types:

CONSTANT_PrimitiveType_info {
   u1 tag;
   u1 kind; // any valid 'atype' as specified for 'newarray', or T_VOID
}

CONSTANT_ArrayType_info {
   u1 tag;
   u2 component_type_index; // a type constant
}

(An alternative encoding of ArrayType would consist of an element type and a 
dimensions count. Putting each lower-dimension array type in its own constant 
potentially improves sharing, but adds overhead where the component types would 
otherwise go unused.)
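
To put rough numbers on this trade-off, here is a small Java sketch that counts the type constants each encoding would need for a given set of array types. The class and method names are hypothetical, and strings are used only as set keys to detect sharing; nothing here is part of the proposal.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: counts distinct type constants under each ArrayType encoding.
final class ArrayEncodingCount {
    /** Encoding from the outline: one ArrayType constant per dimension level. */
    static int perDimension(String[] elements, int[] dims) {
        Set<String> constants = new HashSet<>();
        for (int i = 0; i < elements.length; i++) {
            String type = elements[i];
            constants.add(type);            // the element type constant
            for (int d = 0; d < dims[i]; d++) {
                type = "[" + type;          // each level wraps the previous one
                constants.add(type);        // shared if some other type already needs it
            }
        }
        return constants.size();
    }

    /** Alternative encoding: one constant holding (element type, dimensions count). */
    static int elementAndDimensions(String[] elements, int[] dims) {
        Set<String> constants = new HashSet<>();
        for (int i = 0; i < elements.length; i++) {
            constants.add(elements[i]);                  // the element type constant
            constants.add(dims[i] + "x" + elements[i]);  // the array type constant
        }
        return constants.size();
    }
}
```

For int[][][] alone, this counts four constants under the per-dimension encoding and two under the alternative; if int[] and int[][] are also used, both encodings need four.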

A "type constant" is one of:
- Class
- ArrayType
- PrimitiveType

For historical reasons, a CONSTANT_Class may also refer to an array type, but 
this usage is discouraged.
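
As an illustration of how the three kinds of type constant form a tree, here is a minimal Java sketch. The class names are made up for this example, direct object references stand in for constant pool indices, and the value used for T_VOID is an assumption, since 'newarray' defines no code for void.

```java
// Illustrative model only; object references stand in for constant pool indices.
abstract class TypeConst {
    /** Renders this type in the legacy Utf8 descriptor syntax. */
    abstract String descriptor();
}

final class PrimitiveTypeConst extends TypeConst {
    // 'atype' codes as specified for newarray.
    static final int T_BOOLEAN = 4, T_CHAR = 5, T_FLOAT = 6, T_DOUBLE = 7,
                     T_BYTE = 8, T_SHORT = 9, T_INT = 10, T_LONG = 11;
    // Assumed value: newarray has no code for void.
    static final int T_VOID = 14;

    final int kind;

    PrimitiveTypeConst(int kind) { this.kind = kind; }

    @Override
    String descriptor() {
        switch (kind) {
            case T_BOOLEAN: return "Z";
            case T_CHAR:    return "C";
            case T_FLOAT:   return "F";
            case T_DOUBLE:  return "D";
            case T_BYTE:    return "B";
            case T_SHORT:   return "S";
            case T_INT:     return "I";
            case T_LONG:    return "J";
            case T_VOID:    return "V";
            default: throw new IllegalArgumentException("unknown kind " + kind);
        }
    }
}

final class ClassConst extends TypeConst {
    final String internalName; // e.g. "java/lang/String"

    ClassConst(String internalName) { this.internalName = internalName; }

    @Override
    String descriptor() { return "L" + internalName + ";"; }
}

final class ArrayTypeConst extends TypeConst {
    final TypeConst component; // plays the role of component_type_index

    ArrayTypeConst(TypeConst component) { this.component = component; }

    @Override
    String descriptor() { return "[" + component.descriptor(); }
}
```

For example, an ArrayTypeConst wrapping an ArrayTypeConst wrapping a PrimitiveTypeConst of kind T_INT corresponds to the legacy descriptor "[[I".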

New constant and attribute for method descriptors (in the spirit of NameAndType 
and BootstrapMethods, these are supplementary substructures of a MethodType):

CONSTANT_ParametersAndReturn_info {
   u1 tag;
   u2 return_type_index; // a type constant
   u2 parameter_types_attr_index; // an entry in TypeLists
}

TypeLists_attribute {
   u2 attribute_name_index; // Utf8 "TypeLists"
   u4 attribute_length;
   u2 num_type_lists;
   {   u2 num_types;
       u2 types[num_types]; // type constants
   } type_lists[num_type_lists];
}

(Two design goals constrain this solution: 1) re-use MethodType as the 
preferred representation of method descriptors, and 2) avoid introducing a new 
variable-length constant pool entry. The result is two levels of indirection: 
one to provide a non-Utf8 constant for MethodType to reference, and another to 
offload the variable-length list to an attribute. If we drop these goals, we 
can reduce the indirections.)
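
The two levels of indirection can be sketched in Java as follows. This is only an illustration under simplifying assumptions: the entry classes and field names are hypothetical, a descriptor character stands in for the 'kind' byte of a PrimitiveType, and the TypeLists attribute is modeled as an array of index lists.

```java
// Illustrative sketch; entry classes and field names are hypothetical.
final class PoolSketch {
    static final class Prim { final char code; Prim(char code) { this.code = code; } }
    static final class Cls { final String name; Cls(String name) { this.name = name; } }
    static final class Arr {
        final int componentTypeIndex;
        Arr(int componentTypeIndex) { this.componentTypeIndex = componentTypeIndex; }
    }
    static final class ParamsAndReturn {
        final int returnTypeIndex;
        final int parameterTypesAttrIndex;
        ParamsAndReturn(int returnTypeIndex, int parameterTypesAttrIndex) {
            this.returnTypeIndex = returnTypeIndex;
            this.parameterTypesAttrIndex = parameterTypesAttrIndex;
        }
    }
    static final class MethodType {
        final int descriptorIndex;
        MethodType(int descriptorIndex) { this.descriptorIndex = descriptorIndex; }
    }

    final Object[] pool;      // constant pool; slot 0 is unused, as in a class file
    final int[][] typeLists;  // TypeLists attribute: each row lists type-constant indices

    PoolSketch(Object[] pool, int[][] typeLists) {
        this.pool = pool;
        this.typeLists = typeLists;
    }

    /** Follows type-constant indices and renders the legacy field descriptor. */
    String typeDescriptor(int index) {
        Object e = pool[index];
        if (e instanceof Prim) return String.valueOf(((Prim) e).code);
        if (e instanceof Cls) return "L" + ((Cls) e).name + ";";
        if (e instanceof Arr) return "[" + typeDescriptor(((Arr) e).componentTypeIndex);
        throw new IllegalArgumentException("not a type constant at index " + index);
    }

    /** MethodType -> ParametersAndReturn -> TypeLists entry -> type constants. */
    String methodDescriptor(int methodTypeIndex) {
        MethodType mt = (MethodType) pool[methodTypeIndex];
        ParamsAndReturn pr = (ParamsAndReturn) pool[mt.descriptorIndex];
        StringBuilder sb = new StringBuilder("(");
        for (int typeIndex : typeLists[pr.parameterTypesAttrIndex]) {
            sb.append(typeDescriptor(typeIndex));
        }
        return sb.append(')').append(typeDescriptor(pr.returnTypeIndex)).toString();
    }
}
```

With a pool holding int at index 1, String at 2, an ArrayType of 2 at index 3, void at 4, a ParametersAndReturn(4, 0) at 5, a MethodType(5) at 6, and a single type list {1, 3}, methodDescriptor(6) yields the legacy form "(I[Ljava/lang/String;)V".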

Changes in usage of constants:
- A MethodType can refer to a ParametersAndReturn (preferred) or a Utf8 method 
descriptor (legacy)
- The descriptor_index of a field may refer to a type constant (preferred) or a 
Utf8 field descriptor (legacy)
- The descriptor_index of a method may refer to a MethodType (preferred) or a 
Utf8 method descriptor (legacy)
- The descriptor of a NameAndType may refer to a type constant or MethodType 
(preferred), or a Utf8 field/method descriptor (legacy)
- ldc can refer to any type constant
- checkcast/instanceof can refer to a Class or an ArrayType
- anewarray/multianewarray can refer to any type constant (and arguably the 
opcode names should be changed; newarray can be viewed as a compact shortcut, 
like iconst_0)

Changes in interpreting descriptors:
- A MethodType that refers to a Utf8 treats the descriptor as if it were 
expressed with a fresh ParametersAndReturn referencing fresh type constants
- A field or NameAndType that refers to a Utf8 field descriptor treats the 
descriptor as if it were expressed with a fresh type constant
- A method or NameAndType that refers to a Utf8 method descriptor treats the 
descriptor as if it were expressed with a fresh MethodType referencing fresh 
type constants
- Descriptor comparisons (during field/method resolution and method selection) 
continue to occur *without resolving* the referenced type constants. They are 
simply tree equality tests.
- Class loader constraints are generated for matching Class constants, 
identified by recursing through the trees.
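
A minimal Java sketch of these interpretation rules, assuming a hypothetical in-memory tree (TypeNode and its subclasses are invented names): a legacy Utf8 field descriptor is parsed into a fresh tree, comparison is plain structural equality with no resolution, and a recursive walk collects the class names that would be candidates for loader constraints.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree form of a type constant; nothing is resolved or loaded.
abstract class TypeNode {
    /** Parses a legacy Utf8 field descriptor into a fresh tree. */
    static TypeNode parse(String descriptor) {
        int[] pos = {0};
        TypeNode node = parse(descriptor, pos);
        if (pos[0] != descriptor.length()) {
            throw new IllegalArgumentException("trailing characters in " + descriptor);
        }
        return node;
    }

    private static TypeNode parse(String d, int[] pos) {
        if (pos[0] >= d.length()) throw new IllegalArgumentException("truncated descriptor " + d);
        char c = d.charAt(pos[0]++);
        if (c == '[') return new ArrayNode(parse(d, pos));
        if (c == 'L') {
            int end = d.indexOf(';', pos[0]);
            if (end < 0) throw new IllegalArgumentException("missing ';' in " + d);
            String name = d.substring(pos[0], end);
            pos[0] = end + 1;
            return new ClassNode(name);
        }
        if ("BCDFIJSZ".indexOf(c) >= 0) return new PrimNode(c);
        throw new IllegalArgumentException("bad descriptor " + d);
    }

    /** Recurses through the tree, adding every Class name found. */
    abstract void collectClassNames(List<String> out);

    List<String> classNames() {
        List<String> out = new ArrayList<>();
        collectClassNames(out);
        return out;
    }
}

final class PrimNode extends TypeNode {
    final char code;
    PrimNode(char code) { this.code = code; }
    @Override void collectClassNames(List<String> out) { }
    @Override public boolean equals(Object o) { return o instanceof PrimNode && ((PrimNode) o).code == code; }
    @Override public int hashCode() { return code; }
}

final class ClassNode extends TypeNode {
    final String name;
    ClassNode(String name) { this.name = name; }
    @Override void collectClassNames(List<String> out) { out.add(name); }
    // Compared by name only: no class is resolved.
    @Override public boolean equals(Object o) { return o instanceof ClassNode && ((ClassNode) o).name.equals(name); }
    @Override public int hashCode() { return name.hashCode(); }
}

final class ArrayNode extends TypeNode {
    final TypeNode component;
    ArrayNode(TypeNode component) { this.component = component; }
    @Override void collectClassNames(List<String> out) { component.collectClassNames(out); }
    @Override public boolean equals(Object o) { return o instanceof ArrayNode && ((ArrayNode) o).component.equals(component); }
    @Override public int hashCode() { return 31 * component.hashCode() + 1; }
}
```

Under this sketch, a tree parsed from the Utf8 descriptor "[[Ljava/lang/String;" compares equal to one built directly from two array nodes around a class node, which is the sense in which a legacy descriptor is treated as if it were expressed with fresh type constants.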

(Note that implementations can continue to work with strings, if desired.)
