I’m trying to plan out changes to our type metadata record formats for ABI
stability. I’ll start by looking at the current situation and then make
suggestions about things we ought to change. We want to settle on a design that
leaves room for future expansion and runtime changes, and still allows
efficient access to the most frequently-accessed parts of metadata. I’ll be
looking exclusively at metadata records themselves for this message, leaving
other data structures for separate scrutiny. I'd appreciate all your feedback.
ABI concerns for type metadata records
These are the primary ways in which the compiler and runtime are exposed to
direct binary layout of metadata records:
1. The compiler generates metadata records for some types, either as static
data that’s expected to be directly usable as a type metadata record, or as a
pattern for a metadata record that’s fed as input to an instantiation process.
2. The compiler generates code that interacts with metadata records. It
generates metadata accesses to form the metadata pointer for a specific type.
It also projects information out of metadata records of a known kind. This can
take the form either of runtime calls or of direct projection into known
offsets inside the metadata record, making a tradeoff between abstracting
binary layout details and performance of generated code.
1. Compiler-generated metadata records
For concrete struct and enum types, the compiler generates a metadata record as
static data. If the type has known layout (no resilient fields), then the
metadata record is expected to be usable as valid metadata as-is. If the type
has unknown layout, the metadata record requires one-time initialization to
become valid metadata.
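
To make the one-time initialization case concrete, here is a minimal sketch in C of an accessor that finishes a statically emitted record on first use. Everything in it is hypothetical and for illustration only: the StructMetadata shape, the resilient_field_size and resilient_field_alignment stand-ins for a layout query answered by another module, and the three-state flag. It assumes a 64-bit first field and is not how the Swift runtime actually implements this.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified struct metadata: only the layout-dependent parts. */
typedef struct {
    size_t size;             /* total size of a value of the type */
    size_t field_offsets[2]; /* byte offset of each stored property */
} StructMetadata;

/* Stand-ins for layout facts that only another module knows at run time. */
static size_t resilient_field_size(void)      { return 12; }
static size_t resilient_field_alignment(void) { return 4; }

static StructMetadata point_metadata; /* static record, not yet valid */
static atomic_int point_state;        /* 0 = untouched, 1 = initializing, 2 = valid */
static int point_init_count;          /* how many times initialization ran */

/* Round offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Accessor called instead of referencing the metadata symbol directly. */
const StructMetadata *point_metadata_accessor(void) {
    if (atomic_load(&point_state) != 2) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&point_state, &expected, 1)) {
            /* First caller: complete the layout with run-time information.
               The first field is assumed to be an 8-byte integer at offset 0. */
            size_t second = align_up(8, resilient_field_alignment());
            point_metadata.field_offsets[0] = 0;
            point_metadata.field_offsets[1] = second;
            point_metadata.size = second + resilient_field_size();
            point_init_count++;
            atomic_store(&point_state, 2);
        } else {
            /* Another thread is initializing: wait until it publishes. */
            while (atomic_load(&point_state) != 2) { }
        }
    }
    return &point_metadata;
}
```

A known-layout type would skip the accessor entirely and reference the record's symbol, which is the code size advantage discussed later in this message.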
For concrete class types, the compiler generates a metadata record as static
data. If the class is @objc, a pointer into this metadata record is also
exported as the Objective-C class symbol for Objective-C binaries to link
against. Class metadata records always require one-time initialization to
become valid metadata because of, at minimum, Objective-C class realization.
(On platforms without Objective-C interop, a concrete class with a fully
concrete, non-resilient ancestry and no resilient fields could be made usable
as-is as valid metadata.)
For generic types, the compiler generates a generic metadata pattern. The
swift_getGenericMetadata runtime function takes a pointer to a generic metadata
pattern and a list of generic arguments and produces a valid metadata record
for the generic type instantiated with those arguments.
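
The instantiation step can be sketched roughly as a cache keyed by the pattern and the argument list, so that the same arguments always yield the same metadata address. This is an illustrative C model under stated assumptions, not the runtime's implementation: GenericPattern, Metadata, MAX_ARGS, and get_generic_metadata are hypothetical names, the cache is a simple linked list, and there is no thread safety.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 4 /* hypothetical limit; callers must not exceed it */

typedef struct Metadata Metadata;

/* Hypothetical pattern: just a name and the number of generic arguments. */
typedef struct {
    const char *name;
    size_t num_args;
} GenericPattern;

struct Metadata {
    const GenericPattern *pattern;  /* NULL for non-generic types */
    const Metadata *args[MAX_ARGS]; /* generic arguments, as metadata pointers */
    Metadata *next;                 /* next entry in the instantiation cache */
};

static Metadata *cache_head;

/* Returns the unique record for pattern<args...>, creating it on first request. */
const Metadata *get_generic_metadata(const GenericPattern *pattern,
                                     const Metadata *const *args) {
    size_t bytes = pattern->num_args * sizeof(args[0]);

    /* Look for an existing instantiation with identical arguments. */
    for (Metadata *m = cache_head; m != NULL; m = m->next) {
        if (m->pattern == pattern && memcmp(m->args, args, bytes) == 0)
            return m;
    }

    /* None found: instantiate from the pattern and remember the result. */
    Metadata *m = calloc(1, sizeof *m);
    if (m == NULL)
        return NULL;
    m->pattern = pattern;
    memcpy(m->args, args, bytes);
    m->next = cache_head;
    cache_head = m;
    return m;
}
```

Because the arguments are themselves compared by pointer, the sketch relies on every argument type already having a unique metadata address.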
For Clang-imported types, the compiler generates a metadata candidate. This
looks like a metadata record, with an added uniquing prefix. The
swift_getForeignMetadata runtime function picks a metadata record to be the
canonical metadata pointer, and performs one-time initialization if required.
2. Metadata code generation
ABI concerns for all metadata
The compiler makes the following assumptions about all type metadata records:
Every formal Swift type has a corresponding type metadata record at a unique
address, so pointer identity can be used to test type identity.
metadataPointerA == metadataPointerB if and only if metadataPointerA and
metadataPointerB represent the exact same formal type.
The value witness table can be loaded at fixed offset -1*sizeof(Int) from the
address point.
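
As an illustration of these two assumptions, the following C sketch models a metadata record as an array of words whose address point sits one word after the value witness table pointer. The ValueWitnessTable fields, the record contents, and the helper names are hypothetical, and sizeof(Int) is assumed to equal the size of a pointer.

```c
#include <stddef.h>

/* Hypothetical value witness table: only size and alignment are modeled. */
typedef struct {
    size_t size;
    size_t alignment;
} ValueWitnessTable;

/* A metadata pointer refers to the address point of a record of words. */
typedef const void *const *MetadataRef;

static const ValueWitnessTable int64_witnesses = { 8, 8 };
static const ValueWitnessTable int32_witnesses = { 4, 4 };

/* Word -1 holds the value witness table; word 0 is the address point. */
static const void *const int64_record[] = { &int64_witnesses, NULL };
static const void *const int32_record[] = { &int32_witnesses, NULL };

static MetadataRef int64_metadata(void) { return &int64_record[1]; }
static MetadataRef int32_metadata(void) { return &int32_record[1]; }

/* Fixed-offset projection: one load at -1 word from the address point. */
static const ValueWitnessTable *value_witnesses(MetadataRef metadata) {
    return metadata[-1];
}

/* Type identity is pointer identity, because each type has one record. */
static int is_same_type(MetadataRef a, MetadataRef b) {
    return a == b;
}
```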
Classes
The compiler accesses concrete class metadata by calling
swift_getInitializedObjCClass to perform one-time initialization of the
metadata. Generic class metadata is instantiated using
swift_getGenericMetadata, and the template is responsible for initialization
of the generated class object.
A subclass’s metadata record serves as a physical subtype of its parent class
metadata record. Any projection pattern that works on the parent class metadata
should also work equivalently on the subclass metadata, allowing for overrides
of methods and other entries where the subclass customizes behavior.
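
The physical-subtype relationship can be pictured with a small C sketch in which the subclass record begins with the parent's layout and appends its own entries. The AnimalMetadata and DogMetadata types and their single-method vtables are invented for illustration and leave out everything else a real class metadata record contains.

```c
#include <stddef.h>

typedef int (*Method)(void);

/* Hypothetical parent class metadata: a superclass link and one vtable entry. */
typedef struct {
    const void *superclass;
    Method describe; /* vtable entry introduced by the parent */
} AnimalMetadata;

/* The subclass record starts with the parent's layout, then adds entries. */
typedef struct {
    AnimalMetadata base;
    Method fetch;    /* vtable entry introduced by the subclass */
} DogMetadata;

static int animal_describe(void) { return 1; }
static int dog_describe(void)    { return 2; } /* override of describe */
static int dog_fetch(void)       { return 3; }

static const AnimalMetadata animal_metadata = { NULL, animal_describe };
static const DogMetadata dog_metadata = {
    { &animal_metadata, dog_describe }, /* parent's slots, entry overridden */
    dog_fetch,
};

/* A projection written against the parent layout works on either record. */
static int call_describe(const AnimalMetadata *metadata) {
    return metadata->describe();
}
```

Passing the subclass record to call_describe picks up the overriding entry without the caller knowing which class it was handed.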
For all classes, the compiler expects access to the following fields:
Information             | Projection strategy
Destructor              | fixed offset -2*sizeof(Int)
ObjC matter             | fixed offset (0...4)*sizeof(Int)
Nominal type descriptor | [1]
Generic arguments       | [1]
Stored property offsets | [1]
[1] For classes with fully-fragile ancestry, generic arguments and stored
property offsets can be loaded at a fixed offset determined by the compiler. If
the class has any resilient ancestry, such as a base class from another module,
then the base offset into the subclass’s own entries must be determined at
instantiation time and loaded indirectly.
For classes with fully-fragile ancestry, the compiler can also emit fixed
offset accesses for vtable entries.
The metadata currently includes field offsets for all stored properties, but
this should be changed to include only the fields with offsets that are
dependent on the class’s generic arguments.
Structs
The compiler accesses concrete, known-layout struct metadata by direct
reference to the global symbol. If the struct contains resilient fields, the
metadata is accessed through a runtime call for one-time initialization.
Generic struct metadata is instantiated using swift_getGenericMetadata.
For all structs, the compiler expects access to the following fields:
Information             | Projection strategy
Nominal type descriptor | fixed offset, compiler-determined
Generic arguments       | fixed offset, compiler-determined
Stored property offsets | fixed offset, compiler-determined
The metadata currently includes field offsets for all stored properties, but
this should be changed to include only the fields with offsets that are
dependent on the struct’s generic arguments.
Enums
The compiler accesses concrete, known-layout enum metadata by direct reference
to the global symbol. If the enum contains resilient payloads, the metadata is
accessed through a runtime call for one-time initialization. Generic enum
metadata is instantiated using swift_getGenericMetadata.
For all enums, the compiler expects access to the following fields:
Information             | Projection strategy
Nominal type descriptor | fixed offset, compiler-determined
Generic arguments       | fixed offset, compiler-determined
Tuples
Tuple metadata records are accessed by calling the swift_getTupleTypeMetadata*
runtime functions.
The compiler expects access to the following fields:
Information     | Projection strategy
Element offsets | fixed offset
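
For readers unfamiliar with what the element offsets contain, here is a rough C sketch of how such offsets could be computed when tuple metadata is built. It assumes elements are laid out in declaration order, each aligned up to its own power-of-two alignment. The ElementLayout type and tuple_element_offsets function are hypothetical names, not runtime API.

```c
#include <stddef.h>

/* Hypothetical per-element layout, as read from each element's witnesses. */
typedef struct {
    size_t size;
    size_t alignment; /* must be a power of two */
} ElementLayout;

/* Round offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Fills offsets[i] for each element and returns the tuple's size in bytes
   (the end of the last element, without trailing padding). */
size_t tuple_element_offsets(const ElementLayout *elements, size_t count,
                             size_t *offsets) {
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, elements[i].alignment);
        offsets[i] = offset;
        offset += elements[i].size;
    }
    return offset;
}
```

Unspecialized code that receives only a tuple metadata pointer can then read an element's offset with a single load rather than repeating this computation.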
If we wanted to be able to satisfy type metadata requirements for element types
from tuple metadata, we could also provide access to:
Information   | Projection strategy
Element types | TBD
Functions
Function metadata records are accessed by calling the
swift_get*FunctionTypeMetadata* runtime functions.
The compiler does not currently emit any projections into function type
metadata.
If we wanted to be able to satisfy type metadata requirements for input or
output types from function metadata, we could also provide access to:
Information    | Projection strategy
Return type    | TBD
Argument types | TBD
Existentials, existential metatypes
Existential metadata records are accessed by calling the
swift_getExistentialTypeMetadata* runtime functions.
The compiler does not currently generate any code that reaches into existential
metadata records.
If we wanted to be able to satisfy type metadata requirements from existential
metadata, we could also provide access to:
Information       | Projection strategy
Generic signature | TBD
Metatypes
Metatype metadata records are accessed by calling the swift_getMetatypeMetadata
function.
The compiler does not currently generate any code that reaches into metatype
metadata records.
If we wanted to be able to satisfy type metadata requirements for the instance
type from metatype metadata, we could also provide access to:
Information   | Projection strategy
Instance type | TBD
Objective-C wrappers
Class objects for natively-Objective-C classes are not Swift metadata by
themselves, so they need a wrapper metadata record. This wrapper metadata is
only produced by the Swift runtime (and could be produced more efficiently by
future cooperation with the ObjC runtime). The compiler accesses Objective-C
class metadata by calling the swift_getObjCClassMetadata function, which
either returns a wrapper or passes through the input class object if it is
already valid metadata.
The compiler generates code against the following:
Information  | Projection strategy
Class object | runtime call swift_getObjCClassFromMetadata
Core Foundation class
Foreign class metadata records are generated by Swift’s Clang importer for
imported Core Foundation class types. The records require runtime
canonicalization to determine which metadata record uniquely identifies the
type, since there is no home Swift module for the CF type. The compiler does
not currently generate code against any fields of the metadata.
Recommended changes
Abstract away fixed offsets across modules and for structural metadata
Anything we lock down fixed offsets for in the ABI should justify itself by
being necessary for performance or code size reasons. Unspecialized code relies
heavily on:
the value witness table
stored property offsets
generic arguments
and these are the things that nominal type metadata currently makes directly
available, for the most part. Anything else we emit direct access to (such as
nominal type descriptors) should definitely be abstracted behind a runtime
call. For structural type metadata (tuples, functions, existentials,
metatypes), we already do so for most of the layout details, with the one
exception of tuple element offsets, which are exposed for performance of
unspecialized code.
For concrete nominal types, the compiler also currently generates metadata
records that are expected to be directly usable as metadata. This is a nice
code size optimization that’d be unfortunate to give up. At least for internal
or private types, the emitted metadata format and all of the code that ought to
be reaching into it are together in one resilience boundary, so one could say
that the precise metadata record format for internal value types is a private
contract of the compiler. However, even for public fragile types, it would be
beneficial still to be able to directly export the symbol as usable metadata.
For the performance-sensitive field offset and generic argument vectors, we
could nonetheless avoid hard-coding offsets in the compiler. We could still
guarantee that the generic arguments followed by (fragile) stored property
offsets are stored contiguously, and abstract the base offset to the contiguous
data structure. Possibilities for accessing the base offset include:
1. Require a runtime call to get it.
2. Place the base offset somewhere else in the metadata.
3. Export the base offset as a separate symbol (possibly an absolute symbol, if
it’s knowable at compile time for the home module, or a global variable, if it
requires runtime computation).
For classes with resilient ancestry, we already must have an approach that
allows the offset to be computed at instantiation time and efficiently accessed
in unspecialized code afterward. Technique (3) is proven by the Objective-C
runtime and should be sufficient for our needs. It would impose a two- or
three-instruction cost per type to load the base offset, which would sit on the
dependency chain for any generic type or field offset loads based on that base
offset, but would give us the flexibility to add new information to metadata
records in future compilers.
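
A minimal C sketch of the exported-base-offset idea follows. It assumes metadata is an array of words and that the contiguous vector holds the generic arguments followed by the stored property offsets, as proposed above. The symbol name pair_metadata_base_offset and the two helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Metadata modeled as an array of words starting at the address point. */
typedef const uintptr_t *MetadataRef;

/* Hypothetical exported symbol for one type: the word index at which the
   contiguous [generic arguments][stored property offsets] vector begins.
   The home module sets it, at compile time or during initialization. */
size_t pair_metadata_base_offset;

/* Client code loads the base offset, then indexes relative to it. */
static uintptr_t generic_argument(MetadataRef metadata, size_t base,
                                  size_t index) {
    return metadata[base + index];
}

static uintptr_t stored_property_offset(MetadataRef metadata, size_t base,
                                        size_t num_generic_args,
                                        size_t index) {
    return metadata[base + num_generic_args + index];
}
```

Client code compiled against this scheme keeps working if a later compiler adds words ahead of the vector, since only the value stored in the symbol changes. The cost is the extra load of the base offset mentioned above.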
Make most metadata kinds private to the runtime
The compiler does not emit any code that looks at metadata kinds for projection
purposes. It does, however, use metadata kinds when it builds metadata records
for nominal types. Since the metadata kind is already redundantly encoded in
the nominal type descriptor, we could conceivably reduce the exposure of
metadata kinds to a single “value type” kind (in addition to the
“isa-pointer-in-the-kind-field-means-class” kind, imposed by ObjC
compatibility).
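
To show what the reduced scheme might look like from the runtime's side, here is a small C sketch that classifies the word in the kind field. The kind value, the threshold, and all names are hypothetical. The sketch also assumes that a valid isa pointer is never a small integer, which is what lets a pointer share the field with enumerated kinds.

```c
#include <stdint.h>

/* Hypothetical encoding: small integers are enumerated kinds, and anything
   larger is an isa pointer, which marks the record as class metadata. */
enum {
    KIND_VALUE_TYPE      = 1,     /* the single public "value type" kind */
    KIND_LAST_ENUMERATED = 0x7FF  /* assumed upper bound for enumerated kinds */
};

typedef enum {
    CATEGORY_VALUE_TYPE,
    CATEGORY_CLASS,
    CATEGORY_RUNTIME_PRIVATE
} Category;

/* Classifies the word stored in a metadata record's kind field. */
static Category classify_kind_word(uintptr_t kind_word) {
    if (kind_word > KIND_LAST_ENUMERATED)
        return CATEGORY_CLASS;       /* the word is really an isa pointer */
    if (kind_word == KIND_VALUE_TYPE)
        return CATEGORY_VALUE_TYPE;  /* struct vs. enum comes from the descriptor */
    return CATEGORY_RUNTIME_PRIVATE; /* kinds only the runtime interprets */
}
```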
-Joe