Declared patterns -- translation and reflection

Brian Goetz Tue, 29 Mar 2022 14:01:41 -0700

Time to take a peek ahead at _declared patterns_. Declared patterns come in three varieties -- deconstruction patterns, static patterns, and instance patterns (corresponding to constructors, static methods, and instance methods). I'm going to start with deconstruction patterns, but the basic game is the same for all three.

Ignoring the trivial details, a deconstruction pattern looks like a "constructor in reverse":


```{.java}
class Point {
    int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    deconstructor(int x, int y) {
        x = this.x;
        y = this.y;
    }
}
```

Deconstruction patterns share the weird behaviors that constructors have, in that they are instance members but are not inherited, and that rather than having names, they are accessed via the class name.

Deconstruction patterns differ from static/instance patterns in that they are by definition total; they cannot fail to match. (This is a somewhat arbitrary simplification in the object model, but a reasonable one.) They also cannot have any input parameters, other than the receiver.

Patterns differ from their ctor/method counterparts in that they have what appear to be _two_ argument lists: a parameter list (like ctors and methods), and a _binding_ list. The parameter list is often empty (with the receiver as the match target). The binding list can be thought of as a "conditional multiple return". That they may return multiple values (and, for partial patterns, can return no values at all when they don't match) presents a challenge for translation to classfiles, and for the reflection model.


#### Translation to methods

Patterns contain imperative code, so surely we want to translate them to methods in some way. The pattern input parameters map cleanly to method parameters.

The pattern bindings need to be tunneled, somehow, through the method return (or some other mechanism). For our deconstructor, we might translate as:


    PatternCarrier <dtor>()

(where the method applies the pattern, and PatternCarrier wraps and provides access to the bindings) or


    PatternObject <dtor>()

(where PatternObject provides indirection to behavior to invoke the pattern, which in turn returns the carrier.)

With either of these approaches, though, the pattern name is a problem, because patterns can be overloaded on their _bindings_, but both of these return types are insensitive to bindings.

It is useful to characterize the "shape" of a pattern with a MethodType, where the parameters of the MethodType are the binding types. (The return type is less constrained, but it is sometimes useful to use the return type of the MethodType for the required type of the pattern.) Call this the "descriptor" of the pattern.

If we do this, we can use some name mangling to encode the descriptor in the method name:


    PatternCarrier name$mangle()

The mangling has to be stable across compilations with respect to any source- and binary-compatible changes to the pattern declaration. One mangling that works quite well is to use the "symbolic-freedom encoding" of the erasure of the pattern descriptor. Because the erasure of the descriptor is exactly as stable as any other method signature derived from source declarations, it will have the desired binary compatibility properties, overriding will work as expected, etc.
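As a rough illustration of the descriptor-plus-mangling idea, the sketch below builds a pattern descriptor as a `MethodType` and derives a method name from it. The class `PatternNameSketch`, its methods, and the escape scheme are all hypothetical. In particular, the escapes here are a simplified stand-in and not the actual symbolic-freedom encoding, and mangling only the parameter (binding) part of the descriptor is an assumption made for brevity.

```java
import java.lang.invoke.MethodType;

// Hypothetical sketch: names and escape scheme are illustrative only.
final class PatternNameSketch {

    /** Builds a pattern descriptor: the parameters are the binding types. */
    static MethodType descriptor(Class<?> requiredType, Class<?>... bindingTypes) {
        return MethodType.methodType(requiredType, bindingTypes);
    }

    /**
     * Derives "name$mangle" from the pattern name and the binding part of the
     * descriptor string. Characters that are not allowed in JVM method names
     * are replaced by two-character stand-ins; this is NOT the real
     * symbolic-freedom encoding.
     */
    static String mangle(String patternName, MethodType descriptor) {
        String desc = descriptor.toMethodDescriptorString();   // e.g. "(II)Ljava/lang/Object;"
        String bindings = desc.substring(1, desc.indexOf(')')); // keep the binding types only
        StringBuilder sb = new StringBuilder(patternName).append('$');
        for (char c : bindings.toCharArray()) {
            switch (c) {
                case '/': sb.append("_s"); break;
                case ';': sb.append("_e"); break;
                case '[': sb.append("_a"); break;
                case '_': sb.append("__"); break; // keep the escape unambiguous
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Because the name depends only on the pattern name and the class-level binding types, two overloads that differ in their bindings get distinct method names, while recompiling an unchanged declaration yields the same name.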


#### Return value

In an earlier design, we used a pattern object (which was a bundle of method handles) as the return value of the pattern. This enabled clients to invoke these via condy and bind method handles into the constant pool for deconstruction and static patterns.

Either way, we make use of some sort of carrier object to carry the bindings from the pattern to the client; either we return the carrier from the pattern method, or there is a method on the pattern object that we invoke to get a carrier. We have a few preferences about the carrier; we'd like to be able to late-bind to the actual implementation (i.e., we don't want to freeze the name of a carrier class in the method descriptor), and at least for records, we'd like to let the record instance itself be the carrier (since it is immutable and we can just invoke the accessors to get the bindings.)


#### Carriers

As part of the work on template strings, Jim has put back some code that was originally written for the purpose of translating patterns, called "carriers". There are methods / bootstraps that take a MethodType and return method handles to (a) encode values of those types into an opaque carrier object and (b) pull individual values out of a carrier. This means that the choice of carrier object can be deferred to runtime, as long as both the bundling and unbundling method handles agree on the carrier form.

The choice of carrier is largely a footprint/specificity tradeoff. One could imagine a carrier class per shape, or a single carrier class that wraps an Object[], or caching some number of common shapes (three ints and two refs). This sort of tuning should be separate from the protocol encoded in the bytecode of the pattern method and its clients.

The pattern matching runtime will provide some condy bootstraps which wrap the Carriers behavior.

Since at least some patterns are conditional, we have to have a way to encode failure into the protocol. For a partial pattern, we can use a B2 carrier and use null to encode failure to match; for a total pattern, we can use a B3 carrier.
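To make the bundling/unbundling idea concrete, here is a minimal sketch of the simplest option mentioned above, a carrier that wraps an `Object[]`. It is not the actual carriers code; `ArrayCarrierSketch` and all of its methods are hypothetical, and only standard `java.lang.invoke` calls are used. Given a descriptor, `constructor` returns a handle that packs bindings into an opaque `Object`, and `component` returns a handle that pulls one binding back out.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch of an Object[]-backed carrier; not the real carrier runtime.
final class ArrayCarrierSketch {

    /** Handle of type (bindingTypes...)Object that packs the bindings into a carrier. */
    static MethodHandle constructor(MethodType descriptor) {
        MethodHandle collect = MethodHandles.identity(Object[].class)
                .asCollector(Object[].class, descriptor.parameterCount());
        // Adapt (Object...)Object[] to the binding types, boxing primitives.
        return collect.asType(descriptor.changeReturnType(Object.class));
    }

    /** Handle of type (Object)bindingType that extracts binding i from a carrier. */
    static MethodHandle component(MethodType descriptor, int i) {
        MethodHandle get = MethodHandles.arrayElementGetter(Object[].class);
        MethodHandle getI = MethodHandles.insertArguments(get, 1, i);
        // Cast the opaque carrier back to Object[] and unbox/cast the element.
        return getI.asType(MethodType.methodType(descriptor.parameterType(i), Object.class));
    }

    /** Convenience wrapper: packs the given bindings, rethrowing failures unchecked. */
    static Object pack(MethodType descriptor, Object... bindings) {
        try {
            return constructor(descriptor).invokeWithArguments(bindings);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Convenience wrapper: extracts binding i (boxed) from a carrier. */
    static Object extract(MethodType descriptor, Object carrier, int i) {
        try {
            return component(descriptor, i).invokeWithArguments(carrier);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Clients only ever see `Object` and the two handles, so a different representation (say, a class per shape) could be swapped in without changing the pattern method or its callers. In this sketch a partial pattern would signal failure by returning `null` instead of calling the constructor handle.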


#### Proposed encoding

Earlier explorations did a lot of work to preserve the optimization that a match target can be its own carrier. But further analysis reveals that the cost of doing so for other than records is pretty substantial and works against the model of a pattern declaration being an imperative body of code that runs at match time. So for record patterns, we can "inline" them by using `instanceof` as the applicability test and accessors for extraction, and for all other patterns, go through the carrier runtime.


This allows us to encode pattern methods as

    Object name$mangle(ARGS)

and have the pattern method do the match and return a carrier (or null), using the carrier object that the carrier runtime associates with the pattern descriptor. And clients can take apart the result again using the extraction logic that the carrier runtime associates with the pattern descriptor.

This also means that instance patterns "just work" because virtual dispatch selects the right implementation for us automatically, and all implementations that can be overrides will also implicitly agree on the encoding.

Because patterns are methods, we can take advantage of all the affordances of methods. We can use access bits to control accessibility in the obvious way; we can use the attributes that carry annotations, method parameter metadata, and generics signatures to carry information about the pattern declaration and its parameters. What's missing is a place to put metadata for the *bindings*, and to record the fact that this is a pattern implementation and not an ordinary method. So, we add the following attribute on pattern methods:
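The protocol can be sketched in plain Java as follows. This is a hand-written approximation of what a compiler might emit, not actual generated code: the method names `dtor$II` and `onDiagonal$I`, the partial pattern `onDiagonal` itself, and the use of a bare `Object[]` as the carrier are all assumptions made for illustration.

```java
// Hypothetical hand translation of two patterns on Point and one client.
final class PatternProtocolSketch {

    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Stand-in for the translated deconstructor Point(int x, int y).
        // It is total, so it always returns a carrier and never null.
        Object dtor$II() {
            return new Object[] { x, y };
        }

        // Stand-in for a hypothetical partial instance pattern onDiagonal(int d).
        // Returns a carrier when x == y, and null when the pattern does not match.
        Object onDiagonal$I() {
            return x == y ? new Object[] { x } : null;
        }
    }

    // Roughly what client code for matching against these patterns might do:
    // call the pattern method, test for null, then pull the bindings out.
    static String describe(Object o) {
        if (!(o instanceof Point)) {
            return "not a point";
        }
        Point p = (Point) o;

        Object diag = p.onDiagonal$I();
        if (diag != null) {
            int d = (Integer) ((Object[]) diag)[0];
            return "diagonal at " + d;
        }

        Object[] xy = (Object[]) p.dtor$II();
        int a = (Integer) xy[0];
        int b = (Integer) xy[1];
        return "point " + a + "," + b;
    }
}
```

In the design described here the client would not cast to `Object[]` directly; it would use extraction handles from the carrier runtime, so the carrier representation stays opaque.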


    Pattern {
        u2 attr_name;
        u4 attr_length;
        u2 patternFlags; // bitmask
        u2 patternName;  // index of UTF8 constant
        u2 patternDescr; // index of MethodType (or alternately UTF8) constant
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

This says that "this method is a pattern", reifies the name of the pattern (patternName), reifies the pattern descriptor (patternDescr) which encodes the types of the bindings as a method descriptor or MethodType, and has attributes which can carry annotations, parameter metadata, and signature metadata for the bindings. The existing attributes (e.g. Signature, ParameterNames, RVAA) can be reused as is, with the interpretation that this is the signature (or names, or annos) of the *bindings*, not the input parameters. Flags can carry things like "deconstructor pattern" or "partial pattern" as needed.
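For readers who think better in code, the following sketch models the already-resolved contents of such an attribute. Everything here is hypothetical: the class name, the flag constants and their bit values (the attribute layout above does not assign any), and the choice to hold the descriptor as a UTF8-style descriptor string. The only library call is `MethodType.fromMethodDescriptorString`, which turns the descriptor back into binding types.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical in-memory view of a resolved Pattern attribute.
final class PatternAttributeSketch {
    // Assumed flag bits; the actual values are not specified anywhere above.
    static final int FLAG_DECONSTRUCTOR = 0x0001;
    static final int FLAG_PARTIAL       = 0x0002;

    final int patternFlags;
    final String patternName;
    final String patternDescr; // method descriptor string, e.g. "(II)V"

    PatternAttributeSketch(int patternFlags, String patternName, String patternDescr) {
        this.patternFlags = patternFlags;
        this.patternName = patternName;
        this.patternDescr = patternDescr;
    }

    boolean isDeconstructor() {
        return (patternFlags & FLAG_DECONSTRUCTOR) != 0;
    }

    boolean isPartial() {
        return (patternFlags & FLAG_PARTIAL) != 0;
    }

    /** The binding types are the parameter types of the pattern descriptor. */
    List<Class<?>> bindingTypes() {
        // A null loader means the system class loader is used to resolve class names.
        return MethodType.fromMethodDescriptorString(patternDescr, null).parameterList();
    }
}
```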


## Reflection

We already have a sensible base class in the reflection library for reflecting patterns: Executable. All of the methods on Executable make sense for patterns, including Object as the return type. If the pattern is reflectively invoked, it will return null (for no match) or an Object[]; this Object[] can be thought of as the boxing of the carrier. Since the method return type is Object, this is an entirely reasonable interpretation.

We need some additional methods to describe the bindings, so we would have a subtype of Executable for Pattern, with methods like getBindings(), getAnnotatedBindings(), getGenericBindings(), isDeconstructor(), isPartial(), etc.
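The proposed Pattern subtype of Executable does not exist, so the sketch below approximates the reflective invocation protocol with today's `java.lang.reflect.Method`. The class `ReflectiveMatchSketch`, the pattern method `onDiagonal$I`, and the helper `match` are hypothetical. The stand-in pattern method already returns an `Object[]`, whereas in the design above reflection would box an opaque carrier into one.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: reflective invocation of a pattern-like method.
final class ReflectiveMatchSketch {
    // Assumed mangled name of the stand-in pattern method below.
    static final String ON_DIAGONAL = "onDiagonal$I";

    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Stand-in partial pattern method: carrier on match, null otherwise.
        Object onDiagonal$I() {
            return x == y ? new Object[] { x } : null;
        }
    }

    /**
     * Looks up a no-argument pattern method by its mangled name and invokes it.
     * Returns the bindings as an Object[], or null if the pattern did not match.
     */
    static Object[] match(Object target, String mangledName) {
        try {
            Method m = target.getClass().getDeclaredMethod(mangledName);
            return (Object[]) m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A caller would test the result for null first and only then index into the array, mirroring what generated code does with the carrier.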


## Summary

This design borrows from previous rounds, but makes a number of simplifications.

- The bindings of a pattern are captured in a MethodType, called the _pattern descriptor_. The parameters of the pattern descriptor are the types of the bindings; the return type is the minimal type that will match the pattern (but is not as important as the bindings).
- Patterns are translated as methods whose names are derived, deterministically, from the name of the pattern and the erasure of the pattern descriptor. These are called pattern methods. Pattern methods take as parameters the input parameters of the pattern, and return Object.
- The returned object is an opaque carrier. Null means the pattern didn't match. A non-null value is the carrier type (from the carrier runtime) which is derived from the pattern descriptor.
- Pattern methods are not directly invocable from the source language; they are invoked indirectly through pattern matching, or reflection.
- Generated code invokes the pattern method and interprets the returned value according to the protocol, using MHs from the pattern runtime to access the bindings.
- Pattern methods have a Pattern attribute, which captures information about the pattern as a whole (whether it is total or partial, a deconstructor, etc.) and parameter-related attributes which describe the bindings.
- Patterns are reflected through a new subtype of Executable, which exposes new methods to reflect over bindings.
- When invoking a pattern method reflectively, the carrier is boxed to an Object[].
