Time to take a peek ahead at _declared patterns_.  Declared patterns come in three varieties: deconstruction patterns, static patterns, and instance patterns (corresponding to constructors, static methods, and instance methods, respectively).  I'm going to start with deconstruction patterns, but the basic game is the same for all three.

Ignoring the trivial details, a deconstruction pattern looks like a "constructor in reverse":

```{.java}
class Point {
    int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    deconstructor(int x, int y) {
        x = this.x;
        y = this.y;
    }
}
```

Deconstruction patterns share the weird behaviors of constructors: they are instance members but are not inherited, and rather than having names of their own, they are accessed via the class name.

Deconstruction patterns differ from static/instance patterns in that they are by definition total; they cannot fail to match. (This is a somewhat arbitrary simplification in the object model, but a reasonable one.)  They also cannot have any input parameters, other than the receiver.

Patterns differ from their ctor/method counterparts in that they have what appear to be _two_ argument lists; a parameter list (like ctors and methods), and a _binding_ list.  The parameter list is often empty (with the receiver as the match target). The binding list can be thought of as a "conditional multiple return".  That they may return multiple values (and, for partial patterns, can return no values at all when they don't match) presents a challenge for translation to classfiles, and for the reflection model.

#### Translation to methods

Patterns contain imperative code, so surely we want to translate them to methods in some way.  The pattern input parameters map cleanly to method parameters.

The pattern bindings need to be tunneled, somehow, through the method return (or some other mechanism).  For our deconstructor, we might translate as:

    PatternCarrier <dtor>()

(where the method applies the pattern, and PatternCarrier wraps and provides access to the bindings) or

    PatternObject <dtor>()

(where PatternObject provides indirection to behavior to invoke the pattern, which in turn returns the carrier.)

With either of these approaches, though, the pattern name is a problem, because patterns can be overloaded on their _bindings_, but both of these return types are insensitive to bindings.

It is useful to characterize the "shape" of a pattern with a MethodType, where the parameters of the MethodType are the binding types.  (The return type is less constrained, but it is sometimes useful to use the return type of the MethodType for the required type of the pattern.)  Call this the "descriptor" of the pattern.

If we do this, we can use some name mangling to encode the descriptor in the method name:

    PatternCarrier name$mangle()

The mangling has to be stable across compilations with respect to any source- and binary-compatible changes to the pattern declaration.  One mangling that works quite well is to use the "symbolic-freedom encoding" of the erasure of the pattern descriptor.  Because the erasure of the descriptor is exactly as stable as any other method signature derived from source declarations, it will have the desired binary compatibility properties, overriding will work as expected, etc.
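To make the mangling idea concrete, here is a minimal sketch that derives a method name from a pattern name and an (already erased) descriptor.  The class name `PatternNameMangler`, the `$` separator, and the exact replacement pairs are assumptions for illustration; they are modeled loosely on the symbolic-freedom scheme and are not the encoding the compiler would actually use.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper, for illustration only.
final class PatternNameMangler {

    // Builds "<patternName>$<escaped descriptor>". The descriptor is assumed
    // to be erased already; this method does not perform erasure itself.
    static String mangle(String patternName, MethodType erasedDescriptor) {
        String descriptor = erasedDescriptor.toMethodDescriptorString();
        StringBuilder sb = new StringBuilder(patternName).append('$');
        for (int i = 0; i < descriptor.length(); i++) {
            char c = descriptor.charAt(i);
            switch (c) {
                // Characters that are not legal in JVM method names get a
                // two-character escape (assumed replacement table).
                case '/':  sb.append("\\|"); break;
                case ';':  sb.append("\\?"); break;
                case '[':  sb.append("\\{"); break;
                case '\\': sb.append("\\-"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two overloads of the same pattern name, differing only in bindings,
        // end up with different method names.
        System.out.println(mangle("Point",
                MethodType.methodType(void.class, int.class, int.class)));
        System.out.println(mangle("Point",
                MethodType.methodType(void.class, int.class, String.class)));
    }
}
```

Because the name is a pure function of the pattern name and the erased binding types, recompiling an unchanged declaration always yields the same method name.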

#### Return value

In an earlier design, we used a pattern object (which was a bundle of method handles) as the return value of the pattern. This enabled clients to invoke these via condy and bind method handles into the constant pool for deconstruction and static patterns.

Either way, we make use of some sort of carrier object to carry the bindings from the pattern to the client; either we return the carrier from the pattern method, or there is a method on the pattern object that we invoke to get a carrier.  We have a few preferences about the carrier; we'd like to be able to late-bind to the actual implementation (i.e., we don't want to freeze the name of a carrier class in the method descriptor), and at least for records, we'd like to let the record instance itself be the carrier (since it is immutable and we can just invoke the accessors to get the bindings.)

#### Carriers

As part of the work on template strings, Jim has put back some code that was originally written for the purpose of translating patterns, called "carriers".  There are methods / bootstraps that take a MethodType and return method handles to (a) encode values of those types into an opaque carrier object and (b) pull individual values out of a carrier.  This means that the choice of carrier object can be deferred to runtime, as long as both the bundling and unbundling method handles agree on the carrier form.

The choice of carrier is largely a footprint/specificity tradeoff.  One could imagine a carrier class per shape, or a single carrier class that wraps an Object[], or caching some number of common shapes (three ints and two refs).  This sort of tuning should be separate from the protocol encoded in the bytecode of the pattern method and its clients.
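As a rough illustration of the shape of such a runtime, the sketch below picks the simplest of the options above: a single carrier form that wraps an `Object[]`.  The class `ArrayCarriers` and its methods are hypothetical stand-ins, not the actual carriers code; the point is only that clients see method handles typed by the descriptor and an opaque `Object`, never the carrier class.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical Object[]-backed carrier runtime, for illustration only.
final class ArrayCarriers {

    // Handle of type (bindings...)Object that bundles the bindings.
    static MethodHandle constructor(MethodType descriptor) {
        return MethodHandles.identity(Object[].class)
                .asCollector(Object[].class, descriptor.parameterCount())
                .asType(descriptor.changeReturnType(Object.class));
    }

    // Handle of type (Object)T that pulls binding i out of a carrier.
    static MethodHandle component(MethodType descriptor, int i) {
        MethodHandle getter = MethodHandles.arrayElementGetter(Object[].class);
        getter = MethodHandles.insertArguments(getter, 1, i);
        return getter.asType(
                MethodType.methodType(descriptor.parameterType(i), Object.class));
    }

    // Convenience wrappers so callers need not deal with Throwable.
    static Object pack(MethodType descriptor, Object... values) {
        try {
            return constructor(descriptor).invokeWithArguments(values);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static Object get(MethodType descriptor, int i, Object carrier) {
        try {
            return component(descriptor, i).invoke(carrier);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MethodType descriptor =
                MethodType.methodType(void.class, int.class, String.class);
        Object carrier = pack(descriptor, 7, "seven");
        System.out.println(get(descriptor, 0, carrier));
        System.out.println(get(descriptor, 1, carrier));
    }
}
```

Swapping in a class-per-shape carrier would change only the bodies of `constructor` and `component`; the handle types seen by the pattern method and its clients would stay the same.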

The pattern matching runtime will provide some condy bootstraps which wrap the Carriers behavior.

Since at least some patterns are conditional, we have to have a way to encode failure into the protocol.  For a partial pattern, we can use a B2 carrier and use null to encode failure to match; for a total pattern, we can use a B3 carrier.

#### Proposed encoding

Earlier explorations did a lot of work to preserve the optimization that a match target can be its own carrier.  But further analysis reveals that the cost of doing so for other than records is pretty substantial and works against the model of a pattern declaration being an imperative body of code that runs at match time.  So for record patterns, we can "inline" them by using `instanceof` as the applicability test and accessors for extraction, and for all other patterns, go through the carrier runtime.
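For the record case, the "inlined" form can be sketched by hand in plain Java.  This is only an approximation of what a match against `Point(int x, int y)` could lower to; the names `RecordPatternInlining` and `sumIfPoint` are made up for the example, and any additional checks the compiler may emit are left out.

```java
// A record is its own carrier: immutable, with one accessor per binding.
record Point(int x, int y) {}

final class RecordPatternInlining {

    // Hand-written equivalent of matching `o` against Point(int x, int y).
    static int sumIfPoint(Object o) {
        if (o instanceof Point) {   // applicability test
            Point p = (Point) o;
            int x = p.x();          // extraction via accessors
            int y = p.y();
            return x + y;
        }
        return -1;                  // no match
    }

    public static void main(String[] args) {
        System.out.println(sumIfPoint(new Point(3, 4)));
        System.out.println(sumIfPoint("not a point"));
    }
}
```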

This allows us to encode pattern methods as

    Object name$mangle(ARGS)

and have the pattern method do the match and return a carrier (or null), using the carrier object that the carrier runtime associates with the pattern descriptor.  And clients can take apart the result again using the extraction logic that the carrier runtime associates with the pattern descriptor.
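Here is a hand-written sketch of that protocol for a hypothetical partial static pattern that matches strings of the form `"x,y"` and binds two ints.  The method name `coords$II`, the `Object[]` carrier, and the helper names are all assumptions for illustration; in the real scheme the carrier and the extraction logic would come from the carrier runtime rather than being written inline.

```java
final class PatternMethodSketch {

    // Stand-ins for what the carrier runtime would supply for the
    // descriptor (int, int); here the carrier is just an Object[].
    static Object makeCarrier(int x, int y) {
        return new Object[] { x, y };
    }

    static int binding(Object carrier, int index) {
        return (Integer) ((Object[]) carrier)[index];
    }

    // Stand-in for a generated pattern method: takes the pattern's input
    // and returns an opaque carrier, or null if the pattern does not match.
    static Object coords$II(String target) {
        String[] parts = target.split(",");
        if (parts.length != 2) {
            return null;
        }
        try {
            return makeCarrier(Integer.parseInt(parts[0].trim()),
                               Integer.parseInt(parts[1].trim()));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Stand-in for generated client code: invoke, test for null, extract.
    static String describe(String s) {
        Object carrier = coords$II(s);
        if (carrier == null) {
            return "no match";
        }
        int x = binding(carrier, 0);
        int y = binding(carrier, 1);
        return "x=" + x + ", y=" + y;
    }

    public static void main(String[] args) {
        System.out.println(describe("3,4"));
        System.out.println(describe("nope"));
    }
}
```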

This also means that instance patterns "just work" because virtual dispatch selects the right implementation for us automatically, and all implementations that can be overrides will also implicitly agree on the encoding.
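A small sketch of the instance-pattern case, again written by hand with made-up names (`Shape`, `extent$II`) and an `Object[]` standing in for the runtime-chosen carrier: the client calls one method, and ordinary virtual dispatch picks the implementation.

```java
// Stand-in for a type declaring an instance pattern with descriptor (int, int).
interface Shape {
    Object extent$II();   // returns a carrier, or null for no match
}

record Rect(int w, int h) implements Shape {
    public Object extent$II() {
        return new Object[] { w, h };
    }
}

record Circle(int r) implements Shape {
    public Object extent$II() {
        // Partial: a degenerate circle does not match.
        return r > 0 ? new Object[] { 2 * r, 2 * r } : null;
    }
}

final class InstancePatternSketch {

    // The client neither knows nor cares which override ran; every override
    // shares the descriptor and therefore the carrier encoding.
    static String describe(Shape s) {
        Object carrier = s.extent$II();
        if (carrier == null) {
            return "no match";
        }
        Object[] bindings = (Object[]) carrier;
        return bindings[0] + "x" + bindings[1];
    }

    public static void main(String[] args) {
        System.out.println(describe(new Rect(40, 30)));
        System.out.println(describe(new Circle(5)));
        System.out.println(describe(new Circle(0)));
    }
}
```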

Because patterns are methods, we can take advantage of all the affordances of methods.  We can use access bits to control accessibility in the obvious way; we can use the attributes that carry annotations, method parameter metadata, and generics signatures to carry information about the pattern declaration and its parameters.  What's missing is a place to put metadata for the *bindings*, and to record the fact that this is a pattern implementation and not an ordinary method.  So, we add the following attribute on pattern methods:

    Pattern {
        u2 attr_name;
        u4 attr_length;
        u2 patternFlags; // bitmask
        u2 patternName;  // index of UTF8 constant
        u2 patternDescr; // index of MethodType (or alternately UTF8) constant
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

This says that "this method is a pattern", reifies the name of the pattern (patternName), reifies the pattern descriptor (patternDescr) which encodes the types of the bindings as a method descriptor or MethodType, and has attributes which can carry annotations, parameter metadata, and signature metadata for the bindings.  The existing attributes (e.g. Signature, MethodParameters, RuntimeVisibleAnnotations) can be reused as is, with the interpretation that this is the signature (or names, or annos) of the *bindings*, not the input parameters.  Flags can carry things like "deconstructor pattern" or "partial pattern" as needed.
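To show the byte layout, here is a minimal sketch that writes and reads a `Pattern` attribute with no nested attributes.  The flag values and the constant pool indices are invented for the example (nothing above assigns them), and `attr_length` follows the usual classfile convention of excluding the six bytes of the name index and length.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PatternAttributeSketch {

    // Hypothetical flag bits; the actual assignments are not specified.
    static final int FLAG_DECONSTRUCTOR = 0x0001;
    static final int FLAG_PARTIAL       = 0x0002;

    // Serializes a Pattern attribute with attributes_count == 0.
    static byte[] encode(int attrNameIndex, int flags,
                         int patternNameIndex, int patternDescrIndex) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes); // big-endian
            out.writeShort(attrNameIndex);     // u2 attr_name
            out.writeInt(8);                   // u4 attr_length: four u2 fields
            out.writeShort(flags);             // u2 patternFlags
            out.writeShort(patternNameIndex);  // u2 patternName
            out.writeShort(patternDescrIndex); // u2 patternDescr
            out.writeShort(0);                 // u2 attributes_count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads patternFlags back out of an encoded attribute.
    static int flagsOf(byte[] attribute) {
        try {
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(attribute));
            in.readUnsignedShort(); // skip attr_name
            in.readInt();           // skip attr_length
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] attr = encode(10, FLAG_DECONSTRUCTOR, 11, 12);
        System.out.println(attr.length);
        System.out.println((flagsOf(attr) & FLAG_PARTIAL) != 0);
    }
}
```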

## Reflection

We already have a sensible base class in the reflection library for reflecting patterns: Executable.  All of the methods on Executable make sense for patterns, including Object as the return type.  If the pattern is reflectively invoked, it will return null (for no match) or an Object[]; this Object[] can be thought of as the boxing of the carrier.  Since the method return type is Object, this is an entirely reasonable interpretation.

We need some additional methods to describe the bindings, so we would have a subtype of Executable for Pattern, with methods like getBindings(), getAnnotatedBindings(), getGenericBindings(), isDeconstructor(), isPartial(), etc.

## Summary

This design borrows from previous rounds, but makes a number of simplifications.

 - The bindings of a pattern are captured in a MethodType, called the _pattern descriptor_.  The parameters of the pattern descriptor are the types of the bindings; the return type is the minimal type that will match the pattern (but is not as important as the bindings.)
 - Patterns are translated as methods whose names are derived, deterministically, from the name of the pattern and the erasure of the pattern descriptor.  These are called pattern methods.  Pattern methods take as parameters the input parameters of the pattern, and return Object.
 - The returned object is an opaque carrier.  Null means the pattern didn't match.  A non-null value is the carrier type (from the carrier runtime) which is derived from the pattern descriptor.
 - Pattern methods are not directly invocable from the source language; they are invoked indirectly through pattern matching, or reflection.
 - Generated code invokes the pattern method and interprets the returned value according to the protocol, using MHs from the pattern runtime to access the bindings.
 - Pattern methods have a Pattern attribute, which captures information about the pattern as a whole (is a total/partial, a deconstructor, etc) and parameter-related attributes which describe the bindings.
 - Patterns are reflected through a new subtype of Executable, which exposes new methods to reflect over bindings.
 - When invoking a pattern method reflectively, the carrier is boxed to an Object[].
