This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Objects : Native support for multimethods =head1 VERSION Maintainer: Damian Conway <[EMAIL PROTECTED]> Date: 18 September 2000 Mailing List: [EMAIL PROTECTED] Number: 256 Version: 1 Status: Developing =head1 ABSTRACT This RFC proposes that Perl natively support multiply dispatched subroutines (multimethods). =head1 DESCRIPTION Multimethods solve a class of problems in which objects of two or more hierarchies must interact polymorphically, but where the nature of the interaction is determined by the run-time type of I<both> (I<all>) the objects. =head2 Theoretical model It is proposed that calls to certain (explicitly marked) subroutines be dispatched using a different mechanism from that used for regular Perl 5 subroutines, or that used for Perl 5 methods. These explicitly marked subroutines would share the same name, but provide unique parameter lists. Collectively all marked subroutines with the same name but different parameters lists would be known as a I<multimethod>, with each individual subroutine known as a I<variant> of the multimethod. The new dispatch mechanism would look at the classes or types of each argument to the subroutine (by calling C<ref> on each) and determine the "closest" matching variant of the multimethod, according to the parameter types specified in the variants' parameter list. The result would subsume the behaviour of function overloading (e.g. in C++) but in a more sophisticated manner, since multimethods take the run-time inheritance relationships of each argument into account. Another way of thinking of the mechanism is that it would perform polymorphic dispatch on every argument of a multimethod, not just on the first. =head2 Defining multimethods A single variant of a multimethod would be defined by specifying a subroutine with a full parameter list (as per RFC 128), and appending the attribute C<:multi>. Two or more such subroutines could be defined with the same name, in the same package namespace, provide they were both defined with the C<:multi> attribute, and their parameter lists were not identical. For example: package LargeNum; package LargeInt; use base 'LargeNum'; package LargeFloat; use base 'LargeNum'; package LargeMath; sub divide(LargeInt $a, LargeInt $b) : multi { ... } sub divide(LargeInt $a, LargeFloat $b) : multi { ... } This creates a (single) multimethod called C<LargeMath::divide> with two variants. If the multimethod is called with two references to LargeInt objects as arguments: $x = LargeInt->new(1234567890); $y = LargeInt->new(9876543210); $quotient = divide($x, $y); then the first variant is invoked. If the multimethod is called with a LargeInt reference and a LargeFloat reference: $x = LargeInt->new(1234567890); $y = LargeFloat->new(9876543210.12345); $quotient = divide($x, $y); then the second variant is called. Calling the multimethod with any other combination of LargeNum-derived reference arguments (e.g. a reference to a LargeFloat and a reference to a LargeInt, or two LargeFloat references) results in an exception being thrown, with the message: No viable candidate for call to multimethod divide To avoid this, a "catch-all" variant could be specified: sub divide(LargeNum $b, LargeNum $b) : multi { ... } Now, calling C<divide> with (for example) a LargeFloat reference and a LargeInt reference causes this third variant to be invoked. That's because a LargeFloat I<is-a> LargeNum (so the first argument is compatible with the first parameter), and a LargeInt I<is-a> LargeNum too (so the second argument is compatible with the second parameter). Note that adding this third variant doesn't affect calls to the other two, since multiple dispatch always selects the nearest match. =head2 Finding the "nearest" multimethod The usefulness of multiple dispatch depends on how intelligently the dispatcher decides which variant of a multimethod is "nearest" to a given set of arguments. That decision process is called I<dispatch resolution>, and it is proposed that it be done (or I<appear> to be done, modulo optimization) like this: =over 4 =item 1. If the types of the arguments given (as determined by applying C<ref> to each in turn) exactly match the set of parameter types specified in any variant of the multimethod, that variant is immediately called. =item 2. Otherwise, the dispatch mechanism compiles a list of viable targets. A viable target is a variant of the multimethod with the same number of parameters as there were arguments passed. In addition, for each parameter the specified parameter type must be a base class of the actual type of the corresponding argument in the actual call (i.e. each argument must belong to a subclass of the corresponding parameter type). =item 3. If there is only one viable target, it is called immediately. If there are no viable targets, an exception is thrown indicating that the multimethod can't handle the specific set of arguments. (but see L<Handling dispatch failure> below). =item 4. Otherwise, the dispatch mechanism examines each viable target and computes its inheritance distance from the actual set of arguments. The inheritance distance from a single argument to the corresponding parameter is the number of inheritance steps between their respective classes (working up the tree from argument to parameter). If there's no inheritance path between them, the distance is infinite. The inheritance distance for a set of arguments is just the sum of their individual inheritance distances. =item 5. The dispatch mechanism then chooses the viable target with the smallest inheritance distance as the actual target. If more than one viable target has the same smallest distance, the call is ambiguous. In that case, the dispatch process fails and an exception is thrown (but see L<Handling dispatch failure> below) If there's only a single actual target, its identity is cached (to accelerate subsequent dispatching), and then the actual target is invoked. =back =head2 Handling built-in types Because the multiple dispatch mechanism applies C<ref> to determine the type of each argument, it is indifferent to whether those arguments are references to blessed objects, or just regular Perl data types. That means that multimethod variants can also be defined to process built-in data types: sub stringify(ARRAY $a) : multi { my @arg = map { stringify $_ } @a; return "[" . join(", ",@a) . "]"; } sub stringify(HASH $h) : multi { my %arg = map { stringify $_ } %h; return "{" . join(", ", map( "$_=>$h{$_}", keys %h) ) . "}"; } sub stringify(CODE $c) : multi { return "sub {???}"; } sub stringify(Regexp $r) : multi { return "qr/$r/"; } sub stringify(UNIVERSAL $r) : multi { # *any* type of object return $obj->stringify() if $obj->can('stringify'); return "???"; } sub stringify($scalar) : multi { # catch-all if ref() eq "" return qq{'$scalar'}; } # and later... print stringify([1,2,3]); print stringify({a=>1,b=>[1,2,3],c=>MyClass->new(),d=>qr/#.*/}); print stringify(42); Notes: =over 4 =item * In the above example, C<stringify> is used recursively in processing the contents of arrays and hashes. This ensures that nested components (as in the second of the "and later..." examples) are correctly stringified. This indicates some of the polymorphic power of multimethods, in that the single C<map>-ped call to C<stringify> suffices to correctly stringify every array element and hash entry, no matter what type that nested element or entry is. =item * The parameter type C<UNIVERSAL> provides a means of writing a single variant that captures references to I<any> blessed object, since every blessed object automatically inherits from C<UNIVERSAL>. =item * A variant whose argument is untyped (i.e. the last variant in the above example) represents a "catch-all" case, which in this particular case would receive all calls to C<stringify> which had non-reference scalar arguments. If the C<stringify(UNIVERSAL)> variant did not exist, the untyped variant would also receive all calls with blessed objects as arguments. =back =head2 Handling dispatch failure It's relatively easy to set up a multimethod such that particular combinations of argument types cannot be correctly dispatched. For example, consider the following variants of a multimethod called C<put_peg>: package RoundPeg; use base 'Peg'; package SquareHole; use base 'Hole'; sub put_peg(RoundPeg,Hole) : multi { print "round peg in generic hole\n" } sub put_peg(Peg,SquareHole) : multi { print "generic peg in square hole\n" } sub put_peg(Peg,Hole) : multi { print "generic peg in generic hole\n" } If C<put_peg> is called like this: my $peg = RoundPeg->new(); my $hole = SquareHole->new(); put_peg($peg, $hole); then the call can't be dispatched, because there's no way to decide between the variants C<(RoundPeg,Hole)> and C<(Peg,SquareHole)>, each of which is the same inheritance distance (i.e. 1 derivation) from the actual arguments. The response to this situation (proposed above) is to throw an exception like this: Cannot resolve call to multimethod put_peg(RoundPeg,SquareHole). The multimethods: put_peg(RoundPeg,Hole) put_peg(Peg,SquareHole) are equally viable This is the obvious behaviour, since the case truly I<is> ambiguous. Some languages (e.g. CLOS) take a different approach -- breaking the tie by a recount on the inheritance distance of each argument starting from the left. In this example, that would mean that the call would be dispatched to C<put_peg(RoundPeg,Hole)>, since the left-most parameter of that variant is a "closer" match than the left-most parameter of the C<put_peg(Peg,SquareHole)> variant. In the author's opinion, this approach is appalling, since it favours one parameter above all others for no better reason than it comes first in the argument list. But there is no reason why pegs should be more significant than holes. Moreover, arbitrarily resolving the dispatch in this way will often mask a fundamental design flaw in the multimethod. However, experience indicates that sometimes the more specialized variants of a multimethod are only provided as optimizations, and a more general variant (in this case, the C<(Peg,Hole)> variant) would suffice as a default where such an ambiguity exists. It is proposed that an additional parameterized attribute -- C<:default(ambiguous)> -- be provided so that one particular multimethod can be nominated as the dispatch recipient in cases of ambiguity: sub put_peg(Peg,Hole) : multi default(ambiguous) { print "some kinda peg in some kinda hole\n" } Now, whenever a call to C<put_peg> can't be dispatched because it's ambiguous, this default variant will be called, rather than throwing an exception. Multiple dispatch can also fail if there are I<no> suitable variants available to handle a particular call. For example: my $peg = JPEG->new(); my $hole = Loophole->new(); put_peg($peg, $hole); which, it is proposed above, would normally produce the exception: No viable candidate for call to multimethod put_peg(JPEG,Loophole) The classes JPEG and Loophole aren't in the Peg and Hole hierarchies, so there's no inheritance path back to a more general C<(Peg,Hole)> variant. It is proposed that the C<:default> attribute might also take an parameter C<failure>, which would indicate that the attributed variant should be called when dispatch fails to find any suitable variant: sub put_peg(@args) : multi default(failure) { print "Unknown kinda peg in unknown kinda hole\n"; } It is further proposed that a single variant could cover I<both> types of default behaviour, if it were specified with just the (unparameterized) C<default> attribute: sub put_peg(@args) : multi default { print "Any kinda peg in any kinda hole\n"; } =head2 Multimethod redispatch It is further proposed that the semantics of the proposed NEXT pseudoclass (RFC 190) be extended to cover redispatch of multimethods. Specifically, it is proposed that within a multimethod variant, any call of the form C<&NEXT::foo;> to the same multimethod would cause the dispatch mechanism to resume its dispatch process at the next most viable candidate(s) and thus select the next closest variant. For example, given the following multimethods: package Object; package HardObject; use base 'Object'; package SoftObject; use base 'Object'; package Simulation; sub collide(Object $obj1, Object $obj2) { print STDLOG "$obj1->{id} collided with $obj2->{id}"; } sub collide(HardObject $obj1, HardObject $obj2) { print "crunch!"; &NEXT::collide; } sub collide(SoftObject $obj1, SoftObject $obj2) { print "gloop!"; &NEXT::collide; } In this case, if one of the two more specialized collision variants is called, the more general variant that logs all collisions would also be invoked (and passed the same arguments). =head1 MIGRATION ISSUES None. =head1 IMPLEMENTATION See the Class::Multimethods module. =head1 REFERENCES "Object Oriented Perl", Chapter 13. RFC 128: Subroutines: Extend subroutine contexts to include name parameters and lazy arguments RFC 190 : NEXT pseudoclass for method redispatch