Author: larry
Date: Sun Jan  7 00:50:30 2007
New Revision: 13515

Modified:
   doc/trunk/design/syn/S03.pod

Log:
Smartmatching is now hopefully more consistent, extensible, and optimizable.
(Suggestion to use single dispatch semantics on pattern was from luqui++.)
After single dispatch, pattern can then choose to multi-dispatch the topic.
The new table is just the first whack at matching under new rules, so please
consider the individual entries and their semantics to still be negotiable.


Modified: doc/trunk/design/syn/S03.pod
==============================================================================
--- doc/trunk/design/syn/S03.pod        (original)
+++ doc/trunk/design/syn/S03.pod        Sun Jan  7 00:50:30 2007
@@ -12,9 +12,9 @@
 
   Maintainer: Larry Wall <[EMAIL PROTECTED]>
   Date: 8 Mar 2004
-  Last Modified: 4 Jan 2007
+  Last Modified: 6 Jan 2007
   Number: 3
-  Version: 83
+  Version: 84
 
 =head1 Changes to Perl 5 operators
 
@@ -596,87 +596,221 @@
 
 =head1 Smart matching
 
-Below is the current table of smart matches.  The list is intended
-to reflect forms that can be recognized at compile time.  To avoid
-explosion of options, the following types are remapped for the
-compile-time lookup only:
+Here is the table of smart matches for standard Perl 6
+(that is, the dialect of Perl in effect at the start of your
+compilation unit).  Smart matching is generally done on the current
+"topic", that is, on C<$_>.  In the table below, C<$_> represents the
+left side of the C<~~> operator, or the argument to a C<given>,
+or to any other topicalizer.  C<$x> represents the pattern to be
+matched against on the right side of C<~~>, or after a C<when>.
+
+The first section contains privileged syntax; if a match can be done
+via one of those entries, it will be.  These special syntaxes are
+dispatched by their form rather than their type.  Otherwise the rest
+of the table is used, and the match will be dispatched according to
+the normal method dispatch rules.  The optimizer is allowed to assume
+that no additional match operators are defined after compile time,
+so if the pattern types are evident at compile time, the jump table
+can be optimized.  However, the syntax of this part of the table
+is still somewhat privileged, insofar as the C<~~> operator is one
+of the few operators in Perl that does not use multiple dispatch.
+Instead, type-based smart matches singly dispatch to an underlying
+method belonging to the C<$x> pattern object.
+
+In other words, smart matches are dispatched first on the basis of the
+pattern's form or type (the C<$x> below), and then that pattern itself
+decides whether and how to pay attention to the type of the topic
+(C<$_>).  So the second column below is really the primary column.
+The C<Any> entries in the first column indicate a pattern that either
+doesn't care about the type of the topic, or that picks that entry
+as a default because the more specific types listed above it didn't match.
+
+    $_        $x        Type of Match Implied   Match if
+    ======    =====     =====================   =============
+    Any       Code:($)  scalar sub truth        $x($_)
+    Any       Code:()   simple closure truth    $x() (ignoring $_)
+    Any       undef     undefined               not defined $_
+    Any       *         block signature match   block successfully binds to |$_
+    Any       .foo      method truth            ?any($_.foo)
+    Any       .foo(...) method truth            ?any($_.foo(...))
+    Any       .(...)    list sub call truth     ?any($_(...))
+    Any       .[...]    array value slice truth ?any($_[...])
+    Any       .{...}    hash value slice truth  ?any($_{...})
+    Any       .<...>    hash value slice truth  ?any($_<...>)
+
+    Any       Bool      simple truth            $x.true given $_
+
+    Num       Num       numeric equality        +$_ == $x
+    Capture   Num       numeric equality        +$_ == $x
+    Array     Num       array contains number   any(@$_) == $x
+    Hash      Num       hash key existence      $_.exists($x)
+    Byte      Num       numeric equality        +$_ == $x
+    Any       Num       numeric equality        +$_ == $x
+
+    Str       Str       string equality         $_ eq $x
+    Capture   Str       string equality         ~$_ eq $x
+    Array     Str       array contains string   any(@$_) eq $x
+    Hash      Str       hash key existence      $_.exists($x)
+    Byte      Str       string equality         ~$_ eq $x
+    Any       Str       string equality         ~$_ eq $x
+
+    Buf       Buf       buffer equality         $_ eq $x
+    Str       Buf       string equality         $_ eq Str($x)
+    Array     Buf       arrays are comparable   $_ »===« @$x
+    Hash      Buf       hash key existence      $_.exists($x)
+    Any       Buf       buffer equality         Buf($_) eq $x
+
+    Buf       Byte      buffer contains byte    $_.match(/$x/)
+    Str       Byte      string contains byte    Buf($_).match(/$x/)
+
+    Str       Char      string contains char    $_.match(/$x/)
+    Buf       Char      string contains char    Str($_).match(/$x/)
+
+    Set       Set       identical sets          $_ === $x
+    Hash      Set       hash keys same set      $_.keys === $x
+    Array     Set       array equiv to set      Set($_) === $x
+    Any       Set       identical sets          Set($_) === $x
+
+    Array     Array     arrays are comparable   $_ »===« $x
+    Buf       Array     arrays are comparable   @$_ »===« $x
+    Str       Array     array contains string   any(@$x) eq $_
+    Num       Array     array contains number   any(@$x) == $_
+    Hash      Array     hash slice exists       $_.exists(any(@$x))
+    Scalar    Array     array contains object   any(@$x) === $_
+    Set       Array     array equiv to set      $_ === Set($x)
+    Any       Array     lists are comparable    @$_ »===« $x
+
+    Hash      Hash      hash keys same set      $_.keys === $x.keys
+    Set       Hash      hash keys same set      $_ === $x.keys
+    Array     Hash      hash slice existence    $x.exists(any @$_)
+    Regex     Hash      hash key grep           any($_.keys) === /$x/
+    Scalar    Hash      hash entry existence    $x.exists($_)
+    Any       Hash      hash slice existence    $x.exists(any @$_)
+
+    Str       Regex     string pattern match    $_.match($x)
+    Hash      Regex     hash key grep           any($_.keys) === /$x/
+    Array     Regex     match array as string   cat(@$_).match($x)
+    Any       Regex     pattern match           $_.match($x)
+
+    Num       Range     in numeric range        $x.min <= $_ <= $x.max (mod ^'s)
+    Str       Range     in string range         $x.min le $_ le $x.max (mod ^'s)
+    Any       Range     in generic range        [!after] $x.min,$_,$x.max (etc.)
+
+    Any       Type      type membership         $_.does($x)
+
+    Signature Signature sig compatibility       $_ is a subset of $x      ???
+    Code      Signature sig compatibility       $_.sig is a subset of $x  ???
+    Capture   Signature parameters bindable     $_ could bind to $x (doesn't!)
+    Any       Signature parameters bindable     |$_ could bind to $x (doesn't!)
+
+    Signature Capture   parameters bindable     $x could bind to $_
+
+    Set       Scalar    set member exists       any($_.keys) === $x
+    Hash      Scalar    hash key exists         any($_.keys) === $x
+    Array     Scalar    array contains item     any(@$_) === $x
+    Scalar    Scalar    scalars are identical   $_ === $x
+
+All smartmatch types are scalarized; both C<~~> and C<given>/C<when>
+provide scalar contexts to their arguments, and autothread any
+junctive matches so that the eventual dispatch to C<.accepts> never
+sees anything "plural".  So both C<$_> and C<$x> above are potentially
+container objects that are treated as scalars.  (You may hyperize
+C<~~> explicitly, though.  In this case all smartmatching is done
+using the type-based dispatch to C<.accepts>, not the form-based
+dispatch at the front of the table.)
+
+The exact form of the underlying type-based method dispatch is:
+
+    $x.accepts($_)      # for ~~
+    $x.rejects($_)      # for !~~
+
+As a single dispatch call this pays attention only to the type of
+C<$x> initially.  The C<accepts> method interface is defined by the
+C<Pattern> role.  Any class composing the C<Pattern> role may choose
+to provide a single C<accepts> method to handle everything, which
+corresponds to those pattern types that have only one entry with
+an C<Any> on the left above.  Or the class may choose to provide
+multiple C<accepts> multi-methods within the class, and these
+will then redispatch within the class based on the type of C<$_>.
+The class may also define one or more C<rejects> methods; if it does
+not, the default C<rejects> method from the C<Pattern> role defines
+it in terms of a negated C<accepts> method call.  This generic method
+may be less efficient than a custom C<rejects> method would be, however.
+
+The smartmatch table is primarily intended to reflect forms and types that
+are recognized at compile time.  To avoid an explosion of entries,
+the table assumes the following types will behave similarly:
 
     Actual type                 Use entries for
     ===========                 ===============
     List Seq                    Array
     KeySet KeyBag KeyHash       Hash
-    .{Any} .<string> .[number]  .method
     Class Subset Enum Role      Type
-    Subst                       Regex
+    Subst Grammar               Regex
     Buf Char LazyStr            Str
     Int UInt etc.               Num
+    Match                       Capture
 
-Note that all types are scalarized.  Both C<~~> and C<given>/C<when>
-provide scalar contexts to their arguments.  (You can always
-hyperize C<~~> explicitly, though.)  So both C<$_> and C<$x> here
-are potentially container objects.  The first section contains
-privileged syntax; if a match can be done via one of those entries,
-it will be.  Otherwise the rest of the table is used, and the match
-will be dispatched according to the normal rules of multiple dispatch;
-however, the optimizer is allowed to assume that no C<< infix:<~~> >>
-operators are added at run time, so if the argument types are evident
-at compile time, the jump table can be optimized.  By definition all
-normal arguments can be matched to at least one of the entries below.
-
-    $_      $x        Type of Match Implied    Match if
-    ======  =====     =====================    =============
-    Any     Code:($)  scalar sub truth         $x($_)
-    Any     .method   method truth*            $_.method
-    Any     boolean   simple expression truth* $x.true given $_
-    Any     undef     undefined                not defined $_
-    Any     *         default                  True
-
-    Num     Num       numeric equality         $_ == $x
-    Num     Junction  numeric equality         $_ == $x
-    Str     Str       string equality          $_ eqv $x
-    Str     Junction  string equality          $_ eqv $x
-
-    Hash    Hash      hash keys identical sets $_.keys === $x.keys
-    Hash    Array     hash value slice truth   $_{any(@$x)}
-    Hash    Junction  hash key slice existence $_.exists($x)
-    Hash    Regex     hash key grep            any($_.keys) === /$x/
-
-    Array   Array     arrays are comparable    $_ »===« $x
-    Array   Regex     match array like string  cat(@$_) ~~ $x
-    Array   Junction  list intersection        any(@$_) ~~ $x
-    Array   Num       array contains number    any($_) == $x
-    Array   Str       array contains string    any($_) eqv $x
-    Array   Buf       array equivalent to buf  $_ eqv Array($x)
-    Array   Set       array equivalent to set  Set($_) === $x
-
-    Code    Signature signature compatibility* $_ is a subset of $x
-  Signature Signature signature compatibility  $_ is a subset of $x
-
-    Hash    Any       hash entry existence     exists $_{$x}
-    Array   Any       array contains item*     any($_) === $x
-    Any     Signature parameter binding        $_ can bind to $x
-    Any     Range     in range                 [!after] $x.min,$_,$x.max (etc.)
-    Any     Regex     pattern match            $_.match($x)
-    Any     Type      type membership          $_.does($x)
-    Any     Code:()   simple closure truth*    $x() (ignoring $_)
-    Any     Any       run-time dispatch        infix:<~~>($_, $x)
-
-Matches marked with * are non-reversible, typically because C<~~> takes
-its left side as the topic for the right side, and sets the topic to a
-private instance of C<$_> for its right side, so C<$_> means something
-different on either side.  Such non-reversible constructs can be made
-reversible by putting the leading term into a closure to defer the
-binding of C<$_>.  For example:
-
-    $x ~~ .does(Storable)       # okay
-    .does(Storable) ~~ $x       # not okay--gets wrong $_ on left
-    { .does(Storable) } ~~ $x   # okay--closure binds its $_ to $x
-
-Exactly the same consideration applies to C<given> and C<when>:
-
-    given $x { when .does(Storable) {...} }      # okay
-    given .does(Storable) { when $x {...} }      # not okay
-    given { .does(Storable) } { when $x {...} }  # okay
+(Note, however, that these mappings can be overridden by explicit
+definition of the appropriate C<accepts> and C<rejects> methods.
+If the redefinition occurs at compile time prior to analysis of the
+smart match then the information is also available to the optimizer.)
+
+Matching against a C<Grammar> object will call the first rule defined
+in the grammar.
+
+Matching against a C<Signature> does not actually bind any variables,
+but only tests to see if the signature I<could> bind.  To really bind
+to a signature, use the C<*> pattern to delegate binding to the C<when>
+statement's block instead.  Matching against C<*> is special in that
+it takes its truth from whether the subsequent block is bound against
+the topic, so you can do ordered signature matching:
+
+    given $capture {
+        when * -> Int $a, Str $b { ... }
+        when * -> Str $a, Int $b { ... }
+        when * -> $a, $b         { ... }
+        when *                   { ... }
+    }
+
+This can be useful when the unordered semantics of multiple dispatch
+are insufficient for defining the "pecking order" of code.  Note that
+you can bind to either a bare block or a pointy block.  Binding to a
+bare block conveniently leaves the topic in C<$_>, so the final form
+above is equivalent to a C<default>.  (Placeholder parameters may
+also be used in the bare block form, though of course their types
+cannot be specified that way.)
+
+There is no pattern matching defined for the C<Any> pattern, so if you
+find yourself in the situation of wanting a reversed smartmatch test
+with an C<Any> on the right, you can almost always get it by explicit
+call to the underlying C<accepts> method using $_ as the pattern.
+For example:
+
+    $_      $value    Type of Match Wanted   What to use on the right
+    ======  ======    ====================   ========================
+    Code    Any       scalar sub truth       .accepts($value) or .($value)
+    Range   Any       in range               .accepts($value)
+    Type    Any       type membership        .accepts($value) or .does($value)
+    Regex   Any       pattern match          .accepts($value)
+    etc.
+
+Similar tricks will allow you to bend the default matching rules for
+composite objects as long as you start with a dotted method on $_:
+
+    given $somethingordered {
+        when .values.'[<=]'     { say "increasing" }
+        when .values.'[>=]'     { say "decreasing" }
+    }
+
+In a pinch you can define a macro to do the "reversed when":
+
+    my macro statement_control:<accepts> () { "when .accepts: " }
+    given $pattern {
+        accepts $a      { ... }
+        accepts $b      { ... }
+        accepts $c      { ... }
+    }
 
 Boolean expressions are those known to return a boolean value, such
 as comparisons, or the unary C<?> operator.  They may reference C<$_>
@@ -703,38 +837,10 @@
 a boolean context.  However, for certain operands such as regular
 expressions, use of the operator within scalar or list context transfers
 the context to that operand, so that, for instance, a regular expression
-can return a list of matched substrings, as in Perl 5.  The complete
-list of such operands is TBD.
-
-The C<~~> operator is intended primarily for compile-time resolution,
-and if the types of the operands resolve at compile time according
-to the table above, any C<< infix:<~~> >> routines added later are
-completely ignored.  If the types cannot be matched at compile time,
-(that is, if the arguments match only the Any/Any rule at compile
-time), the match is deferred to a true run-time multple dispatch to
-all C<< infix:<~~> >> infix definitions that exist at the moment.
-
-The run-time C<< infix:<~~> >> definitions are intended to reproduce
-as closely as possible the compile-time table above, but it can do
-this based only on the run-time types of the arguments.  Therefore
-only the entries above that indicate a type on both sides can be
-dispatched that way.  (You can tell those because both sides start
-with a capital letter.  So multiple dispatch ignores the ".method",
-"boolean", "undef", and "*" entries in the first section, which are
-recognized syntactically, not by type.)
-
-If there is no appropriate signature match under the rules of multiple
-dispatch, the most generic multi definition of C<< infix:<~~> >>
-defaults to calling C<===> to match the two variables exactly
-according to their type.  In general you should just rely on this
-and not attempt to define your own C<< infix:<~~> >> operators,
-because complexifying the run-time semantics of C<~~> is not doing
-anyone a favor.  This is one of those mechanisms we provide knowing
-that people I<will> shoot themselves in the foot with it.  However,
-we also recognize that we probably aren't aware of all useful forms of
-pattern matching, especially the ones that haven't been invented yet.
-We choose to make it possible to add such forms using C<~~>.  Please
-construe this as future proofing, not idiot proofing.
+can return a list of matched substrings, as in Perl 5.  This is done
+by returning an object that can return a list in list context, or that
+can return a boolean in a boolean context.  In the case of regex matching,
+the C<Match> object is a kind of C<Capture>, which has these capabilities.
 
 For the purpose of smartmatching, all C<Set> and C<Bag> values are
 considered to be of type C<KeyHash>, that is, C<Hash> containers
@@ -1386,7 +1492,7 @@
     for all(@foo) {...}
 
 it indicates to the compiler that there is no coupling between loop
-iterations and they can be run in any order or even in parallel.
+iterations and they can be run in any order or even in parallel.  XXX bogus
 
 Use of negative operators with syntactically recognizable junctions may
 produce a warning on code that works differently in English than in Perl.
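
The dispatch scheme described in the log message and in the new S03 text
(single dispatch on the pattern, after which the pattern may redispatch on
the type of the topic) can be modelled roughly as follows. This is only a
sketch in Python, not Perl 6 and not part of the commit: the names Pattern,
NumPattern, TypePattern and smartmatch are hypothetical stand-ins, and only
the Num and Type rows of the table are imitated.

```python
from functools import singledispatchmethod


class Pattern:
    """Hypothetical stand-in for the Pattern role: subclasses supply accepts()."""

    def accepts(self, topic):
        raise NotImplementedError

    def rejects(self, topic):
        # Default rejects is defined as a negated accepts call.
        return not self.accepts(topic)


class NumPattern(Pattern):
    """Imitates the rows of the table whose pattern ($x) is a Num."""

    def __init__(self, value):
        self.value = value

    @singledispatchmethod
    def accepts(self, topic):
        # "Any Num" row: numify the topic and compare.
        return float(topic) == self.value

    @accepts.register(list)
    def _(self, topic):
        # "Array Num" row: the array contains the number.
        return any(item == self.value for item in topic)

    @accepts.register(dict)
    def _(self, topic):
        # "Hash Num" row: hash key existence.
        return self.value in topic


class TypePattern(Pattern):
    """Imitates the "Any Type" row with one accepts that handles everything."""

    def __init__(self, cls):
        self.cls = cls

    def accepts(self, topic):
        return isinstance(topic, self.cls)


def smartmatch(topic, pattern):
    # Models `topic ~~ pattern`: only the pattern's class selects the method;
    # any further choice based on the topic happens inside the pattern.
    return pattern.accepts(topic)
```

In this model, NumPattern plays the part of a class that provides several
accepts multi-methods, while TypePattern plays the part of a class with a
single accepts and an Any on the left of the table. Both inherit the default
rejects.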
