Author: larry
Date: Thu Aug 10 17:11:54 2006
New Revision: 10804

Modified:
   doc/trunk/design/syn/S02.pod
   doc/trunk/design/syn/S06.pod

Log:
First whack at defining semantics of MAIN subs.
Typo from Aaron Crane++.

Modified: doc/trunk/design/syn/S02.pod
==============================================================================
--- doc/trunk/design/syn/S02.pod (original)
+++ doc/trunk/design/syn/S02.pod Thu Aug 10 17:11:54 2006
@@ -440,7 +440,7 @@

 Some object types can behave as value types. Every object can produce
 a "safe key identifier" (C<SKID> for short) that uniquely identifies the
-object for hashing and other value-base comparisons. Normal objects
+object for hashing and other value-bases comparisons. Normal objects
 just use their address in memory, but if a class wishes to behave as a
 value type, it can define a C<.SKID> method that makes different objects
 look like the same object if they happen to have the same contents.

Modified: doc/trunk/design/syn/S06.pod
==============================================================================
--- doc/trunk/design/syn/S06.pod (original)
+++ doc/trunk/design/syn/S06.pod Thu Aug 10 17:11:54 2006
@@ -13,9 +13,9 @@

  Maintainer: Larry Wall <[EMAIL PROTECTED]>
  Date: 21 Mar 2003
- Last Modified: 9 Aug 2006
+ Last Modified: 10 Aug 2006
  Number: 6
- Version: 45
+ Version: 46

 This document summarizes Apocalypse 6, which covers subroutines and the
@@ -1326,6 +1326,8 @@

 =head2 Native types

+[This stuff belongs in S02.]
+
 Values with these types autobox to their uppercase counterparts when
 you treat them as objects:

@@ -1912,31 +1914,36 @@

 =head2 The C<leave> function

-A C<return> call causes the innermost surrounding subroutine,
-method, rule, token, regex (as a keyword), macro, or multimethod
-to return. Only declarations with an explicit keyword such as "sub"
-may be returned from. You may not return from a quotelike operator such
-as C<rx//>.
+As mentioned above, a C<return> call causes the innermost
+surrounding subroutine, method, rule, token, regex (as a keyword),
+macro, or multimethod to return.
Only declarations with an explicit
+keyword such as "sub" may be returned from. You may not return from
+a quotelike operator such as C<rx//>.
+
+To return from other types of code structures, the C<leave> function
+is used. The first argument, if supplied, specifies a C<Selector>
+for the control structure to leave. The C<Selector> and will be
+smart-matched against the dynamic scope objects from inner to outer.
+The first that matches is the scope that is left.

-To return from other types of code structures, the C<leave> function is used:
+The remainder of the arguments are taken to be a Capture holding the
+return values.

     leave;                      # return from innermost block of any kind
+    leave *;                    # same thing
     leave Method;               # return from innermost calling method
-    leave &?ROUTINE <== 1,2,3;  # Return from current sub. Same as: return 1,2,3
-    leave &foo <== 1,2,3;       # Return from innermost surrounding call to &foo
-    leave Loop where { .label eq 'COUNT' };  # Same as: last COUNT;
+    leave &?ROUTINE, 1,2,3;     # Return from current sub. Same as: return 1,2,3
+    leave &?ROUTINE <== 1,2,3;  # same thing, force return as feed
+    leave &foo, 1,2,3;          # Return from innermost surrounding call to &foo

-Note that the last is equivalent to
+Note that these are equivalent:

+    leave Loop where { .label eq 'COUNT' };
     last COUNT;

 and, in fact, you can return a final loop value that way:

-    last COUNT <== 42;
-
-If supplied, the first argument to C<leave> is a C<Selector>, and will
-be smart-matched against the dynamic scope objects from inner to outer.
-The first that matches is the scope that is left.
+    last COUNT, 42;

 =head2 Temporization

@@ -2384,3 +2391,101 @@
 C<< OUTER::<$varname> >> specifies the C<$varname> declared in the lexical
 scope surrounding the current lexical scope (i.e. the scope in which the
 current block was defined).
+
+=head2 Declaring a C<MAIN> subroutine
+
+Ordinarily a top-level Perl "script" just evaluates its anonymous
+mainline code and exits.
During the mainline code, the program's
+arguments are available in raw form from the C<@ARGS> array. At the end of
+the mainline code, however, a C<MAIN> subroutine will be called with
+whatever command-line arguments remain in C<@ARGS>. This call is
+performed if and only if:
+
+=over
+
+=item a)
+
+the compilation unit was directly invoked rather than
+by being required by another compilation unit, and
+
+=item b)
+
+the compilation unit declares a C<Routine> named "C<MAIN>", and
+
+=item c)
+
+no explicit call to C<MAIN> was performed by the time the mainline code
+finishes.
+
+=back
+
+The command line arguments (or what's left of them after mainline
+processing) is magically converted into a C<Capture> and passed to
+C<MAIN> as its arguments, so switches may be bound as named args and
+other arguments to the program may be bound to positional parameters
+or the slurpy array:
+
+    sub MAIN ($directory, :$verbose, *%other, *@filenames) {
+        for @filenames { ... }
+    }
+
+If C<MAIN> is declared as a method, the name of the invoked
+program is passed as the "invocant". If C<MAIN> is declared as a
+set of multi subs, MMD dispatch is performed.
+
+As with module and class declarations, a sub or method declaration
+ending in semicolon is allowed at the outermost file scope if it is the
+first such declaration, in which case the rest of the file is the body:
+
+    sub MAIN ($directory, :$verbose, *%other, *@filenames);
+    for @filenames { ... }
+
+Proto or multi definitions may not be written in semicolon form,
+nor may C<MAIN> subs within a module or class. (A C<MAIN> routine
+is allowed in a module or class, but is not usually invoked unless
+the file is run directly (see a above). This corresponds to the
+"unless caller" idiom of Perl 5.) In general, you may have only one
+semicolon-style declaration that controls the whole file.
+
+If the dispatch to C<MAIN> fails the C<USAGE> routine is called.
+If there is no such routine, a default message is printed.
This
+usage message is automatically generated from the signature (or
+signatures) of C<MAIN>. This message is generated at compile time,
+and hence is available at "mainline" time as the rw property
+C<&USAGE.text>. You may also access it from your own C<USAGE> routine.
+
+Common Unix command-line conventions are mapped onto the capture
+as follows:
+
+    On command line...          $ARGS capture gets...
+
+    -name                       :name
+    -name=value                 :name<value>
+    -name="spacy value"         :name«'spacy value'»
+    -name='spacy value'         :name«'spacy value'»
+    -name=val1,'val 2', etc     :name«val1 'val 2' etc»
+
+    --name                      :name
+    --name=value                :name<value>
+    --name="spacy value"        :name«'spacy value'»
+    --name='spacy value'        :name«'spacy value'»
+    --name=val1,'val 2', etc    :name«val1 'val 2' etc»
+    --                          # end named argument processing
+
+    +name                       :!name
+    +name=value                 :name<value> but False
+    +name="spacy value"         :name«'spacy value'» but False
+    +name='spacy value'         :name«'spacy value'» but False
+    +name=val1,'val 2', etc     :name«val1 'val 2' etc» but False
+
+    :name                       :name
+    :!name                      :!name    # potential conflict with ! histchar
+    :/name                      :!name    # potential workaround?
+    :name=value                 :name<value>
+    :name="spacy value"         :name«'spacy value'»
+    :name='spacy value'         :name«'spacy value'»
+    :name=val1,'val 2', etc     :name«val1 'val 2' etc»
+
+As usual, switches are assumed to be first, and any switches after C<-->
+are treated as positionals or slurpy. Other policies may be introduced
+by calling C<MAIN> explicitly.
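
Reviewer's sketch (not part of the commit): the switch-mapping table in the last hunk can be prototyped roughly as below. This is a Python illustration, not code from any Perl 6 implementation. The names `Named` and `parse_args` are hypothetical, the pair `(value, truth)` is only one way to model "value but False", and stopping at the first non-switch argument is an assumed reading of "switches are assumed to be first". Comma-separated value lists and shell quoting are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Named:
    """Hypothetical stand-in for a named capture entry.

    value: True/False for a bare switch, or the string after '='.
    truth: False models the table's "but False" (and :!name) forms.
    """
    value: Union[bool, str]
    truth: bool


def parse_args(argv: List[str]) -> Tuple[Dict[str, Named], List[str]]:
    """Split already shell-split arguments into (named, positionals)."""
    named: Dict[str, Named] = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--":                      # end named argument processing
            i += 1
            break
        if arg.startswith("--"):             # --name, --name=value
            body, truth = arg[2:], True
        elif arg.startswith((":!", ":/")):   # :!name, :/name
            body, truth = arg[2:], False
        elif arg[:1] in ("-", ":"):          # -name, :name (and =value forms)
            body, truth = arg[1:], True
        elif arg.startswith("+"):            # +name, +name=value
            body, truth = arg[1:], False
        else:                                # assumption: first non-switch ends switches
            break
        name, sep, value = body.partition("=")
        if not name:                         # a lone "-" or ":" is treated as positional
            break
        named[name] = Named(value if sep else truth, truth)
        i += 1
    return named, argv[i:]


if __name__ == "__main__":
    named, rest = parse_args(
        ["-verbose", "--depth=3", "+color", ":!quiet", "--", "-x", "dir"]
    )
    print(named)
    print(rest)
```

Keeping the truth flag separate from the value leaves `+name=value` distinguishable from `-name=value` without inventing a new string encoding; whether a real binder would expose that distinction is left open by the text.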