=head1 TITLE

Short-circuiting built-in functions and user-defined subroutines

=head1 VERSION

  Maintainer: Garrett Goebel <[EMAIL PROTECTED]>
  Date: 15 Sep 2000
  Version: 3
  Mailing List: perl6-language
  Number: 199
  Status: Developing

=head1 ABSTRACT

Allow built-in functions and user defined subroutines to catch exceptions
and act upon them, particularly those emanating from control blocks. Put
another way, allow blocks to simultaneously exit early with a value.

=head1 DESCRIPTION

=head2 OVERVIEW

Currently there is not a practical and flexible way to short-circuit
built-in functions like C<grep> and C<map> which take blocks as a parameter.

A prime example that was raised is:

  my $found = grep { $_ == 1 } (1..1_000_000);

Under Perl 5, one would need to write something like:

  eval { grep { $_ ==1 and die "$_\n" } (1..1_000_000) }; 
  chomp(my $found = $@);

Under this current proposal this might be written as:

  my $found = grep { $_ == 1 and return $_ && last grep } (1..1_000_000);

  (or alternatively)

  my $found = grep { $_ == 1 and abort grep $_ } (1..1_000_000);

It is the opinion of the author that while the proposal saves little by
way of typing, that it is minimally an improvement in clarity. Further
this proposal introduces semantics which will make other tasks easier.

=head2 DEFINITION OF THE PROBLEM

=over 4

=item 1

How do you simultaneously exit early and return a value?

  reject the current element and continue
  accept the current element and continue
  reject the current element and abort
  accept the current element and abort
  reject the current element and retry
  accept the current element and retry

=item 2

How do you do that *and* be consistent with existing semantics?

=back

=head2 OPTIONS AND ALTERNATIVES

There has been much discussion and ideas exchanged on the best way to 
achieve this. Discussion ranging from specific solutions which would
only apply to the C<grep> and C<map> built-ins, to more general ones
such as unifying the exception syntax for loop, code, and bare blocks.
There has also been a vocal contingent protesting that all of the
suggestions attempt to shoehorn too much new functionality into
existing semantics and that no one has yet presented any workable
solutions.

The current ideas tend to converge around the following three options:

=over 4

=item 1

Grant code blocks their names as labels, and add new keywords
for code block control: C<continue>/C<abort>/C<retry>:

  # short-circuit after first acceptance...
  $found = grep { ($_ == 1) and abort grep $_ } (1..1_000_000);

  # short-circuit after first rejection...
  $found = grep { ($_ == 1) and abort grep } (1..1_000_000);

=item 2

Grant code blocks their own names as labels, and the ability to catch
next/last/redo exceptions which explicitly use those labels.

  # short-circuit after first acceptance...
  @smalls = grep { ($_ == 1) and return $_ && last grep } (1..1_000_000);

  # short-circuit after first rejection...
  @smalls = grep { ($_ == 1) and last grep } (1..1_000_000);

=item 3

The built-in function's operation should catch exceptions thrown from
the block, where the exception thrown is true or false (alternative: 1
or false). Example:

  # short-circuit after first acceptance...
  $found = grep { ($_ == 1) and die 1 } (1..1_000_000);

  # short-circuit after first rejection...
  $found = grep { ($_ == 1)  or die 0 } (1..1_000_000);

=back

=head2 INDECISIONS: BY THE PRICKINGS OF MY THUMB...

Detractors would argue:

=over 4

=item 1

First Proposal: Add code labels and new keywords:
C<continue>/C<abort>/C<retry>

=over 4

=item *

Aversion to taking any more new keywords

=back

=item 2

Second Proposal: Add code labels, and let code blocks use
C<next>/C<last>/C<redo> syntax which explicitly use those labels.

=over 4

=item *

Too complicated, changes too much, attempts to shoehorn to too many
semantic changes into the existing syntax/keywords

=back

=item 3

Third Proposal: Extend use of C<die>

=over 4

=item *

Built-ins shouldn't catch user-defined exceptions. How do you raise
a serious exception within a control block if you weaken C<die> into 
something that can be automatically caught by built-ins? I.e., no good
support for multi-level exceptions.

=item *

Doesn't provide a very intuitive syntax for doing the same type of
C<next>/C<last>/C<redo> control as is available with loop blocks.

=back

=back

The author shares much of detractors opinions for the third proposal, and
would prefer either the first or second to be implemented. The reason I
prefer
one of these two is that I believe they are more intuitive, flexible, and 
semantically rich. As such, the remainder of this RFC concern itself with
issues pertaining to the first two proposals.

=head2 FINER DETAILS AND OTHER ISSUES

=head3 Issues common to the first and second proposals:

=over 4

=item *

Built-ins and Subroutines automatically get their name as a label

It might therefore follow that it would be good to define C<goto foo> to
have the same semantics as C<goto &foo> when the label referred to is a
function. Namely to pass @_. It would then be possible if desired to
deprecate C<goto &foo>.

=item *

To support multi-level escaping, explicit named or numeric labels may
be used. Where numeric labels specify the level of nesting to be
escaped similarly to C<caller>. 0 refers to the current code block,
1 the caller's, and so forth. Otherwise C<undef> may also be used to refer
to the current code block.

Regardless of whether or not this proposal is implemented, it might be
nice to add the same numeric label semantics when labels are used with loop
control statements.

Examples:

  sub func1 { @_ ? &func2 : abort undef, 0 }
  sub func2 { $_[0] eq undef ? abort func1, 0 : &func3 }
  sub func3 { $_[0] eq 'foo' ? abort 2, 0 : 1 }

=item *

The idea of being able to both short-circuit a block and return a value
may create confusion as to the context in which the value is being returned.
I've slightly modified one of John Porters examples which illustrate the
various blocks and their controls:

  sub foo {               # Code Block  
    eval {                # Bare Block
      for (...) {         # Loop Block
  
        last && return 1; # Note: all
        die && return 1;  # of these go
        abort 1, 1        # to different
        return 1;         # places
  
      }
    };
  }

This example doesn't confuse the author. But that isn't to say that others
won't find it confusing.

=back

=head3 Issues specific to adding code block control statements

=over 4

=item *

When using code block control statements, labels are required but return
values are not.  This is necessary to support optionally passing a return
value.

=back

=head3 Issues specific to adding support for loop control syntax to code
control

=over 4

=item *

Let C<return $value && next LABEL> be used to support both short-circuiting
and returning a value

There was some consensus that C<return> is a good fit for returning a value
from a control block. But it doesn't support multi-level escaping. Therefore
the use of loop control syntax is tacked on. The question becomes if both 
C<return> and C<last> are both exceptions which exit the block early, how
do you catch them both and handle them appropriately? Is C<&&> sufficient
to the cause? From the author's perspective it looks fine and dandy. But I
have little or no idea how this would work in regard to Perl internals.

=item *

Need to stop calling them loop control statements and start referring to
them as short-circuit or statements commonly used with loop control

=item *

Optionally, we might also consider allowing people to both raise an
exception with C<die> && C<return> $value

=back

=head2 BACKWARD COMPATIBILITY ISSUES

If loop control syntax is grafted on for use with code blocks, then
a Perl 5 -> 6 translator would need to explicitly label all loop
blocks and loop control statements.

=head1 IMPLEMENTATION

I haven't a clue. Please feel free to suggest one for the next revision.

=head1 REFERENCES

Too many contributors to note them all down. The content of this RFC owes
heavily to list discussions. The author has simply tried to pick and choose
the best and most easily grasped ideas and analysesfrom the list.

=over 4

=item RFC's that require similar functionality

 RFC 76:  Builtin: reduce

=item Ideas that offer alternative routes to similar functionality:

 RFC 22:  Builtin switch statement
 RFC 31:  Co-routines
 RFC 123: Builtin: lazy 
 RFC 225: Data: Superpositions

=back

Reply via email to