On Apr 8, 2006, at 19:49, Bob Rogers wrote:
  . . .

   =item *

C<push_eh> creates an exception handler and pushes it onto the control stack. It takes a label (the location of the exception handler) as its only argument. [Is this right? Treating exception handlers as label jumps rather than full subroutines is error-prone.]

They are not "jumps" but continuations, so in a sense they are more
general than subs, which don't have prior state.

Right, a continuation taken on the address of a label in the current compilation unit.

HLL exception handlers on the other hand, are likely to be written as independent subroutines, much like the current signal handlers in Perl 5. An exception handler is closer to an event handler than it is to a return continuation. (The design choice is between having exception handlers that are complete compilation units, or just code segments. Both are valid options. And it may be that we want to support both.)

The "error-prone" comment has to do with control flow. The effect of the current implementation is that when the interpreter catches an exception, it dumps control flow at the label that was captured in the continuation. Any control flow after that is the responsibility of the developer, and it's easy to get it wrong.

It might be more helpful if the continuation taken was a return continuation: where to return to if an exception is caught and successfully handled.


   =item *

C<pushaction> pushes a subroutine object onto the control stack. If the control stack is unwound due to an exception (or C<popmark>, or subroutine return), the subroutine is invoked with an integer argument: C<0> means a normal return; C<1> means an exception has been raised. [Seems like there's lots of room for dangerous collisions here.]

I'm not sure what you mean by "collisions" here, nor why you think they
would be dangerous.

Specifically, because the control stack is used for multiple different things, it's easy to get into a situation where the thing you're popping off the stack isn't what you meant to pop off the stack. It's one of the reasons we aren't using stack-based control flow through most of Parrot.
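
A toy sketch of that kind of collision, in Python rather than PASM. The function names mirror the opcodes discussed here, but the stack model is an assumption made for illustration and is not how Parrot implements the control stack:

```python
# Toy model only: one stack shared by exception handlers and return addresses.
control_stack = []

def push_eh(label):
    # Stands in for installing an exception handler at a label.
    control_stack.append(("handler", label))

def bsr(return_address):
    # Stands in for a subroutine call that records where to return to.
    control_stack.append(("return", return_address))

def ret():
    # Pops whatever is on top, on the assumption that it is a return address.
    return control_stack.pop()

bsr("after_call")      # enter a subroutine
push_eh("on_error")    # install a handler inside it
# ... the handler is never removed before returning ...
popped = ret()         # pops the handler entry, not the return address
```

Here `popped` is the handler entry, and the real return address is still sitting on the stack.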

Arguably, C<pushaction> is too simplistic; it
doesn't provide for such things as the repeated exit-and-reenter
behavior of coroutines, and there is no mechanism to specify a thunk
that gets called when *entering* a dynamic context . . .

That too.

   =back

   =head1 IMPLEMENTATION

   [I'm not convinced the control stack is the right way to handle
exceptions. Most of Parrot is based on the continuation-passing style of
   control, shouldn't exceptions be based on it too? See bug #38850.]

Seems to me there isn't any real choice. Exception handlers are part of
the dynamic context, and dynamic contexts nest in such a way as to
behave like a stack.  Even pure CPS implementations that want to
maintain dynamic state have to create an explicit stack in a global
variable somewhere.

"dynamic contexts nest in such a way as to behave like a stack" is true, but not necessarily the same thing as storing all exception handlers on a single global stack that's also used for primitive control flow.

Let's take the example of something that recently came up: asynchronous I/O with exceptions. The current implementation says: push a global exception handler onto the stack, call the routine that might throw an exception, then pop the exception handler off the stack. But with asynchronous I/O, the exception handler is likely to be popped off the stack long before the async call throws an exception. Or, if you delay popping off the exception handler until the async callback is called, then you may have other exception handlers pushed onto the stack in the meantime (possibly exception handlers for other async calls).
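
The first failure mode can be sketched roughly as follows, in Python. Everything here (`handler_stack`, `pending`, `start_async_read`) is a hypothetical model of a global handler stack and a queue of async completions, not Parrot's API:

```python
handler_stack = []   # hypothetical global stack of exception handlers
pending = []         # hypothetical queue of async completions
log = []

def start_async_read(name):
    # The operation only fails later, when its completion callback runs.
    def complete():
        if handler_stack:
            log.append(f"{name} failure handled by {handler_stack[-1]}")
        else:
            log.append(f"{name} failure unhandled")
    pending.append(complete)

handler_stack.append("handler_A")
start_async_read("read_A")
handler_stack.pop()                 # the caller pops its handler right away

handler_stack.append("handler_B")   # unrelated code installs its own handler
for complete in pending:            # the async failure arrives only now
    complete()
handler_stack.pop()
```

By the time the failure arrives, the handler that was meant for it is gone, and whichever handler happens to be on top of the stack receives it instead.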

In theory, the return continuation maintains the state of the caller's control stack, so you can invoke return continuations up the CPS chain until you reach a dynamic context where the exception is handled. But where does control flow go after you handle an exception from an async op?

Other opcodes respond to an C<errorson> setting to decide whether to throw an exception or return an error value. C<find_global> throws an exception (or returns a Null PMC) if the global name requested doesn't exist. C<find_name> throws an exception (or returns a Null PMC) if the name requested doesn't exist in a lexical, current, global, or built-in namespace.

It's a little odd that so few opcodes throw exceptions (these are the ones that are documented, but a few others throw exceptions internally even though they aren't documented as doing so). It's worth considering either expanding the use of exceptions consistently throughout the opcode set, or eliminating exceptions from the opcode set entirely. The strategy for error handling should be consistent, whatever it is. [I like the way C<LexPad>s and the C<errorson> settings provide the option for exception-based or non-exception-based implementations, rather than forcing one or the other.]

This have-your-cake-and-eat-it-too (HYCAEIT?) strategy sounds good in
theory, but may be dangerous in practice. Which style of error handling
a given piece of code uses is a static property of the way the code is
written. On the other hand, C<errorson> is dynamic and global. If one
of the modules you use wants to do error handling by checking return
values, but another module doesn't check returns because it expects
errors to be signalled, then no C<errorson> setting will satisfy both,
regardless of how you want to design *your* code.

Maybe we need a non-global equivalent of these options.
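
To make the conflict concrete, here is a small Python sketch. The `settings` dict stands in for a global C<errorson>-style flag, and `find_global`, `module_a`, and `module_b` are hypothetical stand-ins, not Parrot code:

```python
settings = {"errors_on": False}   # hypothetical stand-in for a global errorson flag

class LookupFailed(Exception):
    pass

def find_global(table, name):
    if name in table:
        return table[name]
    if settings["errors_on"]:
        raise LookupFailed(name)
    return None                   # stands in for returning a Null PMC

def module_a(table):
    # Written in return-value style: checks the result, never catches.
    value = find_global(table, "x")
    return "a: missing" if value is None else "a: " + value

def module_b(table):
    # Written in exception style: catches, never checks the result.
    try:
        return "b: " + find_global(table, "x").upper()
    except LookupFailed:
        return "b: missing"

def run(flag):
    # Run both modules under one global setting and record what happens.
    settings["errors_on"] = flag
    results = []
    for module in (module_a, module_b):
        try:
            results.append(module({}))
        except Exception as exc:
            results.append(type(exc).__name__)
    return results
```

With the flag off, `module_b` fails on the unchecked null value; with it on, `module_a` is hit by an exception it never expected. Neither setting lets both modules behave as written.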

   I personally prefer exception-based error handling, since it scales
better.  I have been acting on this when the opportunity arises,
changing internal_exception calls to real_exception when it makes sense,
and when I'm mucking around in that code anyway.  (A good example of
this is "No exception to pop", come to think of it.) It is also helpful
to get a backtrace when something fails.

Backtracing can be enabled without exceptions.

   =head2 Excerpt

   [Excerpt from "Perl 6 and Parrot Essentials" to seed discussion.
Out-of-date in some ways, and in others it was simply speculative.]

For everything below this point, keep in mind that the text was written in 2004.

Exceptions provide a way of calling a piece of code outside the normal flow of control. They are mainly used for error reporting or cleanup tasks, but sometimes exceptions are just a funny way to branch from one code location to another one.

Exceptions are objects that hold all the information needed to handle the exception: the error message, the severity and type of the error, etc. The class of an exception object indicates the kind of exception it is.

Exception handlers are derived from continuations. They are ordinary subroutines that follow the Parrot calling conventions, but are never explicitly called from within user code.

Not quite true; a Continuation is not a Sub, though it can be invoked
like one.

This is one of the "out-of-date" bits.

   thrown. The handler has to examine the exception object and decide
   whether it can handle it (or discard it) or whether it should
   C<rethrow> the exception to pass it along to an exception handler
deeper in the stack. The C<rethrow> opcode is only valid in exception handlers. It pushes the exception object back onto the control stack so Parrot knows to search for the next exception handler in the stack. The

This is not correct; exception objects are never pushed onto the control stack. And the exception handler itself is popped off the control stack
before it is invoked.

Another out-of-date bit. It was one way we considered implementing it (and still worth keeping in mind).

process continues until some exception handler deals with the exception and returns normally, or until there are no more exception handlers on the control stack. When the system finds no installed exception handlers
   it defaults to a final action, which normally means it prints an
   appropriate message and terminates the program.

Currently it also prints a backtrace, which is really nice.  Alas, the
backtrace is only from the point of the final rethrow by the oldest
(bottommost) exception handler. This is the greatest weakness with the current Parrot exception-handling design: By the time you find out that a given exception is unhandled, the dynamic environment of the C<throw>
has been destroyed by the very process of searching for a willing
handler. This makes it extremely difficult to write a debugger that can
do anything useful about uncaught exceptions.

Exception handler tracing is a useful feature, and is worth adding if it doesn't cost too much (in terms of implementation complexity, execution speed, etc.).
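
As a rough illustration of the problem being described, the Python sketch below contrasts a search that pops each handler before asking it with one that looks for a willing handler first and unwinds only on success. Both functions are hypothetical models; the second is just one possible alternative, not something the current implementation does:

```python
def throw_unwinding(handlers, exc):
    # Pops each handler before invoking it, so declined handlers are lost.
    while handlers:
        handler = handlers.pop()
        if handler(exc):
            return "handled", list(handlers)
    return "unhandled", list(handlers)

def throw_searching(handlers, exc):
    # Searches from the top without popping; unwinds only if a handler accepts.
    for index in range(len(handlers) - 1, -1, -1):
        if handlers[index](exc):
            del handlers[index:]
            return "handled", list(handlers)
    return "unhandled", list(handlers)

def decline(exc):
    # A handler that never accepts the exception.
    return False

result_a = throw_unwinding([decline, decline], "boom")
result_b = throw_searching([decline, decline], "boom")
```

In the first version, nothing is left on the stack once the exception turns out to be unhandled. In the second, the handlers that were installed at the point of the throw are still there for a debugger or backtrace to inspect.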

   When the system installs an exception handler, it creates a return
continuation with a snapshot of the current interpreter context. If

This is confusing; I assume you are talking about the Exception_Handler
itself and not a RetContinuation.

In this context, no. It really meant a return continuation.

   the exception handler just returns (that is, if the exception is
   cleanly caught) the return continuation restores the control stack
back to its state when the exception handler was called, cleaning up
   the exception handler and any other changes that were made in the
   process of handling the exception.

Hmm.  It seems that an exception is "cleanly caught" only if it is not
rethrown.  It is therefore not possible to tell by looking at the
exception itself whether or not it is "cleanly caught" or if it is still
in the process of being signalled.

For the most part, exceptions are likely to be discarded soon after they're caught (and garbage collected at some point after that). But, marking exceptions as "caught" may be a cheap way of tracking the history of how a particular exception was handled. And if we do decide to have resumable exceptions, that sort of information may be immediately useful.

Exceptions thrown by standard Parrot opcodes (like the one thrown by C<find_global> above or by the C<throw> opcode) are always resumable, so when the exception handler function returns normally it continues
   execution at the opcode immediately after the one that threw the
exception. Other exceptions at the run-loop level are also generally
   resumable.

You seem to want to say that unhandled exceptions are ignored. Is that
correct?  If so, I see several problems:

   1.  What is "the exception handler function" and how is it
distinguished from the function that established the exception handler?
[It sounds like you are expecting the exception handler to behave more
like a closure than a continuation . . . ]

An "exception handler function" would be an exception handler that is a complete compilation unit rather than just a code segment inside some other compilation unit.

   2.  The previous paragraph says that if "the exception handler just
returns", that means that "the exception is cleanly caught". Unless you
want to propose a new mechanism, the only way a handler can decline to
handle an exception is by rethrowing it, which precludes the possibility
of resuming.

The current prototype implementation doesn't support resumable exceptions, it's true. But, resumable exceptions are a useful feature, and one that we originally planned for Parrot. Before we throw out the baby with the bath water, we need to first look at what it will take to build in resumable exceptions. It's possible that an architecture that supports resumable exceptions may be a better architecture overall.

   3.  Shouldn't unhandled exceptions either enter the debugger if
interactive, else die?  Ignoring the fact that an opcode failed, like
ignoring the fact that anything else failed, seems dangerous . . .

     new P10, Exception            # create new Exception object
     set P10["_message"], "I die"  # set message attribute
     throw P10                     # throw it

There are different levels of severity in exceptions. Some are necessarily fatal. Some aren't. For example, some languages treat the "end of file" condition as a non-fatal exception.

Exceptions are designed to work with the Parrot calling conventions. Since the return addresses of C<bsr> subroutine calls and exception handlers are both pushed onto the control stack, it's generally a bad idea to combine the two.

How about replacing this with the following:

   . . . exception
   handlers are both pushed onto the control stack, care must be taken
   to nest them properly, i.e. by removing error handlers established
   after C<bsr> before the corresponding C<ret>.

After all, it works as long as the user plays by the rules.

We can define any set of rules for exceptions (or calling conventions, or any other Parrot subsystem) and expect users to follow them, but some sets of rules are more prone to user error than others. Our job as designers and implementors is to examine the options and choose the set of rules that is most stable, robust, maintainable, and (as much as possible) user-friendly.

Allison
