Re: Null references redux

Jeremie Pelletier Tue, 29 Sep 2009 12:05:10 -0700

Sean Kelly wrote:

== Quote from Jeremie Pelletier (jerem...@gmail.com)'s article

Andrei Alexandrescu wrote:

Jeremie Pelletier wrote:

Is this Linux specific? What about other *nix systems, like BSD and
Solaris?

Signal handlers are standard on most *nix platforms since they're part
of the POSIX C standard libraries. Some platforms may require special
handling, but nothing impossible to do.

Let me write a message on behalf of Sean Kelly. He wrote this to Walter
and myself this morning, and I suggested he post it, but he is probably
off email for a short while. Hopefully the community will find a
solution to the issue he's raising. Let me post this:


===================
Sean Kelly wrote:

There's one minor problem with his code.  It's not safe to throw an
exception from a signal handler.  Here's a quote from the POSIX spec at
opengroup.org:

"In order to prevent errors arising from interrupting non-reentrant
function calls, applications should protect calls to these functions
either by blocking the appropriate signals or through the use of some
programmatic semaphore (see semget() , sem_init() , sem_open() , and so
on). Note in particular that even the "safe" functions may modify errno;
the signal-catching function, if not executing as an independent thread,
may want to save and restore its value. Naturally, the same principles
apply to the reentrancy of application routines and asynchronous data
access. Note that longjmp() and siglongjmp() are not in the list of
reentrant functions. This is because the code executing after longjmp()
and siglongjmp() can call any unsafe functions with the same danger as
calling those unsafe functions directly from the signal handler.
Applications that use longjmp() and siglongjmp() from within signal
handlers require rigorous protection in order to be portable."

If this were an acceptable approach it would have been in druntime ages
ago :-)
===================
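
For illustration, here is a minimal sketch of the conservative pattern the POSIX quote points toward: the handler saves and restores errno, calls only an async-signal-safe function (write()), and sets a flag instead of throwing or jumping. The names on_signal, got_signal and run_flag_demo are hypothetical, and SIGUSR1 stands in for a real fault so the example can return normally.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The only state shared between the handler and normal code. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    int saved_errno = errno;          /* write() below may change errno */
    static const char msg[] = "signal caught\n";

    (void)signo;
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1); /* async-signal-safe */
    (void)n;
    got_signal = 1;
    errno = saved_errno;              /* restore it for the interrupted code */
}

/* Hypothetical helper: installs the handler, raises SIGUSR1 and checks
   that the flag was set and errno survived. Returns 1 on success. */
int run_flag_demo(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;

    got_signal = 0;
    errno = ENOENT;
    raise(SIGUSR1);                   /* handler runs before raise() returns */
    assert(got_signal == 1);
    return errno == ENOENT;
}
```

Calling run_flag_demo() from main() should return 1. This pattern only fits signals the program can continue from; a handler for SIGSEGV cannot simply return this way, because the faulting instruction would be retried.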

Yes, but the segfault signal handler is not meant for designing code
that can live with these exceptions; it's just a feature that lets
segfaults be sent to the crash handler to get a backtrace dump. Even on
Windows, while you can recover from access violations, it's generally a
bad idea to allow bugs to be turned into features.


I don't think it's fair to compare Windows to Unix here because, as far as
I know, Windows (i.e. Win32, etc.) was built with exceptions in mind (thanks to
SEH), while Unix was not.  So while the Windows kernel may theoretically be fine
with an exception being thrown from within kernel code, this isn't true of Unix.

It's true that as long as only Errors are thrown (and thus that the app intends
to terminate), things aren't as bad as they could be.  Worst case, some mutex
in libc is left locked or in some weird state and code executed during stack
unwinding or when trying to report the error causes the app to hang instead
of terminate.  And this risk is somewhat mitigated because I'd expect most
of these errors to occur within user code anyway.

One thing I'm not entirely sure about is whether the signal handler will always
have a valid, C-style call stack tracing back into user code.  These errors are
triggered by hardware, and I really don't know what kind of tricks are common
at that level of OS code.  longjmp() doesn't have this problem because it
doesn't care about the call stack--it just swaps some registers and executes
a JMP.  I
don't suppose anyone here knows more about the feasibility of throwing
exceptions from signal handlers at all?  I'll ask around some OS groups and
see what people say.
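
As a point of comparison, this is a minimal sketch of leaving a signal handler with siglongjmp() rather than an exception. The names jump_out, recovery_point and run_jump_demo are hypothetical, and SIGUSR1 is used instead of a real hardware fault so the behaviour is predictable. The POSIX caveat quoted earlier still applies: code running after the jump is effectively still running in signal context.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recovery_point;
static volatile sig_atomic_t caught_signo = 0;

static void jump_out(int signo)
{
    caught_signo = signo;
    /* No stack unwinding happens here: the saved registers and signal
       mask are restored and control lands back at sigsetjmp(). */
    siglongjmp(recovery_point, 1);
}

/* Hypothetical helper: returns the signal number escaped via
   siglongjmp(), 0 if no jump happened, or -1 on setup failure. */
int run_jump_demo(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = jump_out;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    caught_signo = 0;
    /* Non-zero second argument: save the signal mask so that
       siglongjmp() restores it and SIGUSR1 does not stay blocked. */
    if (sigsetjmp(recovery_point, 1) != 0)
        return caught_signo;          /* reached via the handler */

    raise(SIGUSR1);
    return 0;                         /* not reached if the handler jumped */
}
```

Calling run_jump_demo() should return SIGUSR1. Nothing between the handler and the sigsetjmp() call is cleaned up, which is the difference from an exception that has to walk the call stack.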

I haven't had any problems so far, the stack trace generated was always
valid and similar to what gdb would output. But I agree that trying to
recover from these exceptions is a *bad* idea in so many ways.

From what I know, the kernel alters the stack frame of the signal
handler to make us believe we called it ourselves. Returning from the
signal handler therefore jumps to the routine from which the signal was
originally raised, without the kernel being aware of it.
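
A small sketch of that return path, under the assumption of a POSIX system with SA_SIGINFO support: the handler receives a pointer to the interrupted context as its third argument, and simply returning from it resumes the code that was running. The names on_usr1 and run_resume_demo are hypothetical, and the saved context is only checked for presence here because its register layout is platform-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs = 0;
static volatile sig_atomic_t had_context = 0;

static void on_usr1(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)info;
    /* context points at a ucontext_t describing the interrupted code. */
    had_context = (context != NULL);
    handler_runs++;
}   /* returning from here resumes the interrupted code */

/* Hypothetical helper: returns the number of steps executed around the
   signal (2 expected), or -1 if something went wrong. */
int run_resume_demo(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;         /* request the three-argument form */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handler_runs = 0;
    int steps = 0;
    steps++;
    raise(SIGUSR1);                   /* handler runs here */
    steps++;                          /* runs only because the handler returned */
    return (handler_runs == 1 && had_context) ? steps : -1;
}
```

Calling run_resume_demo() should return 2 on a typical Linux or BSD system.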

This is a bit different from how SEH is handled, but has a lot in common
with it:

From the research I did about SEH internals, it's just built on top of
interrupt handlers. The hardware raises an exception (access violation,
etc.) and jumps into a kernel handler for the corresponding interrupt.
That handler then looks up, at the base of the stack, a pointer to a
struct containing a handler function and a handler table, which is set
and restored by try blocks, and calls the exception handler
(_d_framehandler in our case) with the appropriate parameters. From
there the kernel decides what to do based on the return code of the
framehandler.

The signal handler model is therefore quite acceptable to build
exception handling on top of. We just may want to also manually generate
a core dump before throwing the exception to support postmortem debugging.
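
One way to get such a core dump without killing the process is to fork and let the child abort. This is only a sketch under stated assumptions: dump_core_snapshot is a hypothetical helper name, whether a core file is actually written depends on the system's core size limit and core file settings, and the child contains only the thread that called fork().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that aborts, so the kernel can dump
   a copy of this process's memory while the parent keeps running.
   Returns the signal that terminated the child (SIGABRT expected),
   or -1 on failure. */
int dump_core_snapshot(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure SIGABRT has its default, core-dumping action. */
        signal(SIGABRT, SIG_DFL);
        abort();
    }

    /* Parent: reap the child, then carry on (for example, throw). */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling dump_core_snapshot() should return SIGABRT. The idea would be to call it just before throwing, so the dump reflects the state at the time of the fault.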
