Hi
I'm not sure how to write this email, but please let me try. I'd like to
address something that I see as a limitation in Twisted. It might be
that my use case is odd or that I'm outside the scope of Twisted, but
nonetheless, I hope this is a relevant topic.
Problem:
Unhandled exceptions can leave the application in a half-working state,
and they are difficult to observe from within the app. Instead of the
whole application terminating, the rest of the app keeps running and
can be completely unaware of the failure.
This applies to unhandled errbacks in Deferreds and in principle to any
other reactor callback. E.g. it can occur in Deferreds being used
internally in Twisted, where direct access to the object isn't available
to the caller.
As a user of Twisted, I would like to have the option to catch or fail
my application completely when these unhandled exceptions occur, as
would be expected in a sequential program.
Background:
I have a larger application using many simultaneous TCP, UDP and UNIX
connections. As with Twisted, the app is grouped into functions, where
most of the heavy lifting is done in black-box-ish modules. There is, of
course, no guarantee that everything works smoothly, and if something
fails, the entire application stops as a clear indication of the
failure. However, on some occasions this application has been
found half-dead, due to a failure occurring in a reactor-based
callback that can only be seen by reading the logs. The main application
is unfortunately unaware of its own failure.
AFAIK Twisted has no direct mechanism for handling errors that might
occur when user code is called from the reactor. Even worse, the
caller does not learn that a failure occurred unless it has
direct access to the failing object. I believe this is more dangerous to
reliability than an application that simply fails, because the failure is
harder to observe.
Let's say the following code is used in a running application:

    from twisted.internet.task import LoopingCall

    class Foo:
        def __init__(self):
            self.loop = LoopingCall(self.cb)
            self.loop.start(2, False)

        def cb(self):
            # Bug: self.count is never initialised, so this raises AttributeError
            self.count += 1

    # Main app does this:
    try:
        foo = Foo()
    except:
        print("Won't happen")
        raise

The code will fail due to the programming error in cb, but the calling
application won't fail and thinks everything is fine. The only way to
debug errors like this is to look through the logs.
The 0-solution:
Everywhere a function is called from the reactor, the user is
responsible for handling all exceptions, as is currently the case.
However, this is not completely straightforward. try/except is great
for catching expected errors, but it's easy to forget or ignore the
unexpected ones, as in the example above. The safeguard for this would
be something like:

    def cb(self):
        try:
            self.count += 1
        except:
            print("Whoops. Unexpected")
            signal_main_app()

And in a large application, there are many entry points (e.g. methods in
a protocol handler), so the code becomes very cluttered. Plus, it puts the
responsibility on the user to implement the signal_main_app() framework.
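
To illustrate how the clutter might be reduced somewhat, here is a minimal sketch of a decorator that wraps an entry point and reports unexpected exceptions. signal_main_app() is the same hypothetical hook as above, and report_unexpected and the failures list are names made up for this illustration; nothing here is part of Twisted.

```python
import functools
import sys

# Hypothetical record of failures that the main application can inspect.
failures = []


def signal_main_app(exc_info):
    """Hypothetical hook: tell the main application that something broke."""
    failures.append(exc_info)


def report_unexpected(func):
    """Wrap an entry point so unexpected exceptions reach signal_main_app()."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            signal_main_app(sys.exc_info())
            raise  # re-raise so the caller still sees the exception
    return wrapper


class Foo(object):
    @report_unexpected
    def cb(self):
        self.count += 1  # AttributeError: count was never initialised
```

Every entry point still has to be decorated by hand, so this only shortens the boilerplate. It does not remove the need to remember it.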
Proposal:
The ideal solution would be a way to configure Twisted to report
unhandled exceptions. It could be an addSystemEventTrigger(), a
software signal, a process signal, or perhaps a global
execute-last-errback function, possibly in a debug context.
With this, one could inform the application that a Deferred has not
handled its errbacks. The main application then gets the choice to
respond appropriately, for example by shutting down.
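
To make the idea concrete, here is a toy sketch of what such a hook could look like from the application's side. It doesn't use Twisted at all: run_callback() stands in for the reactor calling user code, and set_unhandled_error_hook() and MainApp are names I made up, not an existing API.

```python
import sys

_unhandled_hook = None  # hypothetical global hook, not a Twisted API


def set_unhandled_error_hook(hook):
    """Register a callable that receives (exc_type, exc_value, traceback)."""
    global _unhandled_hook
    _unhandled_hook = hook


def run_callback(func, *args):
    """Stand-in for the reactor calling user code."""
    try:
        func(*args)
    except Exception:
        if _unhandled_hook is None:
            return  # no hook registered: this toy simply drops the error
        _unhandled_hook(*sys.exc_info())


class MainApp(object):
    def __init__(self):
        self.running = True
        self.last_error = None

    def on_unhandled(self, exc_type, exc_value, tb):
        # The application decides how to respond; here it shuts down.
        self.last_error = exc_value
        self.running = False


def broken_callback():
    raise RuntimeError("unexpected")


app = MainApp()
set_unhandled_error_hook(app.on_unhandled)
run_callback(broken_callback)
```

After the last line, app.running is False and app.last_error holds the RuntimeError, so the main application knows about the failure without reading any logs.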
Is my concern about the non-observability of unhandled exceptions at all
warranted? Is my thinking wrong? Are there any other types of solutions
to this problem? (I would like to avoid having to patch Twisted to do it.)
Best regards,
Svein
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python