Re: Crashed task is not reaped

2014-08-28 Thread Alex Rukletsov
Hi Brian,

Thanks for the answer. This sounds reasonable. It would be nice to somehow
enforce this crash-if-fail behaviour in client executors, but that seems
barely possible.

Alex



Crashed task is not reaped

2014-08-27 Thread Alex Rukletsov
While playing with Rendler (https://github.com/mesosphere/RENDLER) I
noticed that if the task (read: Python executor) crashes, the underlying
executor stays alive and is therefore not reaped, which leaves the task
running indefinitely. Part of the slave log is here:
https://gist.github.com/rukletsov/4a74743c5b67f304e661 (the exception
itself doesn't matter; it's only there to test the behaviour). I'm not sure
whether this is a bug or a feature, but to me it looks like a bug.

Regards,
Alex


Re: Crashed task is not reaped

2014-08-27 Thread Brian Wickman
A crashed thread does not terminate the Python interpreter, so the
executor here will stay alive.  If you want an abnormal thread exit to
result in an executor termination, you will have to implement that behavior
explicitly.  We use a thread liveness detector that looks something like:
https://gist.github.com/wickman/dc11896d782f9a2160b8
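
To illustrate the first point, here is a minimal sketch (the function names
are made up for this example) showing that an uncaught exception in a worker
thread only prints a traceback to stderr and leaves the interpreter running:

```python
import threading


def crash():
    # Hypothetical worker that fails; the exception stays inside this thread.
    raise RuntimeError("simulated task failure")


def main_survives_thread_crash():
    worker = threading.Thread(target=crash, name="worker")
    worker.start()
    worker.join()  # returns normally; the traceback only goes to stderr
    # Reaching this line shows that the interpreter is still running
    # even though the worker thread died with an exception.
    return not worker.is_alive()
```

Calling main_survives_thread_crash() returns normally in the main thread,
which is why the executor process is never reaped on its own.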

When you create a thread, you call registry.register(thread).  That thread
should call registry.unregister(self) prior to terminating normally.  If it
terminates abnormally, the registry.dead event will be set.  Our MainThread
in practice just does something like:

while not registry.dead.wait(timeout=10):
  pass
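
A registry along these lines could be sketched as follows. This is not the
code from the gist above, only an illustration under assumptions: the class
name ThreadRegistry, the check() method, and the polling start_monitor()
helper are all hypothetical.

```python
import threading
import time


class ThreadRegistry:
    """Hypothetical liveness registry; not the code from the linked gist."""

    def __init__(self):
        self.dead = threading.Event()
        self._lock = threading.Lock()
        self._threads = set()

    def register(self, thread):
        with self._lock:
            self._threads.add(thread)

    def unregister(self, thread):
        # Threads call this just before returning normally.
        with self._lock:
            self._threads.discard(thread)

    def check(self):
        # A thread that was started, is still registered, and is no longer
        # alive must have exited without unregistering, i.e. abnormally.
        with self._lock:
            for thread in self._threads:
                if thread.ident is not None and not thread.is_alive():
                    self.dead.set()
        return self.dead.is_set()

    def start_monitor(self, interval=1.0):
        # Assumed design: a daemon thread polls the registered threads.
        def poll():
            while not self.check():
                time.sleep(interval)

        threading.Thread(target=poll, daemon=True).start()
```

With this sketch, the main thread would call registry.start_monitor() once,
wait on registry.dead, and exit the process after the event is set so that
the slave can reap the executor.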

We also have a library (twitter.common.exceptions on PyPI) that provides a
class called ExceptionalThread which guarantees that sys.excepthook() is
called.  You could implement similar behavior by making all threads
ExceptionalThreads and wrapping sys.excepthook with something that sets
an event to signal MainThread to terminate, as described above.
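
A rough sketch of that second approach, using only the standard library.
HookedThread is a hypothetical stand-in for ExceptionalThread (it is not the
twitter.common.exceptions implementation), and the terminate event and
install_terminating_excepthook() are names invented for this example:

```python
import sys
import threading

# Hypothetical event that MainThread waits on before shutting down.
terminate = threading.Event()


class HookedThread(threading.Thread):
    """Stand-in for ExceptionalThread: forwards uncaught exceptions
    raised in run() to sys.excepthook."""

    def run(self):
        try:
            super().run()
        except Exception:
            sys.excepthook(*sys.exc_info())


def install_terminating_excepthook():
    previous = sys.excepthook

    def hook(exc_type, exc_value, tb):
        previous(exc_type, exc_value, tb)  # still print the traceback
        terminate.set()                    # then tell MainThread to stop

    sys.excepthook = hook
```

MainThread would call install_terminating_excepthook() at startup, run its
workers as HookedThread instances, and exit once terminate is set.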

~brian

