We've just rolled out a new version of our process manager, with streamlined handling of crashed processes. A lot of you guys have been frustrated when your workers crash (libXML segfaults, for example) because the restart behavior was somewhat unpredictable.
The new version follows a very simple algorithm: any process that crashes will be restarted immediately, with a cooldown time of ten minutes. This applies to all process types including web processes (dynos). We've also unified the behavior of crashed dynos with crashed workers. Now if you push up a bad deploy (missing a gem, etc) you won't be shown the stack trace in the browser, and instead can find the stack trace and debugging info by running "heroku logs". This makes the workflow for handling a crashed process the same as handling an exception during the request cycle (HTTP 500): your users are displayed a generic page, and you the developer can look up the problem in your logs. A description of the new crash policy, and some documentation for the "heroku ps" command (great for introspecting on the state of your processes) can be found here: http://docs.heroku.com/ps We welcome your feedback. Please reply (here on the mailing list, or to me privately) if you have any thoughts, especially if you've seen it in action on your own crashed processes. -- You received this message because you are subscribed to the Google Groups "Heroku" group. To post to this group, send email to her...@googlegroups.com. To unsubscribe from this group, send email to heroku+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/heroku?hl=en.