+1. Information about service crashes is very valuable. In general, we need to give our operators as much information as we can about why a service exited (critical error, traceback, unexpected exit).
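
For illustration only, here is a minimal Python sketch of the behavior requested in the quoted thread: an excepthook that always logs the full traceback of an uncaught exception at CRITICAL, regardless of any debug or verbose setting. This is not the code in oslo's log.py or in the linked review; `make_excepthook`, `install`, and the logger name `myservice` are hypothetical names used only for this example.

```python
import logging
import sys


def make_excepthook(logger):
    """Return an excepthook that logs the full traceback at CRITICAL.

    Hypothetical helper for illustration; not part of oslo.
    """
    def _excepthook(exc_type, exc_value, exc_tb):
        # Passing exc_info makes the logging formatter append the complete
        # traceback, whatever debug/verbose options the service was given.
        logger.critical("Unhandled exception, service exiting",
                        exc_info=(exc_type, exc_value, exc_tb))
    return _excepthook


def install(logger_name="myservice"):
    """Replace sys.excepthook with the traceback-preserving hook."""
    logger = logging.getLogger(logger_name)
    sys.excepthook = make_excepthook(logger)
    return logger
```

Calling `install()` once at service startup should be enough for exceptions that reach the top level of the main thread; the operator then sees the traceback next to the CRITICAL message instead of only a one-line summary.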
--Morgan

Morgan Fainberg

From: Jay Pipes jaypi...@gmail.com
Reply: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: May 28, 2014 at 08:50:25
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] Oslo logging eats system level tracebacks by default

On 05/28/2014 11:39 AM, Doug Hellmann wrote:
> On Wed, May 28, 2014 at 10:38 AM, Sean Dague <s...@dague.net> wrote:
>> When attempting to build a new tool for Tempest, I found that my Python
>> syntax errors were being completely eaten. After 2 days of debugging I
>> found that oslo log.py does the following *very unexpected* thing.
>>
>> - replaces the sys.excepthook with its own function
>> - eats the exception traceback unless debug or verbose are set to True
>> - sets debug and verbose to False by default
>> - prints out a completely useless summary log message at Critical
>>   ([CRITICAL] [-] 'id' was my favorite of these)
>>
>> This is basically for an exit level event. Something so breaking that
>> your program just crashed.
>>
>> Note this has nothing to do with preventing stack traces that are
>> currently littering up the logs that happen at many logging levels, it's
>> only about removing the stack trace of a CRITICAL level event that's
>> going to very possibly result in a crashed daemon with no information as
>> to why.
>>
>> So the process of including oslo log makes the code immediately
>> undebuggable unless you change your config file to not the default.
>>
>> Whether or not there was justification for this before, one of the
>> things we heard loud and clear from the operators' meetup was:
>>
>> - Most operators are running at DEBUG level for all their OpenStack
>>   services because you can't actually do problem determination in
>>   OpenStack for anything < that.
>> - Operators reacted negatively to the idea of removing stack traces
>>   from logs, as that's typically the only way to figure out what's going
>>   on. It took a while of back and forth to explain that our initiative to
>>   do that wasn't about removing them per se, but having the code
>>   correctly recover.
>>
>> So the current oslo logging behavior seems inconsistent (we spew
>> exceptions at INFO and WARN levels, and hide all the important stuff
>> with a legitimately uncaught system level crash), undebuggable, and
>> completely against the prevailing wishes of the operator community.
>>
>> I'd like to change that here - https://review.openstack.org/#/c/95860/
>>
>> -Sean
>
> I agree, we should dump as much detail as we can when we encounter an
> unhandled exception that causes an app to die.

+1

-jay

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev