Public bug reported: I'm looking at conductor logs generated by a customer running RH OSP 10 (Newton). The logs appear to be corrupt in a manner I'd expect to see if 2 processes were writing to the same log file simultaneously. For example:
=== 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db return self.dbapi.connect(*cargs, **cparams) 2017-09-14 15:54:39.689 120626 ERROR nova.s2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db [-] Unexpected error while reporting service status 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db Traceback (most recent call last): === Notice how a new log starts part way through the second line above. This also results in log entries in the wrong sort order: === 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db return self.dbapi.connect(*cargs, **cparams) 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db File "/usr/lib/python2.7/site-packages/pymysql/__init__.py", line 88, in Connect 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db return Connection(*args, **kwargs) 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 657, in __init__ === Note how the first 2 lines are after the last 2 by timestamp, as presumably the last 2 are a continuation of a previous log entry. This confounds merge sorting of log files, which is exceptionally useful. We also see truncated lines with no timestamp which aren't a continuation of the previous line: === 2017-09-14 15:54:39.690 120607 ERROR nova.servicegroup.drivers.db DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] 2017-09-14 15:54:39.690 120607 ERROR nova.servicegroup.drivers.db elf._execute_and_instances(context) === I strongly suspect this is because multiple conductors are running in separate processes, and are therefore not benefiting from the thread safety of python's logging. ** Affects: nova Importance: Undecided Status: New ** Summary changed: - Apparent lack of locking in logger + Apparent lack of locking in conductor logs -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1718439 Title: Apparent lack of locking in conductor logs Status in OpenStack Compute (nova): New Bug description: I'm looking at conductor logs generated by a customer running RH OSP 10 (Newton). The logs appear to be corrupt in a manner I'd expect to see if 2 processes were writing to the same log file simultaneously. For example: === 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db return self.dbapi.connect(*cargs, **cparams) 2017-09-14 15:54:39.689 120626 ERROR nova.s2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db [-] Unexpected error while reporting service status 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db Traceback (most recent call last): === Notice how a new log starts part way through the second line above. This also results in log entries in the wrong sort order: === 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db return self.dbapi.connect(*cargs, **cparams) 2017-09-14 15:54:39.690 120562 ERROR nova.servicegroup.drivers.db File "/usr/lib/python2.7/site-packages/pymysql/__init__.py", line 88, in Connect 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db return Connection(*args, **kwargs) 2017-09-14 15:54:39.689 120626 ERROR nova.servicegroup.drivers.db File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 657, in __init__ === Note how the first 2 lines are after the last 2 by timestamp, as presumably the last 2 are a continuation of a previous log entry. This confounds merge sorting of log files, which is exceptionally useful. We also see truncated lines with no timestamp which aren't a continuation of the previous line: === 2017-09-14 15:54:39.690 120607 ERROR nova.servicegroup.drivers.db DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] 2017-09-14 15:54:39.690 120607 ERROR nova.servicegroup.drivers.db elf._execute_and_instances(context) === I strongly suspect this is because multiple conductors are running in separate processes, and are therefore not benefiting from the thread safety of python's logging. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1718439/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp