Reviewed:  https://review.opendev.org/c/openstack/nova/+/874664
Committed: https://opendev.org/openstack/nova/commit/84d1f25446731e4e51beb83a017cdf7bfda8c5d5
Submitter: "Zuul (22348)"
Branch:    master

commit 84d1f25446731e4e51beb83a017cdf7bfda8c5d5
Author: Dan Smith <dansm...@redhat.com>
Date:   Tue Feb 21 08:43:13 2023 -0800

    Use mysql memory reduction flags for ceph job

    This makes the ceph-multistore job use the MYSQL_REDUCE_MEMORY flag in
    devstack to try to address the frequent OOMs we see in that job.

    Change-Id: Ibc203bd10dcb530027c2c9f58eb840ccc088280d
    Closes-Bug: #1961068

** Changed in: nova
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1961068

Title:
  nova-ceph-multistore job fails with mysqld got oom-killed

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Searching through the jobs showed that the nova-ceph-multistore job
  fails from time to time with a DB crash caused by an out-of-memory
  error.

  In the tempest errors the following message can be seen:

  tempest.lib.exceptions.ServerFault: Got server fault
  Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
  <class 'oslo_db.exception.DBConnectionError'>

  In the mysqld error log (controller/logs/mysql/error_log.txt) the
  crash recovery is visible:

  2022-02-15T19:26:40.245179Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
  2022-02-15T19:26:40.268204Z 0 [System] [MY-010232] [Server] XA crash recovery finished.

  Around that time, the out-of-memory kill can be seen in syslog
  (controller/logs/syslog.txt):

  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mysql.service,task=mysqld,pid=67959,uid=116
  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: Out of memory: Killed process 67959 (mysqld) total-vm:5127600kB, anon-rss:756064kB, file-rss:0kB, shmem-rss:0kB, UID:116 pgtables:2388kB oom_score_adj:0
  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom_reaper: reaped process 67959 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

  The error only occurs in the nova-ceph-multistore job (see recent
  occurrences via logsearch:
  https://paste.opendev.org/show/bQNKfoaMafUyNFCyQ0kN/ ).

  It mostly happens on the current master branch (yoga), but an example
  was found on wallaby as well:
  https://zuul.opendev.org/t/openstack/build/d8a6a9c1496346dda6986db00c06a616
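
For anyone triaging similar failures, the syslog check described in the bug report could be automated roughly as follows. This is only a sketch: the function name `find_oom_kills` and the regular expression are illustrative assumptions, not part of nova, devstack, or the fix above, and the pattern is written against the kernel line format quoted in this report, which may differ on other kernel versions.

```python
import re

# Assumed pattern, modelled on the "Out of memory: Killed process" line
# quoted in the bug description. Other kernels may format it differently.
OOM_KILL_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r" total-vm:(?P<total_vm_kb>\d+)kB,"
    r" anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines):
    """Return one dict per OOM-kill summary line found in `lines`.

    `lines` is any iterable of syslog lines (hypothetical helper, for
    illustration only).
    """
    kills = []
    for line in lines:
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append({
                "pid": int(match["pid"]),
                "name": match["name"],
                "total_vm_kb": int(match["total_vm_kb"]),
                "anon_rss_kb": int(match["anon_rss_kb"]),
            })
    return kills


# The three syslog lines from the bug description, used as sample input.
SAMPLE = [
    "Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: "
    "oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,"
    "mems_allowed=0,global_oom,task_memcg=/system.slice/mysql.service,"
    "task=mysqld,pid=67959,uid=116",
    "Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: "
    "Out of memory: Killed process 67959 (mysqld) total-vm:5127600kB, "
    "anon-rss:756064kB, file-rss:0kB, shmem-rss:0kB, UID:116 "
    "pgtables:2388kB oom_score_adj:0",
    "Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: "
    "oom_reaper: reaped process 67959 (mysqld), now anon-rss:0kB, "
    "file-rss:0kB, shmem-rss:0kB",
]

if __name__ == "__main__":
    for kill in find_oom_kills(SAMPLE):
        print(kill)
```

In practice the same function could be fed the lines of a job's controller/logs/syslog.txt to confirm whether mysqld was the process the kernel killed.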