Reviewed:  https://review.opendev.org/c/openstack/nova/+/874664
Committed: https://opendev.org/openstack/nova/commit/84d1f25446731e4e51beb83a017cdf7bfda8c5d5
Submitter: "Zuul (22348)"
Branch:    master

commit 84d1f25446731e4e51beb83a017cdf7bfda8c5d5
Author: Dan Smith <dansm...@redhat.com>
Date:   Tue Feb 21 08:43:13 2023 -0800

    Use mysql memory reduction flags for ceph job
    
    This makes the ceph-multistore job use the MYSQL_REDUCE_MEMORY
    flag in devstack to try to address the frequent OOMs we see in that
    job.
    
    Change-Id: Ibc203bd10dcb530027c2c9f58eb840ccc088280d
    Closes-Bug: #1961068


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1961068

Title:
  nova-ceph-multistore job fails with mysqld got oom-killed

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Searching through the jobs showed that the nova-ceph-multistore job
  fails from time to time with a DB crash caused by an out-of-memory
  error.

  In the tempest errors the following message can be seen:

  tempest.lib.exceptions.ServerFault: Got server fault
  Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
  <class 'oslo_db.exception.DBConnectionError'>

  In the mysqld error logs (controller/logs/mysql/error_log.txt) the
  crash recovery is visible:

  2022-02-15T19:26:40.245179Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
  2022-02-15T19:26:40.268204Z 0 [System] [MY-010232] [Server] XA crash recovery finished.

  and around that time in syslog (controller/logs/syslog.txt) the Out of
  Memory logs can be seen:

  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mysql.service,task=mysqld,pid=67959,uid=116
  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: Out of memory: Killed process 67959 (mysqld) total-vm:5127600kB, anon-rss:756064kB, file-rss:0kB, shmem-rss:0kB, UID:116 pgtables:2388kB oom_score_adj:0
  Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom_reaper: reaped process 67959 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
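
  To check other job logs for the same symptom, the "Out of memory:
  Killed process" message shown above can be picked out of syslog.txt
  with a short script. This is only a minimal sketch, not part of the
  fix: the names OOM_KILL_RE and find_oom_kills are hypothetical, and
  it assumes each kernel message occupies a single line in the log.

```python
import re

# Matches the kernel's "Out of memory: Killed process ..." message and
# captures the pid, the process name, and the anon-rss value in kB.
OOM_KILL_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def find_oom_kills(lines):
    """Yield (process name, pid, anon-rss in kB) for each OOM kill line."""
    for line in lines:
        match = OOM_KILL_RE.search(line)
        if match:
            yield (
                match.group("name"),
                int(match.group("pid")),
                int(match.group("anon_rss_kb")),
            )


# Sample input: the kernel message quoted in this bug report.
sample = [
    "Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: Out of memory: "
    "Killed process 67959 (mysqld) total-vm:5127600kB, anon-rss:756064kB, "
    "file-rss:0kB, shmem-rss:0kB, UID:116 pgtables:2388kB oom_score_adj:0",
]
kills = list(find_oom_kills(sample))
```

  Filtering the result for the name "mysqld" separates this bug from
  OOM kills of other services on the same node.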

  
  The error only occurs in the nova-ceph-multistore job (see recent
  occurrences via logsearch:
  https://paste.opendev.org/show/bQNKfoaMafUyNFCyQ0kN/ ). It mostly
  happens on the current master branch (yoga), but an example was found
  on wallaby as well:
  https://zuul.opendev.org/t/openstack/build/d8a6a9c1496346dda6986db00c06a616

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1961068/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
