On Wed, Feb 25, 2015 at 6:11 AM, Jeremy Stanley <[email protected]> wrote:
> On 2015-02-25 01:02:07 +0530 (+0530), Bharat Kumar wrote:
> [...]
> > After running 971 test cases VM inaccessible for 569 ticks
> [...]
>
> Glad you're able to reproduce it. For the record that is running
> their 8GB performance flavor with a CentOS 7 PVHVM base image. The
> steps to recreate are http://paste.openstack.org/show/181303/ as
> discussed in IRC (for the sake of others following along).

So we had 2 runs in total in the rax provider VM; the results are below:

Run 1) It failed and re-created the OOM. The setup had glusterfs as the storage backend for Cinder.

[deepakcs@deepakcs r6-jeremy-rax-vm]$ grep oom-killer run1-w-gluster/logs/syslog.txt
Feb 24 18:41:08 devstack-centos7-rax-dfw-979654.slave.openstack.org kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0

Run 2) We *removed the glusterfs backend*, so Cinder was configured with the default storage backend, i.e. LVM. *We re-created the OOM here too.*

So that proves glusterfs doesn't cause it, as it happens without glusterfs as well.

The VM (104.239.136.99) is now in such bad shape that existing ssh sessions have not responded for a very long time, though ping still works. So we need someone to help reboot/restart the VM so that we can collect the logs for the record. We couldn't find anyone during the APAC TZ to get it rebooted.

We managed to get the grep below to work after a long time from another terminal, to prove that the OOM did happen for run 2:

bash-4.2$ sudo cat /var/log/messages | grep oom-killer
Feb 25 08:53:16 devstack-centos7-rax-dfw-979654 kernel: ntpd invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Feb 25 09:03:35 devstack-centos7-rax-dfw-979654 kernel: beam.smp invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Feb 25 09:57:28 devstack-centos7-rax-dfw-979654 kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Feb 25 10:40:38 devstack-centos7-rax-dfw-979654 kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0

> I've held a similar worker in HPCloud (15.126.235.20) which is a 30GB
> flavor artificially limited to 8GB through a kernel boot parameter.
> Hopefully following the same steps there will help either confirm
> the issue isn't specific to running in one particular service
> provider, or will yield some useful difference which could help
> highlight the cause.

We ran 2 runs in total in the hpcloud provider VM (and this time it was set up correctly with 8G RAM, as evident from /proc/meminfo as well as the dstat output).

Run 1) It was successful. The setup had glusterfs as the storage backend for Cinder. Only 2 test cases failed, and those failures were expected. No OOM happened.

[deepakcs@deepakcs r7-jeremy-hpcloud-vm]$ grep oom-killer run1-w-gluster/logs/syslog.txt
[deepakcs@deepakcs r7-jeremy-hpcloud-vm]$

Run 2) Since run 1 went fine, we enabled the tempest volume backup test cases too and ran again. It was successful and no OOM happened.

[deepakcs@deepakcs r7-jeremy-hpcloud-vm]$ grep oom-killer run2-w-gluster/logs/syslog.txt
[deepakcs@deepakcs r7-jeremy-hpcloud-vm]$

So from the above we can conclude that the tests run fine on hpcloud but not on the rax provider. Since the OS (CentOS 7) inside the VM is the same across providers, this now boils down to some issue with the rax provider VM + CentOS 7 combination.
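If it would help with the next rax attempt, below is a minimal watcher sketch we could leave running alongside tempest, so we still get memory data even if ssh dies again. The output path and sampling interval are just illustrative, not what was used in the runs above:

#!/bin/bash
# mem-watch.sh - sample memory state periodically so the data survives even
# if ssh becomes unusable (output path and interval are illustrative).
OUT=/var/log/mem-watch.log
while true; do
    {
        date
        grep -E 'MemFree|MemAvailable|SwapFree' /proc/meminfo
        # top 5 resident-memory consumers at this moment
        ps -eo pid,comm,rss --sort=-rss | head -n 6
        # count of OOM kills the kernel has logged so far
        dmesg | grep -c oom-killer
        echo '----'
    } >> "$OUT"
    sleep 30
done

Started with nohup before kicking off tempest, the log can then be pulled off the node after a reboot.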
Another data point I could gather is: the only other CentOS 7 job we have is check-tempest-dsvm-centos7, and it does not run full tempest. Looking at the job's config, it only runs the smoke tests (also confirmed the same with Ian W), which I believe are only a subset of the tests. So that brings us to the conclusion that the cinder-glusterfs CI job (check-tempest-dsvm-full-glusterfs-centos7) is probably the first CentOS 7 based job running full tempest tests in upstream CI, and hence is the first to hit the issue, but on the rax provider only.

thanx,
deepak
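PS: as a quick sanity check of the "smoke is only a subset" point, something like the following could be run from a devstack node's tempest checkout. This is only a sketch: it assumes the testr-based listing where smoke tests carry a "smoke" attribute tag in the test id, and it may need a "testr init" first.

cd /opt/stack/tempest
# total tests the full job would select
testr list-tests | wc -l
# tests tagged as smoke, i.e. roughly what check-tempest-dsvm-centos7 runs
testr list-tests | grep -c '\[.*smoke'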
