Hey all,

We've made some progress with the gates this past week. There are still some issues, but I've also noticed a lot of real errors getting a bare "recheck" comment recently. Rechecking something that is going to fail again just slows the gate down and wastes infra quota. Can I suggest we all get back in the habit of looking at the failure and noting a reason in the recheck comment? That will also help us track which gate issues still remain to be fixed.
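For example (the wording and bug link here are just an illustration), a comment along these lines is much more useful than a bare "recheck":

    recheck
    Bay create timed out waiting on etcd, looks like
    https://bugs.launchpad.net/magnum/+bug/1541105

It takes a few seconds longer, but it makes it obvious which failures are known gate bugs and which ones are new.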
Thanks,
Corey

On Mon, Feb 8, 2016 at 12:10 PM Hongbin Lu <hongbin...@huawei.com> wrote:

> Hi Team,
>
> In order to resolve issue #3, it looks like we have to significantly
> reduce the memory consumption of the gate tests. Details can be found in
> this patch: https://review.openstack.org/#/c/276958/ . For the core team, a
> fast review and approval of that patch would be greatly appreciated, since
> it is hard to work with a gate that takes several hours to complete. Thanks.
>
> Best regards,
> Hongbin
>
> From: Corey O'Brien [mailto:coreypobr...@gmail.com]
> Sent: February-05-16 12:04 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [Magnum] gate issues
>
> So as we're all aware, the gate is a mess right now. I wanted to sum up
> some of the issues so we can figure out solutions.
>
> 1. The functional-api job sometimes fails because bays time out building
> after 1 hour. The logs look something like this:
>
>   magnum.tests.functional.api.v1.test_bay.BayTest.test_create_list_and_delete_bays [3733.626171s] ... FAILED
>
> I can reproduce this hang on my devstack with etcdctl 2.0.10, as described
> in this bug (https://bugs.launchpad.net/magnum/+bug/1541105), but
> apparently either my fix of moving to 2.2.5
> (https://review.openstack.org/#/c/275994/) is incomplete or there is
> another intermittent problem, because it happened again even with that fix:
> http://logs.openstack.org/94/275994/1/check/gate-functional-dsvm-magnum-api/32aacb1/console.html
>
> 2. The k8s job has some sort of intermittent hang as well that causes a
> symptom similar to the swarm one:
> https://bugs.launchpad.net/magnum/+bug/1541964
>
> 3. When the functional-api job runs, it frequently destroys the VM, causing
> the jenkins slave agent to die. Example:
> http://logs.openstack.org/03/275003/6/check/gate-functional-dsvm-magnum-api/a9a0eb9/console.html
> When this happens, zuul re-queues a new build from the start on a new VM.
> This can happen many times in a row before the job completes.
> I chatted with openstack-infra about this, and after taking a look at one
> of the VMs, it looks like memory over-consumption leading to thrashing was
> a possible culprit. The sshd daemon was also dead, and the console showed
> things like "INFO: task kswapd0:77 blocked for more than 120 seconds". A
> cursory glance and following some of the jobs seems to indicate that this
> doesn't happen on the RAX VMs, which have swap devices, unlike the OVH VMs.
>
> 4. In general, even when things work, the gate is really slow. The
> sequential master-then-node build process, combined with underpowered
> VMs, makes bay builds take 25-30 minutes when they do succeed. Since we're
> already close to tipping over a VM, we run the functional tests with
> concurrency=1, so 2 bay builds consume almost the entire allotted devstack
> testing time (generally about 75 minutes of actual test time available, it seems).
>
> Corey
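One more thought on issue #3: since the failures seem to track the provider whose VMs have no swap, a stopgap (on top of Hongbin's memory-reduction patch) could be to set up a swap file early in the job. The sketch below is only to illustrate the idea and is an assumption on my part, not something our jobs do today; it would need root in a pre-test hook, and the /swapfile path and 2G size are arbitrary placeholders.

    import subprocess


    def ensure_swapfile(path="/swapfile", size="2G"):
        """Create and enable a swap file if the host has none (needs root)."""
        # /proc/swaps only has its header line when no swap is enabled.
        with open("/proc/swaps") as swaps:
            if len(swaps.readlines()) > 1:
                return  # swap already configured (e.g. the RAX nodes)

        # Allocate the backing file, lock down permissions, format it, enable it.
        subprocess.check_call(["fallocate", "-l", size, path])
        subprocess.check_call(["chmod", "600", path])
        subprocess.check_call(["mkswap", path])
        subprocess.check_call(["swapon", path])


    if __name__ == "__main__":
        ensure_swapfile()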