I've found out that several jobs are exhibiting failures like bug 1254890 [1] and bug 1253896 [2] because Open vSwitch seems to be crashing the kernel. The kernel trace usually reports either neutron-ns-metadata-proxy or dnsmasq as the offending process, but [3] seems to point clearly to ovs-vsctl. 254 events observed over the previous 6 days show a similar trace in the logs [4]. So while this alone won't explain all the failures observed, it is potentially one of the prominent root causes.
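For what it's worth, the 254-event count comes from matching the BUG signature in [4] against collected syslogs; a similar check can be sketched locally with grep. The sample file below is illustrative only, not a real job log — in practice you would point grep at a syslog.txt downloaded from a failed job:

```shell
# Illustrative only: fabricate a tiny sample log so the command is runnable;
# real triage would use syslog.txt from the job artifacts.
printf '%s\n' \
  'Jan 20 12:00:00 host kernel: kernel BUG at /build/buildd/linux-3.2.0/fs/buffer.c:2917' \
  'Jan 20 12:00:00 host kernel: Process ovs-vsctl (pid: 1234)' \
  > sample-syslog.txt

# Count lines matching the BUG signature reported in [4]
grep -c 'kernel BUG at /build/buildd/linux-3.2.0/fs/buffer\.c:2917' sample-syslog.txt
```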
From the logs I have few hints about the kernel in use. It seems there has been no update in the past 7 days, but I can't be sure. Open vSwitch builds are updated periodically. The last build I found that did not trigger failures was the one generated on 2014/01/16 at 01:58:18. Unfortunately, version-wise I always see only 1.4.0, with no build number. I don't know whether this will require getting in touch with Ubuntu, or if we can just prep a different image with an OVS build known to work without problems.

Salvatore

[1] https://bugs.launchpad.net/neutron/+bug/1254890
[2] https://bugs.launchpad.net/neutron/+bug/1253896
[3] http://paste.openstack.org/show/61869/
[4] "kernel BUG at /build/buildd/linux-3.2.0/fs/buffer.c:2917" and filename:syslog.txt

On 24 January 2014 21:13, Clay Gerrard <clay.gerr...@gmail.com> wrote:
> OH yeah that's much better. I had found those eventually but had to dig
> through all that other stuff :'(
>
> Moving forward I think we can keep an eye on that page, open bugs for
> those tests causing issues, and dig in.
>
> Thanks again!
>
> -Clay
>
> On Fri, Jan 24, 2014 at 11:37 AM, Sean Dague <s...@dague.net> wrote:
>
>> On 01/24/2014 02:02 PM, Peter Portante wrote:
>> > Hi Sean,
>> >
>> > In the last 7 days I see only 6 python27 based test
>> > failures:
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNzogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk2Mjk0MDR9
>> >
>> > And 4 python26 based test
>> > failures:
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRVJST1I6ICAgcHkyNjogY29tbWFuZHMgZmFpbGVkXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzOTA1ODk1MzAzNTd9
>> >
>> > Maybe the query you posted captures failures where the job did not even
>> > run?
>> >
>> > And only 15 hits (well, 18, but three are within the same job, and some
>> > of the tests are run twice, so it is a combined 10
>> > hits):
>> http://logstash.openstack.org/#eyJzZWFyY2giOiJwcm9qZWN0Olwib3BlbnN0YWNrL3N3aWZ0XCIgQU5EIGJ1aWxkX3F1ZXVlOmdhdGUgQU5EIGJ1aWxkX25hbWU6Z2F0ZS1zd2lmdC1weXRob24qIEFORCBtZXNzYWdlOlwiRkFJTDpcIiBhbmQgbWVzc2FnZTpcInRlc3RcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM5MDU4OTg1NTAzMX0=
>> >
>> > Thanks,
>>
>> So it is true that the Interrupted exceptions (raised when a job is
>> killed because of a reset) are sometimes being turned into Fail events
>> by the system, which is one of the reasons the graphite data for
>> failures is incorrect; if you use just the graphite sourcing for
>> fails, your numbers will be overly pessimistic.
>>
>> The following are probably better lists:
>>
>> http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python26
>> (7 uncategorized fails)
>>
>> http://status.openstack.org/elastic-recheck/data/uncategorized.html#gate-swift-python27
>> (5 uncategorized fails)
>>
>> -Sean
>>
>> --
>> Sean Dague
>> Samsung Research America
>> s...@dague.net / sean.da...@samsung.com
>> http://dague.net
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev