Re: [openstack-dev] [ironic] [third-party-ci] pkvmci ironic job breakage details
Dell - Internal Use - Confidential

Hi Mike,

I do see this problem this week. See logs:
https://stash.opencrowbar.org/logs/52/456952/2/check/dell-hw-tempest-dsvm-ironic-pxe_ipmitool/315bd85/

We were running 5 builds on a one-node devstack-cloud, then raised it to 6 and started seeing this problem. The server must be running out of resources for the VMs.

Regards,
Rajini

-----Original Message-----
From: Michael Turek [mailto:mjtu...@linux.vnet.ibm.com]
Sent: Friday, April 14, 2017 10:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [ironic] [third-party-ci] pkvmci ironic job breakage details

Hey ironic-ers,

So our third party CI job for ironic has been, and remains, broken. I was able to do some investigation today and here's a summary of what we're seeing. I'm hoping someone might know the root of the problem.

For reference, please see this paste and the logs of the job that I was working in:
http://paste.openstack.org/show/606564/
https://dal05.objectstorage.softlayer.net/v1/AUTH_3d8e6ecb-f597-448c-8ec2-164e9f710dd6/pkvmci/ironic/25/454625/10/check-ironic/tempest-dsvm-ironic-agent_ipmitool/0520958/

I've redacted the credentials in the ironic node-show for obvious reasons, but rest assured they are properly set. These commands are run while '/opt/stack/new/ironic/devstack/lib/ironic:wait_for_nova_resources' is looping.

Basically, the ironic hypervisor for the node doesn't appear. Also, none of the node's properties make it to the hypervisor stats.

Another oddity is the 'count' value from 'openstack hypervisor stats show': though no hypervisors appear, the count is still 1. Since the run was broken, I decided to delete node-0 (about 3-5 minutes before the run failed) to see whether the count would update. It did.

Does anyone have any clue what might be happening here? Any advice would be appreciated!
Thanks,
mjturek

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
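For anyone who wants to reproduce the check outside devstack, the polling that happens while wait_for_nova_resources loops can be approximated as below. This is only a minimal sketch, not the devstack implementation: it assumes that `openstack hypervisor stats show -f json` returns a JSON object with `count`, `vcpus`, and `memory_mb` keys, and the function names, timeout, and interval are hypothetical.

```python
import json
import subprocess
import time


def fetch_hypervisor_stats():
    """Return hypervisor stats as a dict (assumes a configured openstack CLI)."""
    out = subprocess.check_output(
        ["openstack", "hypervisor", "stats", "show", "-f", "json"]
    )
    return json.loads(out)


def resources_ready(stats, expected_count=1):
    """True only when the hypervisor count AND its resources are populated."""
    return (
        int(stats.get("count", 0)) >= expected_count
        and int(stats.get("vcpus", 0)) > 0
        and int(stats.get("memory_mb", 0)) > 0
    )


def wait_for_resources(fetch=fetch_hypervisor_stats, expected_count=1,
                       timeout=120, interval=5, sleep=time.sleep):
    """Poll until resources appear; return (last stats seen, success flag)."""
    deadline = time.monotonic() + timeout
    while True:
        stats = fetch()
        if resources_ready(stats, expected_count):
            return stats, True
        if time.monotonic() >= deadline:
            return stats, False
        sleep(interval)
```

With the symptom described above (count of 1 but no node properties in the stats), `resources_ready` would keep returning False, and the last stats returned on timeout would show whether the count or the resources were the missing piece.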
Re: [openstack-dev] [ironic] 3rdparty CI status and how we can help to make it green
Thanks for starting the discussion. Will attend.

-----Original Message-----
From: Dmitry Tantsur [mailto:dtant...@redhat.com]
Sent: Thursday, April 13, 2017 10:33 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [ironic] 3rdparty CI status and how we can help to make it green

Hi all, especially maintainers of 3rdparty CI for Ironic :)

I've been watching our 3rdparty CI results recently. While things have improved compared to e.g. a month ago, most jobs still finish with failures. I've written a simple script [1] to fetch CI run information from my local Gertty database; the results [2] show that some jobs still fail surprisingly often (> 50% of cases):

- job: tempest-dsvm-ironic-agent-irmc
  rate: 0.9857142857142858
- job: tempest-dsvm-ironic-iscsi-irmc
  rate: 0.9771428571428571
- job: dell-hw-tempest-dsvm-ironic-pxe_drac
  rate: 0.9682539682539683
- job: gate-tempest-ironic-ilo-driver-iscsi_ilo
  rate: 0.9582463465553236
- job: dell-hw-tempest-dsvm-ironic-pxe_ipmitool
  rate: 0.9111
- job: tempest-dsvm-ironic-pxe-irmc
  rate: 0.8171428571428572
- job: gate-tempest-ironic-ilo-driver-pxe_ilo
  rate: 0.791231732776618

I would like to start a discussion on how we (as a team) can help the people maintaining the CI keep the failure rate closer to that of our virtual CI (< 30% of cases, judging by [2]). I'm thinking of the following potential problems:

1. Our devstack plugin changes too often. I've heard this complaint at least once. Should we maybe freeze our devstack at some point to allow the vendor folks to catch up? Then we should start looking at the CI results more carefully when modifying it.

2. Our devstack plugin is inconvenient for hardware, and requires hacks. This is something Miles (?) told me when trying to set up an environment for his hardware lab. If so, can we get a list of pain points, preferably in the form of reported bugs? Myself and hopefully other folks can certainly dedicate some time to make your life easier.

3. The number of jobs to run is too high. I've noticed that 3rdparty CI runs even on patches that clearly don't require it, e.g. docs-only changes. I suggest that the maintainers adopt some exclude rules similar to [3]. Also, most of the vendors run 3-4 jobs for different flavors of their drivers (and that is going to increase with the driver composition work). I wonder if we should recommend switching from the ironic baremetal_basic_ops test to what we call "standalone" tests [4]. This would allow a single job to test several drivers/combinations of interfaces within the same time frame.

Finally, I've proposed this topic for the virtual meetup [5] planned for the end of April. Please feel free to stop by and let us know how we can help.

Thanks,
Dmitry.

P.S. I've seen expired or self-signed HTTPS certificates on the log sites of some 3rdparty CI. Please try to fix such issues as soon as possible to allow the community to understand failures.

[1] https://github.com/dtantsur/ci-report
[2] http://paste.openstack.org/show/606467/
[3] https://github.com/openstack-infra/project-config/blob/master/zuul/layout.yaml#L1375-L1385
[4] https://github.com/openstack/ironic/blob/master/ironic_tempest_plugin/tests/scenario/ironic_standalone/test_basic_ops.py
[5] https://etherpad.openstack.org/p/ironic-virtual-meetup
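For CI maintainers who want to track the same numbers locally, a per-job failure rate like the ones quoted above can be computed roughly as follows. This is a hedged sketch, not the ci-report script [1]: the input shape (an iterable of `(job_name, succeeded)` pairs) and both function names are assumptions made for illustration.

```python
from collections import defaultdict


def failure_rates(runs):
    """Map job name -> fraction of failed runs.

    `runs` is an iterable of (job_name, succeeded) pairs; this input shape is
    an assumption, not the format used by the ci-report script.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job, succeeded in runs:
        totals[job] += 1
        if not succeeded:
            failures[job] += 1
    return {job: failures[job] / totals[job] for job in totals}


def jobs_over_threshold(rates, threshold=0.5):
    """Return (job, rate) pairs above the threshold, worst first."""
    flagged = [(job, rate) for job, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Calling `jobs_over_threshold(failure_rates(runs))` with the default threshold of 0.5 would list the jobs failing in more than 50% of cases; passing 0.3 would correspond to the virtual CI comparison mentioned above.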
[openstack-dev] [Tripleo]FFE to update Eqlx and DellSc cinder templates
I would like to request a feature freeze exception to update the Dell EqualLogic and Dell Storage Center cinder backend templates to use composable roles and services in TripleO. This work is done and has been pending merge for the past few weeks. Without these changes we won't be able to do upgrades.

Dell Eqlx: https://review.openstack.org/#/c/422238/
Dell Storage Center: https://review.openstack.org/#/c/425866/

Thank you,
Rajini
[openstack-dev] [Tripleo] FFE to add ScaleIO to Triple-o
I would like to request a feature freeze exception to add ScaleIO cinder backend support to TripleO. This work is done and has been pending review for the past three weeks. The puppet-cinder work is already merged.

Pending review: https://review.openstack.org/#/c/422238/

Thanks,
Rajini
Re: [openstack-dev] [ironic] weekly subteam report
There is a related patch to the neutron subnet pool fix. It allows you to use FIXED_RANGE without using a subnet pool:
https://review.openstack.org/#/c/378063/

-----Original Message-----
From: Loo, Ruby [mailto:ruby@intel.com]
Sent: Monday, October 17, 2016 1:36 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [ironic] weekly subteam report

Hi,

Here is this week's subteam report for Ironic. As usual, this is pulled directly from the Ironic whiteboard[0] and formatted.

Bugs (dtantsur)
===
- Stats (diff between 10 Oct 2016 and 17 Oct 2016)
- Ironic: 224 (+8) bugs + 209 wishlist items (-3). 23 new (+8), 171 in progress (-2), 1 critical, 30 high and 20 incomplete (+1)
- Inspector: 10 bugs (-1) + 19 wishlist items. 1 new, 9 in progress (-1), 0 critical, 1 high and 1 incomplete
- Nova bugs with Ironic tag: 11 (+1). 1 new (+1), 0 critical, 1 high
- the critical bug is the neutron pool issue (has workaround in place right now):
    - https://bugs.launchpad.net/ironic-python-agent/+bug/1629133
- seems like they've merged a fix, we should try removing the workaround:
    - https://review.openstack.org/#/c/381965/

Gate improvements (jlvillal, lucasagomes, dtantsur)
===
* trello: https://trello.com/c/HWVHhxOj/1-multi-tenant-networking-network-isolation
- ironic-lib was successfully moved to xenial, and finally has some good coverage
- I've proposed a plan on our CI jobs refactoring: http://lists.openstack.org/pipermail/openstack-dev/2016-October/105558.html

Generic boot-from-volume (TheJulia, dtantsur, lucasagomes)
==
* trello: https://trello.com/c/UttNjDB7/13-generic-boot-from-volume
- Specification in need of reviews: https://review.openstack.org/#/c/294995/

Driver composition (dtantsur)
=
* trello: https://trello.com/c/fTya14y6/14-driver-composition
- dtantsur has reviewed the spec update. lgtm, but needs clarification a bit.
Notifications (mariojv)
===
* trello: https://trello.com/c/MD8HNcwJ/17-notifications
- Power state change notifications currently merging: https://review.openstack.org/#/c/321865/
- Spec for CRUD notifications, provision state change notifications, and maintenance notifications landed: https://review.openstack.org/#/c/347242/
- Code still needing review
    - https://review.openstack.org/#/c/348437/
    - https://review.openstack.org/#/c/356541/ (currently workflow -1)

Serial console (yossy, hshiina, yuikotakadamori)
* trello: https://trello.com/c/nm3I8djr/20-serial-console
- nova patch merged last Friday: https://review.openstack.org/#/c/328157/
- DONE \o/

Enhanced root device hints (lucasagomes)
* trello: https://trello.com/c/f9DTEvDB/21-enhanced-root-device-hints
- The code is merged in all involved projects: Ironic, IPA and ironic-lib
- Missing documentation: https://review.openstack.org/#/c/386714/

Inspector (dtansur)
===
* trello: https://trello.com/c/PwH1pexJ/23-rescue-mode
- no updates, hacking on :) ++ lol

Bifrost (TheJulia)
==
- Keystone support and support for enabling inspector to enroll discovered hardware currently in review.

Until the week after the summit,
--ruby

[0] https://etherpad.openstack.org/p/IronicWhiteBoard