The state of the ironic universe
================================

This month we're trying a new format to keep those interested updated
on what is going on in ironic. The intent is for our weekly updates to
now take the form of a monthly newsletter covering highlights from the
ironic community. If you have something to add in the future, please
feel free to reach out and we can add it to the next edition.
News!
=====

- The long-deprecated 'classic' ('pxe_*', 'agent_*') drivers are being
  removed. They will not be present in the next major version of
  ironic.
- Ironic now has support to return nodes from maintenance state when
  BMC connectivity is restored after an outage.
- The BIOS configuration caching and setting assertion interface has
  merged, and vendors are working on their implementations.

From OpenInfra Days China!
--------------------------

* Users in China are interested in ironic!
* Deployments range from the small hundreds to thousands of nodes, and
  from basic OS installation to supercomputing use cases!
* The larger deployments are encountering some of the scale issues
  larger operators have experienced in the past.
* The language barrier is making it difficult to grasp the finer
  details of deployment error reporting/troubleshooting and high
  availability mechanics.
* Some operators are interested in the ability to "clone" or "backup"
  an ironic node's contents in order to redeploy elsewhere and/or
  restore the machine state.
* Many operators wishing to contribute felt that they were unable to
  because "we are not [a] big name", and that they would be unable to
  gain traction or build consensus without already being a major
  contributor. In these discussions, I stressed that we all have
  similar, if not the same, problems that we are trying to solve.
  Julia wrote a recent SuperUser post about this.[1]

From the OpenStack Summit
-------------------------

Operator interests vary, but there are some common problems that
operators have or are interested in solving.

Attestation/Security Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some operators and deployers seek to strengthen their security posture
through the use of TPMs, along with registration and attestation of
status with attestation servers. In a sense, think of it as profile
enforcement for bare metal. An initial specification [2] has been
posted to try and figure out some of the details and potential
integration points.
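The maintenance-recovery item above boils down to a periodic check
that clears maintenance only when ironic itself set it for a power
fault and the BMC answers again. A minimal sketch of that decision
logic (illustrative names and a stubbed reachability check, not
ironic's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    maintenance: bool
    fault: Optional[str]  # why the node entered maintenance, if ironic set it

def bmc_reachable(node: Node) -> bool:
    # Placeholder: a real check would attempt an IPMI/Redfish call
    # against the node's BMC address.
    return True

def should_clear_maintenance(node: Node) -> bool:
    """Clear maintenance only for ironic-recorded power faults, never
    for maintenance an operator set by hand."""
    return (node.maintenance
            and node.fault == 'power failure'
            and bmc_reachable(node))
```

The key design point is the fault marker: without it, a recovery loop
could silently undo maintenance an operator set deliberately.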
Firmware Management (Version Discovery/Assertion)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Operators are seeking more capabilities to discover the current
firmware version on a particular bare metal node and then possibly
take corrective action through ironic to update the firmware. The
developer community does not presently have a plan to tackle this
challenge; however, doing so would move us closer to being a sort of
attestation service.

RAID prior to deploy and Software RAID
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One of the frequent asks is support for asserting RAID configuration
prior to deployment. Naturally this is somewhat problematic, as this
task CAN greatly extend the deployment time. Presently, deployment
steps[6] are anticipated to enable these sorts of workflows.
Additionally, the ask for software RAID support seems to be ramping
up. This is not a simple task for us to achieve, but conceivably it
might take the same shape hardware RAID presently does, just with
appropriate interface mechanisms during the deployment process. There
are several conundrums, and the community needs to better understand
the desired cases before development of a solution can take place.

Serial Console Logging
~~~~~~~~~~~~~~~~~~~~~~

Numerous operators expressed interest in having console logging
support. This seems to have last been worked on last year[3] and
likely needs a contributor to pick it back up and champion it forward.

Hardware CMDB/Asset Discovery/Recording and Tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While not directly a problem of deploying bare metal, information
about hardware is needed to tie in tasks such as repairs, warranty,
and tax accounting. Often these problems become "solved" by disparate
processes tracking information in several different places. There is a
growing ask for something in the community to aid in this effort.
Jokingly, we've already kind of come up with a name, but the current
main ironic developer community doesn't have time to take on this
challenge. The most viable path forward for interested operators is
likely to detail the requirements and begin working together to
implement something that integrates with ironic.

Rack Awareness/Conductor Locality
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ironic is working on conductor locality, in terms of pinning specific
bare metal nodes to specific ironic conductors. We hope that this
functionality will be available in the final Rocky release.[4]

Burn-in Tests
~~~~~~~~~~~~~

Operators expressed interest in the capability to use ironic as part
of burn-in processes for hardware being added to a deployment. The
developer community discussed implementing such tooling at the Rocky
PTG, and those discussions centered around this being a clean step
that performs operator-defined actions on the ramdisk. The missing
piece of the puzzle would be creating a "meta" step, and then
executing additional steps. We mainly need to understand what would be
good steps to implement in terms of actual actions to take for
burning in a node.

Issues reported at the Summit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

L3 Networking Documentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Operators expressed a need for improved documentation of
L3/multi-tenant networking integration. This is something the active
developer community is attempting to improve as time permits.

Multi-tenant networking + boot from volume without HBAs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An increasing desire seems to exist to operate boot from volume with
multi-tenant networking. Without direct storage attachment to that
network, the routers need to take on the networking load for the IO
operations. This is something we never anticipated during development
of the feature, and the community needs more information to better
understand the operational scenario.
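To make the burn-in idea discussed above concrete: clean steps can be
supplied by ironic-python-agent hardware managers, which advertise
steps as dicts from get_clean_steps(). A hypothetical burn-in step
might pair a stress routine with such a descriptor; the step name and
the stress routine here are purely illustrative, not an agreed design:

```python
import hashlib
import os
import time

def burn_in_cpu(seconds=5):
    """Busy-loop hashing random data to exercise CPU and memory.

    A real burn-in step would run much longer and check hardware error
    counters afterwards; this is only a sketch.
    """
    deadline = time.monotonic() + seconds
    digest = hashlib.sha256()
    while time.monotonic() < deadline:
        digest.update(os.urandom(4096))
    return digest.hexdigest()

# A clean-step descriptor in the shape hardware managers return from
# get_clean_steps(). Priority 0 means the step only runs when an
# operator explicitly requests it in manual cleaning.
BURN_IN_STEP = {
    'step': 'burn_in_cpu',
    'priority': 0,
    'interface': 'deploy',
    'reboot_requested': False,
    'abortable': True,
}
```

The open question from the PTG discussion remains which actual actions
such steps should take, and how a "meta" step would chain them.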
Recent New Specifications
=========================

* L3 based ironic deployments[7] - This work aims to allow operators
  to deploy utilizing virtual media in remote data centers where no
  DHCP is present.
* Boot from Ramdisk[8] - This is an often requested feature from the
  scientific computing community, and may allow us to better support
  other types of ramdisk-based booting, such as root on NFS and root
  on RBD.
* Security Interface[9] - There is a growing desire for support for
  integration into security frameworks, ultimately to enable better
  use of TPMs and/or tighter operator-specific workflow integrations.
  This would benefit from operator feedback.
* Synchronize events with neutron[10] - This describes the
  introduction of processes to enable ironic to better synchronize its
  actions with neutron.
* Direct Deploy with local HTTP server[11] - This is a feature that
  would allow operators to utilize the "direct" deployment interface
  with a local HTTP server, instead of glance being backed by swift
  and using swift tempurls.

Recently merged specifications
==============================

* VNC Graphical Console [5]
* Conductor/Node locality [4]

Things that might make good Summit or conference talks
======================================================

* Talks about experiences scaling ironic or running ironic at scale.
* Experiences customizing drivers or hardware types.
* New use cases!
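As background for the software RAID discussion earlier: ironic's
existing hardware RAID interface already accepts a declarative
target_raid_config describing the desired logical disks. One could
imagine a software RAID implementation accepting the same shape,
though that is an assumption rather than a settled design:

```python
# The logical_disks format ironic's existing hardware RAID interface
# accepts as a node's target_raid_config. Whether software RAID would
# reuse this shape is an open question.
target_raid_config = {
    "logical_disks": [
        # Root volume: mirrored (RAID 1) for resilience.
        {"size_gb": 100, "raid_level": "1", "is_root_volume": True},
        # Remaining capacity striped (RAID 0) for scratch space.
        {"size_gb": "MAX", "raid_level": "0"},
    ]
}
```

Reusing the format would let operators keep a single workflow for
asserting RAID state, whichever layer ends up building the arrays.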
[1]: http://superuser.openstack.org/articles/translating-context-understanding-the-global-open-source-community/
[2]: https://review.openstack.org/576718
[3]: https://review.openstack.org/#/c/453627
[4]: https://review.openstack.org/#/c/559420
[5]: https://review.openstack.org/306074
[6]: https://review.openstack.org/#/c/549493/
[7]: https://review.openstack.org/543936
[8]: https://review.openstack.org/576717
[9]: https://review.openstack.org/576718
[10]: https://review.openstack.org/343684
[11]: https://review.openstack.org/#/c/504039/