Re: [OpenStack-Infra] Selecting New Priority Effort(s)
On Tue, Apr 3, 2018, at 7:33 PM, David Moreau Simard wrote:
> It won't be very exciting but we really need to do one of the
> following two things soon:
>
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
>
> Puppet 3 has been end of life since Dec 31, 2016. [2]
>
> The longer we draw this out, the more work it'll be :(
>
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w

This is an excellent point, thank you for bringing this up.

I think I've largely decided that the modernization of our control plane deployment should be our next priority effort. During the infra meeting there didn't seem to be any disagreement on that front. Now would be a good time to raise concerns if there are more pressing items we should be addressing, but I'm personally operating under the assumption that this is it unless others speak up. David is right: this will only get worse as time goes on, and we need to address it.

The process I've proposed for actually making progress on this front is to update the existing specs for ansiblifying the control plane and performing a puppet upgrade, and Monty has volunteered to write a new spec to cover how we might containerize the control plane. Paul has volunteered to update the ansible spec and we need a volunteer to update the puppet spec (and if we don't get a volunteer for that, maybe that itself is important information?). This way we can consider the options available to us side by side before making a major decision like this.

Hopefully we can get that done by next week and we can start to do some serious review of the options here.

Thank you,
Clark

___
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
On 4 April 2018 at 03:33, David Moreau Simard wrote:
> It won't be very exciting but we really need to do one of the
> following two things soon:
>
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
>
> Puppet 3 has been end of life since Dec 31, 2016. [2]
>
> The longer we draw this out, the more work it'll be :(
>
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
>
> David Moreau Simard
> Senior Software Engineer | OpenStack RDO
>
> dmsimard = [irc, github, twitter]

I would suggest that whether it's decided to switch to ansible for the control plane or update puppet modules, it will be well worth investing thought into performance when running across nodes that contain "different" services to perform "different" functions.

Ansible is very, very good at running the same task across multiple machines, e.g. configuring homogeneous servers. But control planes tend to have a lot of different services running on subsets of nodes, and because tasks are synchronized across the entire set, a lot of time is spent waiting for tasks to complete on some nodes while they are skipped on the rest.

When working on the precursor to https://github.com/ArdanaCLM (original was used as part of Helion OpenStack by HP(E)) we had a CI job testing the deployment of a small control plane and some services on a set of 6 VMs, and the time cost was prohibitive at 1.5hrs ~ 2.5hrs (upgrade testing CI was double these figures). A lot of the time 50% or more of the VMs were idle because tasks that involved a few nodes meant nothing else could be done on the others.

There were some thoughts around adding a strategy plugin to ansible that could do a cross between the free-run and synchronized behaviour, where you could free run to completion on nodes unless you encountered certain tasks. Other alternatives included nesting ansible runs, so that free runs are done up to a point before performing the tasks that involve cluster-style operations in synchronization, or carefully crafting the playbooks to achieve the same. We never got around to solving these, and some of the problems were caused by us adopting an approach without necessarily having a deep understanding of the tooling.

None of this is to say the same problems will exist here, but when you are managing systems/services that interact, and it's difficult to CI them in isolation at the project, potentially you'll want some way for developers and CI on changes to exercise a test env.

The cost of developing, testing, and integrating should probably be investigated in detail for both approaches. Before looking at whether it's easier to replace the puppet modules with ansible or to update to puppet 4/5, it might be worth focusing on what approaches are needed to get the best experience (stability, ease of writing/maintenance, and speed of dev-env bring-up come to mind as important).

Past experience with any config management suggests that when you start simple it's easy to incrementally improve on the existing approach, but reversing direction when you hit dead ends is almost impossible ;)

--
Darragh Bailey
"Nothing is foolproof to a sufficiently talented fool"
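To make the idle-time argument above concrete, here is a rough sketch (not taken from the Ardana tooling) that compares wall-clock time when every host waits at each task, as with Ansible's default `linear` strategy, against letting each host run to completion on its own, as with the `free` strategy. The host names and task durations are invented for illustration, and the model assumes tasks have no cross-host dependencies, which is exactly the assumption that breaks down for cluster-style operations.

```python
from typing import Dict, List

# Hypothetical per-task durations in seconds for each host.
# A value of 0 means the task is skipped on that host.
DURATIONS: Dict[str, List[float]] = {
    "db1": [60, 300, 0, 0],
    "web1": [60, 0, 120, 0],
    "mq1": [60, 0, 0, 240],
}


def lockstep_wall_time(durations: Dict[str, List[float]]) -> float:
    """Wall time when all hosts wait for the slowest host at every task."""
    per_task = zip(*durations.values())
    return sum(max(task) for task in per_task)


def free_wall_time(durations: Dict[str, List[float]]) -> float:
    """Wall time when each host runs its own tasks without waiting."""
    return max(sum(tasks) for tasks in durations.values())


def idle_fraction(durations: Dict[str, List[float]], wall_time: float) -> float:
    """Share of total host time spent waiting rather than working."""
    busy = sum(sum(tasks) for tasks in durations.values())
    capacity = wall_time * len(durations)
    return 1 - busy / capacity


if __name__ == "__main__":
    for name, fn in (("lockstep", lockstep_wall_time), ("free", free_wall_time)):
        wall = fn(DURATIONS)
        print(f"{name}: {wall:.0f}s wall, {idle_fraction(DURATIONS, wall):.0%} idle")
```

With these made-up numbers the lockstep run takes twice as long as the free run and leaves hosts idle for well over half of the total host time, which is in line with the "50% or more of VMs were idle" observation. A hybrid strategy would land somewhere between the two figures, depending on how many tasks truly need synchronization.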
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
On 04/06/2018 11:37 PM, Jens Harbott wrote:
> I didn't intend to say that this was easier. My comment was related to
> the efforts in https://review.openstack.org/558991 , which could be
> avoided if we decided to deploy askbot on Xenial with Ansible. The
> amount of work needed to perform the latter task would not change, but
> we could skip the intermediate step, assuming that we would start
> implementing 1) now instead of deciding to do it at a later stage.

I disagree with this; having found a myriad of issues, it's *still* simpler than rewriting the whole thing IMO.

It doesn't matter, ansible, puppet, chef, bash scripts -- the underlying problem is that we choose support libraries for postgres, solr, celery, askbot, logs etc etc, get it to deploy, then forget about it until the next LTS release 2 years later. Of course the whole world has moved on, but we're pinned to old versions of everything and never tested on new platforms.

What *would* have helped is an rspec test that even just simply applies the manifest on new platforms. We have great infrastructure for these tests, but most of our modules don't actually *run* anything (e.g., here's ethercalc and etherpad-lite issues too [1,2]). Tests that actually run make it so much easier to collaborate; we can all see the result of changes, link to logs, get input on what's going wrong, etc etc.

-i

[1] https://review.openstack.org/527822
[2] https://review.openstack.org/528130
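As a rough illustration of the "just apply the manifest on the new platform" idea above, a smoke test could be as small as the sketch below. It is not the existing rspec/beaker setup; it is a hypothetical Python wrapper, and the manifest and module paths are placeholders. The one thing it relies on is `puppet apply --detailed-exitcodes`, where exit codes 0 and 2 mean the run succeeded (without or with changes) and 1, 4 and 6 indicate failures.

```python
import subprocess
from typing import List

# Meanings of `puppet apply --detailed-exitcodes` exit codes.
EXIT_MEANINGS = {
    0: "succeeded, no changes",
    1: "run failed",
    2: "succeeded, changes applied",
    4: "some resources failed",
    6: "changes applied, some resources failed",
}


def apply_succeeded(exit_code: int) -> bool:
    """Only 0 and 2 count as a clean apply."""
    return exit_code in (0, 2)


def build_command(manifest: str, module_path: str) -> List[str]:
    """Command line for applying one manifest; both paths are placeholders."""
    return [
        "puppet",
        "apply",
        "--detailed-exitcodes",
        f"--modulepath={module_path}",
        manifest,
    ]


def smoke_test(manifest: str, module_path: str) -> bool:
    """Apply the manifest on the current host and report the outcome."""
    result = subprocess.run(build_command(manifest, module_path), check=False)
    meaning = EXIT_MEANINGS.get(result.returncode, "unknown exit code")
    print(f"{manifest}: exit {result.returncode} ({meaning})")
    return apply_succeeded(result.returncode)
```

Run in a throwaway VM for each target release, something like `smoke_test("manifests/site.pp", "/etc/puppet/modules")` would at least show whether the manifest still applies, and the log output gives everyone something to link to when it doesn't.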
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
2018-04-06 12:47 GMT+00:00 Colleen Murphy:
> On Thu, Apr 5, 2018, at 4:57 PM, Jeremy Stanley wrote:
>> On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
>> > 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
>> > > It won't be very exciting but we really need to do one of the
>> > > following two things soon:
>> > >
>> > > 1) Ansiblify control plane [1]
>> > > 2) Update our puppet things to puppet 4 (or 5?)
>> > >
>> > > Puppet 3 has been end of life since Dec 31, 2016. [2]
>> > >
>> > > The longer we draw this out, the more work it'll be :(
>> > >
>> > > [1]: https://review.openstack.org/#/c/469983/
>> > > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
>> >
>> > I agree and would vote for option 1), that would also seem to blend
>> > well with upgrading to Xenial. Avoid having to invest much effort in
>> > making puppet things work for Xenial, like we just discovered would be
>> > needed for askbot.
>>
>> It's not immediately clear to me how rewriting numerous Puppet
>> modules in Ansible avoids having to invest much effort... or is it
>> the case that a lot of the things we're installing now have
>> corresponding Ansible modules already? Has anyone skimmed through
>> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env
>> and figured out how many of those seem supported by the existing
>> Ansible ecosystem vs how many we'd have to create ourselves?
>> --
>> Jeremy Stanley
>
> The puppet modules are already tested with puppet-apply and beaker on Xenial.
> There should be very little if any effort to ensure they work on Xenial. It
> is a bit hard for me to imagine that a complete rewrite would be easier.

I didn't intend to say that this was easier. My comment was related to the efforts in https://review.openstack.org/558991 , which could be avoided if we decided to deploy askbot on Xenial with Ansible. The amount of work needed to perform the latter task would not change, but we could skip the intermediate step, assuming that we would start implementing 1) now instead of deciding to do it at a later stage.
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
On Thu, Apr 5, 2018, at 4:57 PM, Jeremy Stanley wrote:
> On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
> > 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> > > It won't be very exciting but we really need to do one of the
> > > following two things soon:
> > >
> > > 1) Ansiblify control plane [1]
> > > 2) Update our puppet things to puppet 4 (or 5?)
> > >
> > > Puppet 3 has been end of life since Dec 31, 2016. [2]
> > >
> > > The longer we draw this out, the more work it'll be :(
> > >
> > > [1]: https://review.openstack.org/#/c/469983/
> > > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
> >
> > I agree and would vote for option 1), that would also seem to blend
> > well with upgrading to Xenial. Avoid having to invest much effort in
> > making puppet things work for Xenial, like we just discovered would be
> > needed for askbot.
>
> It's not immediately clear to me how rewriting numerous Puppet
> modules in Ansible avoids having to invest much effort... or is it
> the case that a lot of the things we're installing now have
> corresponding Ansible modules already? Has anyone skimmed through
> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env
> and figured out how many of those seem supported by the existing
> Ansible ecosystem vs how many we'd have to create ourselves?
> --
> Jeremy Stanley

The puppet modules are already tested with puppet-apply and beaker on Xenial. There should be very little if any effort to ensure they work on Xenial. It is a bit hard for me to imagine that a complete rewrite would be easier.

Colleen
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
> 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> > It won't be very exciting but we really need to do one of the
> > following two things soon:
> >
> > 1) Ansiblify control plane [1]
> > 2) Update our puppet things to puppet 4 (or 5?)
> >
> > Puppet 3 has been end of life since Dec 31, 2016. [2]
> >
> > The longer we draw this out, the more work it'll be :(
> >
> > [1]: https://review.openstack.org/#/c/469983/
> > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
>
> I agree and would vote for option 1), that would also seem to blend
> well with upgrading to Xenial. Avoid having to invest much effort in
> making puppet things work for Xenial, like we just discovered would be
> needed for askbot.

It's not immediately clear to me how rewriting numerous Puppet modules in Ansible avoids having to invest much effort... or is it the case that a lot of the things we're installing now have corresponding Ansible modules already? Has anyone skimmed through https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env and figured out how many of those seem supported by the existing Ansible ecosystem vs how many we'd have to create ourselves?

--
Jeremy Stanley
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> It won't be very exciting but we really need to do one of the
> following two things soon:
>
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
>
> Puppet 3 has been end of life since Dec 31, 2016. [2]
>
> The longer we draw this out, the more work it'll be :(
>
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w

I agree and would vote for option 1), that would also seem to blend well with upgrading to Xenial. Avoid having to invest much effort in making puppet things work for Xenial, like we just discovered would be needed for askbot.
Re: [OpenStack-Infra] Selecting New Priority Effort(s)
It won't be very exciting but we really need to do one of the following two things soon:

1) Ansiblify control plane [1]
2) Update our puppet things to puppet 4 (or 5?)

Puppet 3 has been end of life since Dec 31, 2016. [2]

The longer we draw this out, the more work it'll be :(

[1]: https://review.openstack.org/#/c/469983/
[2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

On Tue, Apr 3, 2018 at 4:23 PM, Clark Boylan wrote:
> Hello everyone,
>
> I just approved the change to mark the Zuul v3 priority effort as completed
> in the infra-specs repo. Thank you to everyone that made that possible. With
> Zuul v3 work largely done we can now look forward to our next priority
> efforts.
>
> Currently the only task marked as a priority is the task-tracker spec which
> at this point is migrating projects into storyboard. I think we can likely
> add one or two new priority efforts to this list.
>
> After some quick initial brainstorming these were the ideas I had for getting
> onto that list (note some may require we actually write a spec):
>
> * Gerrit upgrade to 2.14/2.15
> * Control Plane operating system upgrades to Xenial
> * Bringing wiki under config management
>
> My bias here is I've personally been working to try and pay down some of this
> tech debt we've built up simply due to bit rot, but I know we have other
> specs and I'm sure we can make good arguments for why other efforts should be
> made a priority. I'd love to get feedback on what others think would make
> good priority efforts.
>
> Let's use this thread to identify candidates then whittle the list down to
> one or two to focus on for the next little while.
>
> Thank you,
> Clark
[OpenStack-Infra] Selecting New Priority Effort(s)
Hello everyone,

I just approved the change to mark the Zuul v3 priority effort as completed in the infra-specs repo. Thank you to everyone that made that possible. With Zuul v3 work largely done we can now look forward to our next priority efforts.

Currently the only task marked as a priority is the task-tracker spec, which at this point is migrating projects into storyboard. I think we can likely add one or two new priority efforts to this list.

After some quick initial brainstorming these were the ideas I had for getting onto that list (note some may require we actually write a spec):

* Gerrit upgrade to 2.14/2.15
* Control Plane operating system upgrades to Xenial
* Bringing wiki under config management

My bias here is I've personally been working to try and pay down some of this tech debt we've built up simply due to bit rot, but I know we have other specs and I'm sure we can make good arguments for why other efforts should be made a priority. I'd love to get feedback on what others think would make good priority efforts.

Let's use this thread to identify candidates then whittle the list down to one or two to focus on for the next little while.

Thank you,
Clark