Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-10 Thread Clark Boylan
On Tue, Apr 3, 2018, at 7:33 PM, David Moreau Simard wrote:
> It won't be very exciting but we really need to do one of the
> following two things soon:
> 
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
> 
> Puppet 3 has been end of life since Dec 31, 2016. [2]
> 
> The longer we draw this out, the more work it'll be :(
> 
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w

This is an excellent point, thank you for bringing this up. I think I've 
largely decided that the modernization of our control plane deployment should 
be our next priority effort. During the infra meeting there didn't seem to be 
any disagreement on that front. Now would be a good time to raise concerns if 
there are more pressing items we should be addressing, but I think I'm 
personally operating under the assumption this is it unless others speak up. 
David is right: this will only get worse as time goes on and we need to 
address it.

The process I've proposed for actually making progress on this front is to 
update the existing specs for ansiblifying the control plane and performing a 
puppet upgrade, and Monty has volunteered to write a new spec to cover how we 
might containerize the control plane. Paul has volunteered to update the 
ansible spec and we need a volunteer to update the puppet spec (and if we 
don't get a volunteer for that, maybe that itself is important information?).

This way we can consider the options available to us side by side before making 
a major decision like this. Hopefully we can get that done by next week and we 
can start to do some serious review of the options here.

Thank you,
Clark

___
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra

Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-10 Thread Darragh Bailey
On 4 April 2018 at 03:33, David Moreau Simard wrote:

> It won't be very exciting but we really need to do one of the
> following two things soon:
>
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
>
> Puppet 3 has been end of life since Dec 31, 2016. [2]
>
> The longer we draw this out, the more work it'll be :(
>
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
>
>
> David Moreau Simard
> Senior Software Engineer | OpenStack RDO
>
> dmsimard = [irc, github, twitter]
>
>

I would suggest that whether it's decided to switch to ansible for the
control plane or update puppet modules, it will be well worth investing
thought into performance when running across nodes that contain "different"
services to perform "different functions".

Ansible is very very good at running the same task across multiple
machines, e.g. configuring homogeneous servers. But control planes have a
tendency to have a lot of different services running on subsets, and this
has a consequence of resulting in lots of time spent waiting on tasks to
complete on some nodes and skip on the rest due to the synchronization of
tasks across the entire set.

When working on the precursor to https://github.com/ArdanaCLM (original was
used as part of Helion OpenStack by HP(E)) we had a CI job testing the
deployment of a small control plane and some services on a set of 6 VMs and
the time cost was prohibitive at 1.5hrs ~ 2.5hrs (upgrade testing CI was
double these figures). A lot of the time 50% or more of VMs were idle
because tasks that involved a few nodes meant nothing else could be done on
the others.
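
To make the idle-time effect concrete, here is a rough toy model, not a
measurement of any real deployment. It compares lockstep execution (every
task is a barrier for all hosts, as with Ansible's default "linear"
strategy) against independent per-host execution (as with the "free"
strategy). The host names and task durations are invented for illustration,
and the model assumes a task takes the same time on every host it targets.

```python
# Toy model: each task lists the hosts it applies to and its duration in
# seconds. Hosts and durations are made up for illustration only.
TASKS = [
    ({"db1", "db2"}, 300),
    ({"web1", "web2", "web3"}, 120),
    ({"mq1"}, 240),
    ({"db1", "db2", "web1", "web2", "web3", "mq1"}, 60),  # common task
]
HOSTS = sorted(set().union(*(targets for targets, _ in TASKS)))


def lockstep_wall_clock(tasks):
    """Every task is a barrier: all hosts wait until the task finishes."""
    return sum(duration for _, duration in tasks)


def free_wall_clock(tasks, hosts):
    """Each host runs only its own tasks back to back, with no barriers."""
    per_host = {h: 0 for h in hosts}
    for targets, duration in tasks:
        for h in targets:
            per_host[h] += duration
    return max(per_host.values())


def idle_fraction(tasks, hosts):
    """Share of total host-time spent waiting under lockstep execution."""
    total = lockstep_wall_clock(tasks) * len(hosts)
    busy = sum(len(targets) * duration for targets, duration in tasks)
    return 1 - busy / total


if __name__ == "__main__":
    print("lockstep:", lockstep_wall_clock(TASKS), "s")
    print("free:    ", free_wall_clock(TASKS, HOSTS), "s")
    print("idle fraction under lockstep: %.2f" % idle_fraction(TASKS, HOSTS))
```

With these invented numbers, lockstep takes twice as long as free running
and most host-time is spent waiting, which is the same shape of problem as
described above, though the real ratios depend entirely on the playbooks.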

There were some thoughts around adding a strategy plugin to ansible that
could do a cross between the free-run and synchronized behaviour where you
could free run to completion on nodes unless you encountered certain tasks.
Other alternatives included nesting ansible runs to have free runs done to a
point before then performing the tasks that involved cluster style
operations in synchronization or careful crafting of the playbooks to
achieve the same. Never got around to solving these, and some of the
problems were caused by us adopting an approach without necessarily having
a deep understanding of the tooling.
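
As a sketch of the hybrid idea only (this is not an existing Ansible
strategy plugin, and we never built one), the scheduling rule can be
modelled like this: hosts free-run through ordinary tasks, and all hosts
synchronize only before tasks flagged as barriers. The task list, host
names, durations and the is_barrier flag are all hypothetical.

```python
# Hypothetical task list: (target hosts, duration in seconds, is_barrier).
TASKS = [
    ({"db1", "db2"}, 300, False),
    ({"web1", "web2", "web3"}, 120, False),
    ({"mq1"}, 240, False),
    ({"db1", "db2"}, 90, True),   # cluster-style step that needs everyone ready
    ({"web1", "web2", "web3"}, 60, False),
]


def hybrid_wall_clock(tasks):
    """Free-run per host, but synchronize all hosts before each barrier task."""
    hosts = set().union(*(targets for targets, _, _ in tasks))
    clock = {h: 0 for h in hosts}
    for targets, duration, is_barrier in tasks:
        if is_barrier:
            # Everyone waits for the slowest host before the barrier task starts.
            sync_point = max(clock.values())
            clock = {h: sync_point for h in hosts}
        for h in targets:
            clock[h] += duration
    return max(clock.values())


def lockstep_wall_clock(tasks):
    """Baseline: every task acts as a barrier for every host."""
    return sum(duration for _, duration, _ in tasks)


if __name__ == "__main__":
    print("hybrid:  ", hybrid_wall_clock(TASKS), "s")
    print("lockstep:", lockstep_wall_clock(TASKS), "s")
```

In this model, hosts not involved in a barrier task carry on as soon as the
synchronization point is reached; a real implementation might need them to
wait for the barrier task to finish as well.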


None of this is to say the same problems will exist here, but when you are
managing systems/services that interact, and it's difficult to CI them in
isolation at the project level, you'll potentially want some way for
developers and CI on changes to exercise a test env.

The cost of developing/testing/integrating should probably be investigated
in detail for both approaches before you look at whether it's easy to
replace the puppet modules with ansible or update to puppet 4/5. It might be
worth focusing first on what approaches are needed to extract the best
experience (stability, ease of writing/maintenance & speed of dev-env bring
up come to mind as important)?

Past experience with any config management suggests that when you start
simple it's easy to incrementally improve on the existing approach, but
reversing direction when you hit dead ends is almost impossible ;)

-- 
Darragh Bailey
"Nothing is foolproof to a sufficiently talented fool"

Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-09 Thread Ian Wienand

On 04/06/2018 11:37 PM, Jens Harbott wrote:
> I didn't intend to say that this was easier. My comment was related
> to the efforts in https://review.openstack.org/558991 , which could
> be avoided if we decided to deploy askbot on Xenial with
> Ansible. The amount of work needed to perform the latter task would
> not change, but we could skip the intermediate step, assuming that
> we would start implementing 1) now instead of deciding to do it at a
> later stage.


I disagree with this; having found a myriad of issues it's *still*
simpler than re-writing the whole thing IMO.

It doesn't matter, ansible, puppet, chef, bash scripts -- the
underlying problem is that we choose support libraries for postgres,
solr, celery, askbot, logs etc etc, get it to deploy, then forget
about it until the next LTS release 2 years later.  Of course the
whole world has moved on, but we're pinned to old versions of
everything and never tested on new platforms.

What *would* have helped is an rspec test that even just simply applies
the manifest on new platforms.  We have great infrastructure for these
tests; but most of our modules don't actually *run* anything (e.g.,
here's ethercalc and etherpad-lite issues too [1,2]).

These make it so much easier to collaborate; we can all see the result
of changes, link to logs, get input on what's going wrong, etc etc.
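
As a minimal illustration of an "apply the manifest and see if it works"
check (written as a plain Python wrapper rather than rspec or beaker, and
only a sketch), something like the following could run on each new
platform. It relies on `puppet apply --detailed-exitcodes`, where 0 means no
changes, 2 means changes were applied, 4 and 6 mean some resources failed,
and 1 means the run itself failed. The function names and the manifest path
are hypothetical.

```python
import subprocess


def classify_exit_code(code):
    """Interpret the return code of `puppet apply --detailed-exitcodes`."""
    if code == 0:
        return "ok-no-changes"
    if code == 2:
        return "ok-changed"
    if code in (4, 6):
        return "failed"      # some resources failed to apply
    return "error"           # 1, or anything unexpected


def apply_manifest(path, runner=subprocess.run):
    """Apply one manifest and report the outcome.

    `runner` is injectable so the logic can be exercised without puppet
    being installed.
    """
    result = runner(["puppet", "apply", "--detailed-exitcodes", path])
    return classify_exit_code(result.returncode)


if __name__ == "__main__":
    # Hypothetical manifest path, for illustration only.
    print(apply_manifest("manifests/site.pp"))
```

A CI job could treat "failed" and "error" as job failures and publish the
puppet output as logs for others to look at.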

-i

[1] https://review.openstack.org/527822
[2] https://review.openstack.org/528130


Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-06 Thread Jens Harbott
2018-04-06 12:47 GMT+00:00 Colleen Murphy:
> On Thu, Apr 5, 2018, at 4:57 PM, Jeremy Stanley wrote:
>> On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
>> > 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
>> > > It won't be very exciting but we really need to do one of the
>> > > following two things soon:
>> > >
>> > > 1) Ansiblify control plane [1]
>> > > 2) Update our puppet things to puppet 4 (or 5?)
>> > >
>> > > Puppet 3 has been end of life since Dec 31, 2016. [2]
>> > >
>> > > The longer we draw this out, the more work it'll be :(
>> > >
>> > > [1]: https://review.openstack.org/#/c/469983/
>> > > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
>> >
>> > I agree and would vote for option 1), that would also seem to blend
>> > well with upgrading to Xenial. Avoid having to invest much effort in
>> > making puppet things work for Xenial, like we just discovered would be
>> > needed for askbot.
>>
>> It's not immediately clear to me how rewriting numerous Puppet
>> modules in Ansible avoids having to invest much effort... or is it
>> the case that a lot of the things we're installing now have
>> corresponding Ansible modules already? Has anyone skimmed through
>> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env
>> and figured out how many of those seem supported by the existing
>> Ansible ecosystem vs how many we'd have to create ourselves?
>> --
>> Jeremy Stanley
>
> The puppet modules are already tested with puppet-apply and beaker on Xenial. 
> There should be very little if any effort to ensure they work on Xenial. It 
> is a bit hard for me to imagine that a complete rewrite would be easier.

I didn't intend to say that this was easier. My comment was related to
the efforts in https://review.openstack.org/558991 , which could be
avoided if we decided to deploy askbot on Xenial with Ansible. The
amount of work needed to perform the latter task would not change, but
we could skip the intermediate step, assuming that we would start
implementing 1) now instead of deciding to do it at a later stage.


Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-06 Thread Colleen Murphy
On Thu, Apr 5, 2018, at 4:57 PM, Jeremy Stanley wrote:
> On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
> > 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> > > It won't be very exciting but we really need to do one of the
> > > following two things soon:
> > >
> > > 1) Ansiblify control plane [1]
> > > 2) Update our puppet things to puppet 4 (or 5?)
> > >
> > > Puppet 3 has been end of life since Dec 31, 2016. [2]
> > >
> > > The longer we draw this out, the more work it'll be :(
> > >
> > > [1]: https://review.openstack.org/#/c/469983/
> > > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
> > 
> > I agree and would vote for option 1), that would also seem to blend
> > well with upgrading to Xenial. Avoid having to invest much effort in
> > making puppet things work for Xenial, like we just discovered would be
> > needed for askbot.
> 
> It's not immediately clear to me how rewriting numerous Puppet
> modules in Ansible avoids having to invest much effort... or is it
> the case that a lot of the things we're installing now have
> corresponding Ansible modules already? Has anyone skimmed through
> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env
> and figured out how many of those seem supported by the existing
> Ansible ecosystem vs how many we'd have to create ourselves?
> -- 
> Jeremy Stanley

The puppet modules are already tested with puppet-apply and beaker on Xenial. 
There should be very little if any effort to ensure they work on Xenial. It is 
a bit hard for me to imagine that a complete rewrite would be easier.

Colleen


Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-05 Thread Jeremy Stanley
On 2018-04-05 14:35:27 + (+), Jens Harbott wrote:
> 2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> > It won't be very exciting but we really need to do one of the
> > following two things soon:
> >
> > 1) Ansiblify control plane [1]
> > 2) Update our puppet things to puppet 4 (or 5?)
> >
> > Puppet 3 has been end of life since Dec 31, 2016. [2]
> >
> > The longer we draw this out, the more work it'll be :(
> >
> > [1]: https://review.openstack.org/#/c/469983/
> > [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w
> 
> I agree and would vote for option 1), that would also seem to blend
> well with upgrading to Xenial. Avoid having to invest much effort in
> making puppet things work for Xenial, like we just discovered would be
> needed for askbot.

It's not immediately clear to me how rewriting numerous Puppet
modules in Ansible avoids having to invest much effort... or is it
the case that a lot of the things we're installing now have
corresponding Ansible modules already? Has anyone skimmed through
https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules.env
and figured out how many of those seem supported by the existing
Ansible ecosystem vs how many we'd have to create ourselves?
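
For what it's worth, the skim could be partly automated along these lines.
This is only a sketch: the sample text below is a made-up excerpt, the
assumption that each relevant line names a module as puppet-<name> or
puppetlabs-<name> is mine, and the set of "already available" Ansible roles
is a placeholder that someone would have to fill in by actually checking.

```python
import re

# Hypothetical excerpt; the real file's format and contents may differ.
SAMPLE = """
# illustrative entries only
INTEGRATION_MODULES["openstack-infra/puppet-askbot"]="origin/master"
INTEGRATION_MODULES["openstack-infra/puppet-etherpad_lite"]="origin/master"
SOURCE_MODULES["https://github.com/puppetlabs/puppetlabs-postgresql"]="x.y.z"
"""

# Assumed naming convention: puppet-<name> or puppetlabs-<name>.
PATTERN = re.compile(r"(?:puppetlabs|puppet)-([A-Za-z0-9_]+)")


def module_names(text):
    """Extract module names from non-comment lines."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = PATTERN.search(line)
        if match:
            names.append(match.group(1))
    return names


def tally(names, available):
    """Split module names into those with and without a known equivalent."""
    covered = sorted(n for n in names if n in available)
    missing = sorted(n for n in names if n not in available)
    return covered, missing


if __name__ == "__main__":
    # Placeholder set; not a statement about what actually exists.
    available_roles = {"postgresql"}
    covered, missing = tally(module_names(SAMPLE), available_roles)
    print("covered:", covered)
    print("would need writing:", missing)
```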
-- 
Jeremy Stanley



Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-05 Thread Jens Harbott
2018-04-04 2:33 GMT+00:00 David Moreau Simard:
> It won't be very exciting but we really need to do one of the
> following two things soon:
>
> 1) Ansiblify control plane [1]
> 2) Update our puppet things to puppet 4 (or 5?)
>
> Puppet 3 has been end of life since Dec 31, 2016. [2]
>
> The longer we draw this out, the more work it'll be :(
>
> [1]: https://review.openstack.org/#/c/469983/
> [2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w

I agree and would vote for option 1), that would also seem to blend
well with upgrading to Xenial. Avoid having to invest much effort in
making puppet things work for Xenial, like we just discovered would be
needed for askbot.


Re: [OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-03 Thread David Moreau Simard
It won't be very exciting but we really need to do one of the
following two things soon:

1) Ansiblify control plane [1]
2) Update our puppet things to puppet 4 (or 5?)

Puppet 3 has been end of life since Dec 31, 2016. [2]

The longer we draw this out, the more work it'll be :(

[1]: https://review.openstack.org/#/c/469983/
[2]: https://groups.google.com/forum/#!topic/puppet-users/IdutL5FTW7w


David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]


On Tue, Apr 3, 2018 at 4:23 PM, Clark Boylan wrote:
> Hello everyone,
>
> I just approved the change to mark the Zuul v3 priority effort as completed 
> in the infra-specs repo. Thank you to everyone that made that possible. With 
> Zuul v3 work largely done we can now look forward to our next priority 
> efforts.
>
> Currently the only task marked as a priority is the task-tracker spec which 
> at this point is migrating projects into storyboard. I think we can likely 
> add one or two new priority efforts to this list.
>
> After some quick initial brainstorming these were the ideas I had for getting 
> onto that list (note some may require we actually write a spec):
>
> * Gerrit upgrade to 2.14/2.15
> * Control Plane operating system upgrades to Xenial
> * Bringing wiki under config management
>
> My bias here is I've personally been working to try and pay down some of this 
> tech debt we've built up simply due to bit rot, but I know we have other 
> specs and I'm sure we can make good arguments for why other efforts should be 
> made a priority. I'd love to get feedback on what others think would make 
> good priority efforts.
>
> Let's use this thread to identify candidates then whittle the list down to 
> one or two to focus on for the next little while.
>
> Thank you,
> Clark


[OpenStack-Infra] Selecting New Priority Effort(s)

2018-04-03 Thread Clark Boylan
Hello everyone,

I just approved the change to mark the Zuul v3 priority effort as completed in 
the infra-specs repo. Thank you to everyone that made that possible. With Zuul 
v3 work largely done we can now look forward to our next priority efforts.

Currently the only task marked as a priority is the task-tracker spec which at 
this point is migrating projects into storyboard. I think we can likely add one 
or two new priority efforts to this list.

After some quick initial brainstorming these were the ideas I had for getting 
onto that list (note some may require we actually write a spec):

* Gerrit upgrade to 2.14/2.15
* Control Plane operating system upgrades to Xenial
* Bringing wiki under config management

My bias here is I've personally been working to try and pay down some of this 
tech debt we've built up simply due to bit rot, but I know we have other specs 
and I'm sure we can make good arguments for why other efforts should be made a 
priority. I'd love to get feedback on what others think would make good 
priority efforts.

Let's use this thread to identify candidates then whittle the list down to one 
or two to focus on for the next little while.

Thank you,
Clark
