Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-02 Thread Elizabeth K. Joseph
On Mon, Aug 1, 2016 at 12:12 PM, Jeremy Stanley wrote:
> I'm hesitant to rely on unstack/clean/stack working consistently
> over time, though maybe others have seen them behave more reliably
> than I think they do. I had assumed we'd replace with fresh servers
> each time and bootstrap DevStack from scratch, though perhaps that's
> overkill?

This is what I was assuming as well, since we'd need a fresh version
of DevStack itself each time so the latest translations apply cleanly.
It would be hard to track all the changes by just doing
unstack/clean/fresh DevStack clone/stack, even if it were reliable over
time (my experience has also been that it's not).

I also learned the other day that the rejoin_stack.sh script was
largely unmaintained and was removed in Mitaka, so any reboot means you
have to run unstack/clean/stack again, which is worth considering
as we discuss snapshots.

-- 
Elizabeth Krumbach Joseph || Lyz || pleia2

___
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra


Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Frank Kloeker

On 2016-07-25 19:05, Jeremy Stanley wrote:
> On 2016-07-25 11:08:35 +0530 (+0530), Vipul Nayyar wrote:
> > Honestly, I was also thinking that using containers for implementing
> > blue/green deployment would be best for implementing minimal downtime. I
> > suggest having a basic run-through of this idea with the community over
> > tomorrow's irc meeting should be a good start.
>
> Waving containers at the problem doesn't really solve the
> fundamental issue at hand (we could just as easily use DNS or an
> Apache redirect to switch between virtual machines, possibly more
> easily since we already have existing mechanisms for deploying and
> replacing virtual machines). The issue that needs addressing first,
> I think, is how to get new DevStack deployments from master branch
> tip of all projects to work consistently at each rebuild interval
> or, more likely, to design a pattern that avoids replacing a working
> deployment with a broken one along with some means to find out that
> redeployment is failing so that it can effectively be troubleshot
> post-mortem.


Hi Jeremy,

A broken DevStack installation is exactly the point. With an LXD
container you can take a snapshot, run the unstack or clean script,
fetch new code and stack again. If that fails, you can restore the
snapshot and try a new installation another day. Without snapshots, you
can start a new container with the new code and shut down the old one.
So I like the idea of haproxy in front, but I wouldn't change any DNS
entries because DNS changes take time to reach end users.
If we have enough resources, we can work with 3 VMs: two DevStack
installations with the translation checksite, and one with haproxy
hosting the public FQDN plus a kind of trigger that refreshes the
installation on one DevStack VM _if_ the other VM is up. If the other
DevStack service is down, the trigger should try an unstack/clean/stack
after one day and switch over once the service is up. This could be done
with lb-update
(https://www.haproxy.com/doc/aloha/7.0/haproxy/lbupdate.html) or the
haproxy API.

The process should include some lightweight monitoring of its status.
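
A minimal sketch of the trigger decision described above, in Python. It
reflects one reading of the description only: the names Action and
next_action and the 24-hour retry parameter are hypothetical, and nothing
here talks to haproxy or DevStack.

```python
from enum import Enum


class Action(Enum):
    REFRESH = "refresh this backend"          # unstack/clean/stack with new code
    RETRY_OTHER = "retry the other backend"   # the other backend is down
    WAIT = "wait"


def next_action(other_is_up: bool, hours_since_last_try: float,
                retry_after_hours: float = 24.0) -> Action:
    """Decide what the haproxy-side trigger should do for one backend.

    A backend is refreshed only while the other one is up to serve
    traffic; otherwise the other backend is rebuilt at most once per
    retry interval (assumed here to be one day).
    """
    if other_is_up:
        return Action.REFRESH
    if hours_since_last_try >= retry_after_hours:
        return Action.RETRY_OTHER
    return Action.WAIT
```

The value returned by next_action could also feed the status monitoring,
for example by logging every RETRY_OTHER decision.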

kind regards

Frank




Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Ricardo Carrillo Cruz
Oh, hahaha, I thought the dns.py was actually doing something.
Now that I see the script I know what you mean :-).

2016-08-01 17:10 GMT+02:00 Jeremy Stanley:

> On 2016-08-01 16:46:07 +0200 (+0200), Ricardo Carrillo Cruz wrote:
> > In my mind, I thought set_dns would be really an ansible wrapper to
> > system-config launch/dns.py script.
> [...]
>
> There's a reason why that script only tells you what commands to
> run, and doesn't run them for you. At least that way we can still
> assert that we're not writing automation to communicate with
> Rackspace's (proprietary, non-free, nonstandard, non-OpenStack) DNS
> API if a sysadmin has to manually run commands to update records
> through it. Then it's no worse on a philosophical level than using a
> Web browser to make DNS changes through their similarly proprietary
> dashboard site.
> --
> Jeremy Stanley


Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Jeremy Stanley
On 2016-08-01 16:46:07 +0200 (+0200), Ricardo Carrillo Cruz wrote:
> In my mind, I thought set_dns would be really an ansible wrapper to
> system-config launch/dns.py script.
[...]

There's a reason why that script only tells you what commands to
run, and doesn't run them for you. At least that way we can still
assert that we're not writing automation to communicate with
Rackspace's (proprietary, non-free, nonstandard, non-OpenStack) DNS
API if a sysadmin has to manually run commands to update records
through it. Then it's no worse on a philosophical level than using a
Web browser to make DNS changes through their similarly proprietary
dashboard site.
-- 
Jeremy Stanley



Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Ricardo Carrillo Cruz
In my mind, I thought set_dns would really be an Ansible wrapper around the
system-config launch/dns.py script.

But yeah, switching between the latest DevStack and the old one on an
Apache reverse proxy works too.
The workflow would be similar to what I depicted.

I think the biggest issue is that DevStack really gives a lot of problems
when you try to stack/unstack, so long-lived
servers are asking for trouble here.

2016-08-01 16:36 GMT+02:00 Jeremy Stanley:

> On 2016-08-01 16:08:49 +0200 (+0200), Ricardo Carrillo Cruz wrote:
> [...]
> > The set DNS task would check a file on the puppetmaster which contains the
> > state of blue/green DNS records (translate-latest.openstack.org pointing to
> > translate_a and translate-soon-to-be-deleted.openstack.org pointing to
> > translate_b or vice versa) and would only run in case any of the preceding
> > create_server tasks did anything.
> [...]
>
> Problem is we can't (okay, shouldn't) automate DNS changes while
> we're relying on Rackspace's DNS service, since it's not using a
> standard OpenStack API and we really don't want to write additional
> tooling to it.
>
> As mentioned in my earlier E-mail, a simple alternative is to just
> update a HTTP 302 (temporary) redirect or a rewrite/proxy to the
> "live" deployment in an Apache vhost on static.openstack.org or
> perhaps update a persistent haproxy pool. Proxying rather than
> redirecting probably makes the most sense as we can avoid presenting
> IP-address-based URLs to the consumer (and if we're forced to deploy
> with TLS then we might be able to stabilize a solution for that at
> the proxy too).
> --
> Jeremy Stanley


Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Jeremy Stanley
On 2016-08-01 16:08:49 +0200 (+0200), Ricardo Carrillo Cruz wrote:
[...]
> The set DNS task would check a file on the puppetmaster which contains the
> state of blue/green DNS records (translate-latest.openstack.org pointing to
> translate_a and translate-soon-to-be-deleted.openstack.org pointing to
> translate_b or vice versa) and would only run in case any of the preceding
> create_server tasks did anything.
[...]

Problem is we can't (okay, shouldn't) automate DNS changes while
we're relying on Rackspace's DNS service, since it's not using a
standard OpenStack API and we really don't want to write additional
tooling to it.

As mentioned in my earlier E-mail, a simple alternative is to just
update a HTTP 302 (temporary) redirect or a rewrite/proxy to the
"live" deployment in an Apache vhost on static.openstack.org or
perhaps update a persistent haproxy pool. Proxying rather than
redirecting probably makes the most sense as we can avoid presenting
IP-address-based URLs to the consumer (and if we're forced to deploy
with TLS then we might be able to stabilize a solution for that at
the proxy too).
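
A minimal sketch of the proxy variant, assuming Apache with mod_proxy
enabled. A small Python helper renders the vhost fragment, so switching
the "live" deployment means regenerating one file and reloading Apache.
The helper name render_vhost, the server name and the backend address
are placeholders, not anything that exists in system-config.

```python
def render_vhost(server_name: str, backend_url: str) -> str:
    """Return an Apache vhost that reverse-proxies to the live backend."""
    # Normalise to exactly one trailing slash so ProxyPass paths line up.
    backend = backend_url.rstrip("/") + "/"
    return "\n".join([
        "<VirtualHost *:80>",
        f"    ServerName {server_name}",
        f"    ProxyPass / {backend}",
        f"    ProxyPassReverse / {backend}",
        "</VirtualHost>",
        "",
    ])


# Placeholder values: a documentation-range IP and an example hostname.
print(render_vhost("translate-check.example.org", "http://203.0.113.10"))
```

Because consumers only ever see the ServerName, the IP-address-based
backend URL stays hidden behind the proxy.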
-- 
Jeremy Stanley



Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-08-01 Thread Ricardo Carrillo Cruz
How about something like a playbook that runs on puppetmaster periodically
doing something like this:

create_server_translate_a
create_server_translate_b
set_dns

The create_server_translate tasks would be idempotent, i.e. they won't leak
servers.
The set_dns task would check a file on the puppetmaster which contains the
state of the blue/green DNS records (translate-latest.openstack.org pointing
to translate_a and translate-soon-to-be-deleted.openstack.org pointing to
translate_b, or vice versa) and would only run if any of the preceding
create_server tasks actually did anything.

Then at $DAYS, we have a cron task that deletes whichever server is blue (or
green, none of those colors are my favorites :-) and swaps the roles, so that
A is blue and B is green or vice versa.

The main play above would then pick this up, create a new server and make
the necessary changes from the DNS perspective.
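
A minimal sketch of the blue/green state file mentioned above, in Python.
It only tracks which server currently holds which role and does not touch
DNS at all. The JSON layout, the key names and the helpers load_state and
swap_state are assumptions made for illustration.

```python
import json
from pathlib import Path

# Assumed initial roles; the key names are hypothetical.
DEFAULT_STATE = {"latest": "translate_a", "soon_to_be_deleted": "translate_b"}


def load_state(path: Path) -> dict:
    """Read the blue/green state, falling back to the default on first run."""
    if path.exists():
        return json.loads(path.read_text())
    return dict(DEFAULT_STATE)


def swap_state(path: Path) -> dict:
    """Swap which server is 'latest' and persist the result."""
    state = load_state(path)
    swapped = {
        "latest": state["soon_to_be_deleted"],
        "soon_to_be_deleted": state["latest"],
    }
    path.write_text(json.dumps(swapped, indent=2))
    return swapped
```

Calling swap_state twice returns to the original assignment, which keeps
the cron-driven rotation easy to reason about.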

Thoughts?

2016-07-25 19:05 GMT+02:00 Jeremy Stanley:

> On 2016-07-25 11:08:35 +0530 (+0530), Vipul Nayyar wrote:
> > Honestly, I was also thinking that using containers for implementing
> > blue/green deployment would be best for implementing minimal downtime. I
> > suggest having a basic run-through of this idea with the community over
> > tomorrow's irc meeting should be a good start.
>
> Waving containers at the problem doesn't really solve the
> fundamental issue at hand (we could just as easily use DNS or an
> Apache redirect to switch between virtual machines, possibly more
> easily since we already have existing mechanisms for deploying and
> replacing virtual machines). The issue that needs addressing first,
> I think, is how to get new DevStack deployments from master branch
> tip of all projects to work consistently at each rebuild interval
> or, more likely, to design a pattern that avoids replacing a working
> deployment with a broken one along with some means to find out that
> redeployment is failing so that it can effectively be troubleshot
> post-mortem.
> --
> Jeremy Stanley


Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-25 Thread Jeremy Stanley
On 2016-07-25 11:08:35 +0530 (+0530), Vipul Nayyar wrote:
> Honestly, I was also thinking that using containers for implementing
> blue/green deployment would be best for implementing minimal downtime. I
> suggest having a basic run-through of this idea with the community over
> tomorrow's irc meeting should be a good start.

Waving containers at the problem doesn't really solve the
fundamental issue at hand (we could just as easily use DNS or an
Apache redirect to switch between virtual machines, possibly more
easily since we already have existing mechanisms for deploying and
replacing virtual machines). The issue that needs addressing first,
I think, is how to get new DevStack deployments from master branch
tip of all projects to work consistently at each rebuild interval
or, more likely, to design a pattern that avoids replacing a working
deployment with a broken one along with some means to find out that
redeployment is failing so that it can effectively be troubleshot
post-mortem.
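
A minimal sketch of such a pattern, in Python: the candidate deployment
only replaces the current one if it passes a health check, and a failed
redeploy is logged rather than silently swapped in. The function name
choose_live and the is_healthy callback are hypothetical; what "healthy"
means is left to the caller.

```python
import logging
from typing import Callable

log = logging.getLogger("checksite")


def choose_live(current: str, candidate: str,
                is_healthy: Callable[[str], bool]) -> str:
    """Return the deployment that should serve traffic.

    The candidate is promoted only if it passes the health check.
    Otherwise the current deployment stays live and the failure is
    logged, leaving the broken candidate in place for inspection.
    """
    if is_healthy(candidate):
        log.info("promoting %s over %s", candidate, current)
        return candidate
    log.error("redeploy failed: %s is unhealthy, keeping %s",
              candidate, current)
    return current
```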
-- 
Jeremy Stanley



Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-24 Thread Vipul Nayyar
Hey Frank,

Honestly, I was also thinking that using containers for implementing
blue/green deployment would be the best way to minimize downtime. I
suggest that a basic run-through of this idea with the community at
tomorrow's IRC meeting would be a good start.

Regards
Vipul Nayyar


On Wed, Jul 20, 2016 at 8:15 PM, Frank Kloeker wrote:

> On 2016-07-11 14:59, Vipul Nayyar wrote:
>
>> Hey Elizabeth,
>>
>> I'd like to contribute. :-)
>>
>> I have some past deployment and Ops experience and I'm really
>> interested in building something of a blue green deployment system
>> here, to decrease the downtime. Although, I'm still going through the
>> infra related docs which I'm fairly new to, but with a little bit of
>> guidance early on, I'll be happy to take over some responsibilities
>> over time.
>>
>> Maybe a good place for me to start might be, to have a deep look at
>> the puppet module written by Frank and probably noting down the most
>> common errors that are encountered regularly. I'd like to hear more
>> concrete thoughts from the community about how to proceed on this, if
>> any.
>>
>
> Welcome Vipul,
>
> no big prefaces, I'd like the idea with blue/green deployment because we
> have to bridge downtime when DevStack is re-installing, requirement is once
> a week (day). And we have to pick a way return if DevStack installation
> failed. The reason for this is more DevStack specific because we want to
> use master branch with the newest changes.
> I have gained some experience with LXD container and want to push the
> topic a little bit forward. The draft of my idea is here:
> https://github.com/eumel8/translation_checksite/blob/container/translation_check_container.jpg
> There are 2 container with DevStack installation + translation checksite.
> In front of the container is some magic, called Watchdog for installing the
> stuff and guarding the installation. Traffic will be routed to the last
> available container version. Container installation is a little bit
> described here:
> http://docs.openstack.org/developer/devstack/guides/lxc.html But needs to
> adapt for LXD 2.0.
> And we have to persuade the infra team to provide 16.04 VM :-)
> Let me know what do you think.
>
> kind regards
>
> Frank


Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-20 Thread Frank Kloeker

On 2016-07-11 14:59, Vipul Nayyar wrote:
> Hey Elizabeth,
>
> I'd like to contribute. :-)
>
> I have some past deployment and Ops experience and I'm really
> interested in building something of a blue green deployment system
> here, to decrease the downtime. Although, I'm still going through the
> infra related docs which I'm fairly new to, but with a little bit of
> guidance early on, I'll be happy to take over some responsibilities
> over time.
>
> Maybe a good place for me to start might be, to have a deep look at
> the puppet module written by Frank and probably noting down the most
> common errors that are encountered regularly. I'd like to hear more
> concrete thoughts from the community about how to proceed on this, if
> any.


Welcome Vipul,

no long preface: I like the idea of blue/green deployment because we
have to bridge the downtime while DevStack is re-installing, and the
requirement is once a week (day). We also need a way to roll back if the
DevStack installation fails. The reason for this is mostly DevStack
specific, because we want to use the master branch with the newest changes.
I have gained some experience with LXD containers and want to push the
topic forward a little. The draft of my idea is here:
https://github.com/eumel8/translation_checksite/blob/container/translation_check_container.jpg
There are 2 containers, each with a DevStack installation plus the
translation checksite. In front of the containers is some magic, called
Watchdog, for installing the stuff and guarding the installation. Traffic
will be routed to the last available container version. Container
installation is partly described here:
http://docs.openstack.org/developer/devstack/guides/lxc.html but it needs
to be adapted for LXD 2.0.

And we have to persuade the infra team to provide a 16.04 VM :-)
Let me know what you think.

kind regards

Frank




Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-11 Thread Elizabeth K. Joseph
On Mon, Jul 11, 2016 at 5:59 AM, Vipul Nayyar wrote:
> Hey Elizabeth,
>
> I'd like to contribute. :-)
>
> I have some past deployment and Ops experience and I'm really interested in
> building something of a blue green deployment system here, to decrease the
> downtime. Although, I'm still going through the infra related docs which I'm
> fairly new to, but with a little bit of guidance early on, I'll be happy to
> take over some responsibilities over time.
>
> Maybe a good place for me to start might be, to have a deep look at the
> puppet module written by Frank and probably noting down the most common
> errors that are encountered regularly. I'd like to hear more concrete
> thoughts from the community about how to proceed on this, if any.

Thank you for volunteering to help!

The following is a quick rundown of how I've been testing Frank's
changes. It is also a bit of a crash course in how we test our
Puppet work, which is valuable to learn:

First of all, I was testing on public cloud instances with 8G of RAM
running Ubuntu 14.04, but I no longer have access to the one I was
using. I now test this on a local KVM instance with 8G of RAM.

As for testing itself, you'll want to follow our instructions for
"Making a change in Puppet" here:
http://docs.openstack.org/infra/system-config/sysadmin.html#making-a-change-in-puppet

Put something like the following in the local.pp:
http://paste.openstack.org/show/489372/

Before running the ./install_modules.sh command, apply Frank's
https://review.openstack.org/#/c/276466/ to your cloned
/root/system-config with git fetch, which will be something like, as
root:

cd system-config/
git fetch https://review.openstack.org/openstack-infra/system-config \
    refs/changes/66/276466/9 && git checkout FETCH_HEAD

...you can get this fetch link from Gerrit, at the top right of
https://review.openstack.org/#/c/276466/ where it says "Download" and
has a drop down menu with all the links.
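
For reference, the ref in the fetch command follows Gerrit's
refs/changes/NN/CHANGE/PATCHSET layout, where NN is the last two digits
of the change number. A minimal Python sketch that builds it (the helper
name gerrit_ref is made up for illustration):

```python
def gerrit_ref(change: int, patchset: int) -> str:
    """Build the Git ref Gerrit uses for a given change and patchset."""
    # The first path component is the change number modulo 100, zero-padded.
    return f"refs/changes/{change % 100:02d}/{change}/{patchset}"


print(gerrit_ref(276466, 9))  # refs/changes/66/276466/9
```

The patchset number changes whenever the review is updated, so copying
the link from the "Download" menu remains the safest way to get the
current one.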

Then you can continue with install_modules and the puppet apply command.

If your system or internet connection is on the slow side, you may
also need to bump the timeout in the checksite module, which I did in
this patch: https://review.openstack.org/#/c/337912/

Since I really could use the help running these tests and improving
the fault tolerance of this module, I really appreciate your effort.
Please mail the list here or grab me on IRC (I'm pleia2 in
#openstack-infra on freenode) if you need any help. Collecting error
messages folks run into here will help us too.

-- 
Elizabeth Krumbach Joseph || Lyz || pleia2



Re: [OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-11 Thread Vipul Nayyar
Hey Elizabeth,

I'd like to contribute. :-)

I have some past deployment and Ops experience and I'm really interested in
building something like a blue/green deployment system here, to decrease the
downtime. I'm still going through the infra-related docs, which
I'm fairly new to, but with a little bit of guidance early on I'll be
happy to take over some responsibilities over time.

A good place for me to start might be to take a deep look at the
puppet module written by Frank and to note down the most common
errors that are encountered regularly. I'd like to hear more concrete
thoughts from the community about how to proceed on this, if any.

Thanks
Vipul Nayyar


[OpenStack-Infra] Work toward a translations checksite and call for help

2016-07-06 Thread Elizabeth K. Joseph
Hi everyone,

I brought this up a few meetings ago, but I wanted to collect the
thoughts in one place to more easily get infra team input on the
status of work toward a translations checksite for the i18n team. As
some history, the i18n team wrote a specification a while back which
we approved, which folks can read for background:
http://specs.openstack.org/openstack-infra/infra-specs/specs/translation_check_site.html

The original assignees were mostly i18n people, and have been pulled
off to other things. As one of the primary infra liaisons with the
i18n team I've been pulled into helping, but my ability to help is
limited due to time and need for collaboration with some other infra
folks on some decisions. So here I am emailing the rest of the team
for help. We also wanted to move the private conversations about
roadblocks into the open so I don't continue to be a blocker here.

Over the past several months Frank Kloeker worked to write a
preliminary Puppet module for us in puppet-translation_checksite (now
merged) and he has an outstanding corresponding system-config patch:
https://review.openstack.org/#/c/276466/

As the spec outlines, the assumption was that we'd run this on a
long-lived server in some way, updating the translation strings
directly from Zanata daily, and re-installing DevStack once a week.
We've run into a few issues with this, which I'd appreciate some
thoughts about so I have some help evaluating how to move forward.

1. The Puppet module is really fragile. In theory it works, and Frank did
a good job with it. But almost every time I run it I run into another
problem. Sometimes it has to do with a DevStack error (there was a
known problem a couple times when I tried to run it), or trouble with
my environment (DevStack doesn't fail gracefully if a dependency is
not satisfied due to network timeout or whatnot) and sometimes it's
just a change in our infra that breaks things (yesterday it was an
unexpected problem with the puppet apt module).

The module itself doesn't yet have any recovery for any of this. If we
had DevStack running along well for a week, and it gets to the next
week and it fails to build, we're stuck with a broken system and no
notification that it's broken. We could spend time building fault
tolerance and build failure alerts into it, but I want to make sure
we're on the right track first.

2. We don't actually have a solution to run "new" DevStack once a
week. Some options:

 - The once a week rebuild is just known downtime for the checksite,
have a cron job to ./unstack and delete /opt/devstack?
 - Get to a place where we're auto-building new servers, and just
build a new one and swap DNS once a week once we know the new server
is also running properly, using something like a health script that must
pass
 - Something else?
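
As a rough illustration of the "health script that must pass" option, a
minimal Python sketch could treat the new server as healthy only if its
dashboard URL answers with HTTP 200. The function name dashboard_is_up,
the injectable opener parameter and the choice of a plain 200 check are
assumptions; a real check would probably also look at the page content.

```python
import urllib.request


def dashboard_is_up(url: str, timeout: float = 10.0,
                    opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib's URLError and HTTPError are OSError subclasses, so this
        # covers connection failures, timeouts and 4xx/5xx responses.
        return False
```

The opener argument exists only so the check can be exercised without a
network connection; by default it uses urllib.request.urlopen.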

3. It takes a long time to run DevStack's stack.sh, which this module
does. Current timeout is 3600 (1 hour), but I have to bump it up to
run it locally in my tests. Even at an hour, this will really gum up
the works if it's part of system-config and running alongside all our
other ansible+puppet runs, even if the building of DevStack is only
once a week. Is this acceptable to us?

4. While we will have i18n team members logging into the Horizon
interface to see the progress of their translations work (that's the
whole point), the translations checksite is essentially read-only and
we have a pretty good mechanism in place for spinning up daily
DevStack instances for all our tests. Maybe we should backpedal and
somehow leverage this tooling instead?
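
On the stack.sh timeout raised in point 3: one way to keep a slow build
from holding up other runs would be to launch it as its own job with a
hard timeout. A minimal Python sketch, assuming the build is started as
a separate command; the helper name run_with_timeout is hypothetical and
the stand-in command below is not DevStack's stack.sh.

```python
import subprocess
import sys


def run_with_timeout(cmd: list, timeout_seconds: float) -> str:
    """Run cmd and return 'ok', 'failed' or 'timeout' without raising."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        result = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


# Stand-in command; 3600 seconds matches the current module timeout.
status = run_with_timeout([sys.executable, "-c", "pass"], 3600)
```

Distinguishing "timeout" from "failed" gives a build failure alert
something more specific to report.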

Thanks everyone.

-- 
Elizabeth Krumbach Joseph || Lyz || pleia2
