Re: [openstack-dev] [tripleo][pre] removing default ssh rule from tripleo::firewall::pre

2018-07-13 Thread Lars Kellogg-Stedman
On Fri, Jul 13, 2018 at 07:47:17AM -0600, Alex Schultz wrote:
> I think we should update the default rule to allow access over the
> control plane, but there must be at least one rule whose existence we
> enforce, so that the deployment and update processes will continue to
> function.

That makes sense. I'll update the review with that change.

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] [tripleo][pre] removing default ssh rule from tripleo::firewall::pre

2018-07-12 Thread Lars Kellogg-Stedman
I've had a few operators complain about the permissive rule tripleo
creates for ssh.  The current alternatives seem to be either disabling
tripleo firewall management completely, or moving from the default-deny
model to a set of rules that include higher-priority blacklist rules
for ssh traffic.

I've just submitted a pair of reviews [1] that (a) remove the default
"allow ssh from everywhere" rule in tripleo::firewall::pre and (b) add
a DefaultFirewallRules parameter to the tripleo-firewall service.

The default value for this new parameter is the same rule that was
previously in tripleo::firewall::pre, but now it can be replaced by an
operator as part of the deployment configuration.

For example, a deployment can include:

parameter_defaults:
  DefaultFirewallRules:
    tripleo.tripleo_firewall.firewall_rules:
      '003 allow ssh from internal networks':
        source: '172.16.0.0/22'
        proto: 'tcp'
        dport: 22
      '003 allow ssh from bastion host':
        source: '192.168.1.10'
        proto: 'tcp'
        dport: 22
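
An environment file containing the above would then be passed to the
deploy command in the usual way, e.g. (the filename here is made up):

  openstack overcloud deploy --templates -e firewall-rules.yaml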

[1] https://review.openstack.org/#/q/topic:feature/firewall%20(status:open%20OR%20status:merged)

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] [puppet][tripleo] Why is this acceptance test failing?

2018-07-04 Thread Lars Kellogg-Stedman
On Wed, Jul 04, 2018 at 07:51:20PM -0600, Emilien Macchi wrote:
> The actual problem is that the manifest isn't idempotent anymore:
> http://logs.openstack.org/47/575147/16/check/puppet-openstack-beaker-centos-7/3f70cc9/job-output.txt.gz#_2018-07-04_00_42_19_705516

Hey Emilien, thanks for taking a look. I'm not following -- or maybe
I'm just misreading the failure message.  It really looks to me as if
the failure is a regular-expression mismatch; it says:

  Failure/Error:
    apply_manifest(pp, :catch_changes => true) do |result|
      expect(result.stderr)
        .to include_regexp([/Puppet::Type::Keystone_tenant::ProviderOpenstack: Support for a resource without the domain.*using 'Default'.*default domain id is '/])
    end

And yet, the regular expression in that check clearly matches the
output shown in the failure message. What do you see that points at an
actual idempotency issue?

(I wouldn't be at all surprised to find an actual problem in this
change; I've fixed several already.  I'm just not sure how to turn
this failure into actionable information.)
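
(One detail worth noting for anyone following along: with
:catch_changes => true, beaker applies the manifest with
--detailed-exitcodes and accepts only exit code 0, so a second apply
that still makes changes fails the example independently of any
expectation on stderr.  A rough sketch of the convention -- not the
actual beaker implementation:)

  apply_manifest(pp, :catch_failures => true)  # first run: exit 0 or 2 is OK
  apply_manifest(pp, :catch_changes => true)   # second run: must be exit 0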

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] [puppet][tripleo] Why is this acceptance test failing?

2018-07-03 Thread Lars Kellogg-Stedman
I need another set of eyes.

I have a review that keeps failing here:

  http://logs.openstack.org/47/575147/16/check/puppet-openstack-beaker-centos-7/3f70cc9/job-output.txt.gz#_2018-07-04_00_42_19_696966

It's looking for the regular expression:

  /Puppet::Type::Keystone_tenant::ProviderOpenstack: Support for a resource without the domain.*using 'Default'.*default domain id is '/

The output shown in the failure message contains:

  Warning: Puppet::Type::Keystone_tenant::ProviderOpenstack:
  Support for a resource without the domain set is deprecated in
  Liberty cycle. It will be dropped in the M-cycle. Currently using
  'Default' as default domain name while the default domain id is
  '7ddf1dfa7fac46679ba7ae2245bece2f'.

The regular expression matches the text! The failing test is here:

  https://github.com/openstack/puppet-keystone/blob/master/spec/acceptance/default_domain_spec.rb#L59

I've been staring at this for a while and I'm not sure what's going
on.
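
(A throwaway check -- a sketch, using the single-line form of the
warning -- does confirm the match:)

  re = /Puppet::Type::Keystone_tenant::ProviderOpenstack: Support for a resource without the domain.*using 'Default'.*default domain id is '/
  warning = "Warning: Puppet::Type::Keystone_tenant::ProviderOpenstack: Support for a resource without the domain set is deprecated in Liberty cycle. It will be dropped in the M-cycle. Currently using 'Default' as default domain name while the default domain id is '7ddf1dfa7fac46679ba7ae2245bece2f'."
  puts(warning =~ re ? "matches" : "no match")   # => matches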

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] [Puppet] Requirements for running puppet unit tests?

2018-07-02 Thread Lars Kellogg-Stedman
On Thu, Jun 28, 2018 at 8:04 PM, Lars Kellogg-Stedman wrote:

> What is required to successfully run the rspec tests?


On the odd chance that it might be useful to someone else, here's the
Docker image I'm using to successfully run the rspec tests for
puppet-keystone:

  https://github.com/larsks/docker-image-rspec

Available on Docker Hub as larsks/rspec.
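
Invocation is presumably along these lines (the mount point and
interface here are assumptions -- check the repository README for the
actual usage):

  docker run --rm -it -v $PWD:/src larsks/rspec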

Cheers,

-- 
Lars Kellogg-Stedman 


Re: [openstack-dev] DeployArtifacts considered...complicated?

2018-06-28 Thread Lars Kellogg-Stedman
On Tue, Jun 19, 2018 at 05:17:36PM +0200, Jiří Stránský wrote:
> For the puppet modules specifically, we might also add another
> directory+mount into the docker-puppet container, which would be blank by
> default (unlike the existing, already populated /etc/puppet and
> /usr/share/openstack-puppet/modules). And we'd put that directory at the
> very start of modulepath. Then i *think* puppet would use a particular
> module from that dir *only*, not merge the contents with the rest of
> modulepath...

No, you would still have the problem that types/providers from *all*
available paths are activated, so if in your container you have
/etc/puppet/modules/themodule/lib/puppet/provider/something/foo.rb,
and you mount into the container
/container/puppet/modules/themodule/lib/puppet/provider/something/bar.rb,
then you end up with both foo.rb and bar.rb active and possibly
conflicting.

This only affects module lib directories. As Alex pointed out, puppet
classes themselves behave differently and don't conflict in this
fashion.

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] DeployArtifacts considered...complicated?

2018-06-28 Thread Lars Kellogg-Stedman
On Tue, Jun 19, 2018 at 10:12:54AM -0600, Alex Schultz wrote:
> -1 to more services. We take a Heat time penalty for each new
> composable service we add and in this case I don't think this should
> be a service itself.  I think for this case, it would be better suited
> as a host prep task than a defined service.  Providing a way for users
> to define external host prep tasks might make more sense.

But right now, the only way to define a host_prep_task is via a
service template, right?  What I've done for this particular case is
create a new service template that exists only to provide a set of
host_prep_tasks:

  https://github.com/CCI-MOC/rhosp-director-config/blob/master/templates/services/patch-puppet-modules.yaml
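
(For reference, the shape of such a template, trimmed to a sketch: the
service name and task are illustrative, and a real template also has
to declare the standard service-template parameters tripleo passes in.)

  heat_template_version: queens

  outputs:
    role_data:
      value:
        service_name: patch_puppet_modules
        host_prep_tasks:
          - name: apply local puppet-module patches
            shell: /usr/local/bin/patch-puppet-modules.sh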

Is there a better way to do this?

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] [Puppet] Requirements for running puppet unit tests?

2018-06-28 Thread Lars Kellogg-Stedman
Hey folks,

I'm looking for some guidance on how to successfully run rspec tests
for OpenStack puppet modules (specifically, puppet-keystone).  I
started with CentOS 7, but running the 'bundle install' command told
me:

  Gem::InstallError: public_suffix requires Ruby version >= 2.1.
  An error occurred while installing public_suffix (3.0.2), and Bundler cannot continue.
  Make sure that `gem install public_suffix -v '3.0.2'` succeeds before bundling.

So I tried it on my Fedora 28 system, and while the 'bundle install'
completed successfully, running `bundle exec rake lint` told me:

  $ bundle exec rake lint
  /home/lars/vendor/bundle/ruby/2.4.0/gems/puppet-2.7.26/lib/puppet/util/monkey_patches.rb:93: warning: constant ::Fixnum is deprecated
  rake aborted!
  NoMethodError: undefined method `<<' for nil:NilClass

...followed by a traceback.

So then I tried it on Ubuntu 18.04, and the bundle install fails with:

  Gem::RuntimeRequirementNotMetError: grpc requires Ruby version < 2.5, >= 2.0. The current ruby version is 2.5.0.
  An error occurred while installing grpc (1.7.0), and Bundler cannot continue.

And finally I tried Ubuntu 17.10.  The bundle install completed
successfully, but the 'rake lint' failed with:

  $ bundle exec rake lint
  /home/lars/vendor/bundle/ruby/2.3.0/gems/puppet-2.7.26/lib/puppet/defaults.rb:164: warning: key :queue_type is duplicated and overwritten on line 165
  rake aborted!
  can't modify frozen Symbol

What is required to successfully run the rspec tests?
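
(Judging from the errors above, the gem set wants a Ruby >= 2.1 but
< 2.5, which none of the distro defaults tried here provide.  One
workaround -- a sketch, assuming rbenv and ruby-build are installed --
is to pin an interpreter in that range:)

  rbenv install 2.4.4
  rbenv local 2.4.4
  gem install bundler
  bundle install --path vendor/bundle
  bundle exec rake spec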

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] [tripleo] Referring to the --templates directory?

2018-06-25 Thread Lars Kellogg-Stedman
Is there a way to refer to the `--templates` directory when writing
service templates?  Existing service templates can use relative paths,
as in:

resources:

  ContainersCommon:
    type: ./containers-common.yaml

But if I'm writing a local service template (which I often do during
testing/development), I would need to use the full path to the
corresponding file:

  ContainersCommon:
    type: /usr/share/openstack-tripleo-heat-templates/docker/services/containers-common.yaml

But that breaks if I use another template directory via the
--templates option to the `openstack overcloud deploy` command.  Is
there a way to refer to "the current templates directory"?
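
(One partial workaround -- a sketch, not a real answer to the question
-- is to give the shared template a registry alias in an environment
file, so the absolute path lives in exactly one place:

  resource_registry:
    My::ContainersCommon: /usr/share/openstack-tripleo-heat-templates/docker/services/containers-common.yaml

and then refer to "type: My::ContainersCommon" from the local service
template.)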

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] [tripleo] 'overcloud deploy' doesn't restart haproxy (Pike)

2018-06-20 Thread Lars Kellogg-Stedman
I've noticed that when updating the overcloud with 'overcloud deploy',
the deploy process does not restart the haproxy containers when there
are changes to the haproxy configuration.

Is this expected behavior?

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] DeployArtifacts considered...complicated?

2018-06-19 Thread Lars Kellogg-Stedman
On Tue, Jun 19, 2018 at 02:18:38PM +0100, Steven Hardy wrote:
> Is this the same issue Carlos is trying to fix via
> https://review.openstack.org/#/c/494517/ ?

That solves part of the problem, but it's not a complete solution.
In particular, it doesn't solve the problem that bit me: if you're
changing puppet providers (e.g., replacing
provider/keystone_config/ini_setting.rb with
provider/keystone_config/openstackconfig.rb), you still have the old
provider sitting around causing problems because unpacking a tarball
only *adds* files.

> Yeah I think we've never seen this because normally the
> /etc/puppet/modules tarball overwrites the symlink, effectively giving
> you a new tree (the first time round at least).

But it doesn't, and that's the unexpected problem: if you replace the
/etc/puppet/modules/keystone symlink with a directory, then
/usr/share/openstack-puppet/modules/keystone is still there, and while
the manifests won't be used, the contents of the lib/ directory will
still be active.

> Probably we could add something to the script to enable a forced
> cleanup each update:
> 
> https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/deploy-artifacts.sh#L9

We could:

(a) unpack the replacement puppet modules into a temporary location,
    then

(b) for each module, rm -rf the target directory and then copy it
    into place

But! This would require deploy_artifacts.sh to know that it was
unpacking puppet modules rather than a generic tarball.
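
(A sketch of (a)+(b), assuming the tarball has been unpacked into
$tmpdir:)

  for mod in "$tmpdir"/*; do
      rm -rf "/etc/puppet/modules/$(basename "$mod")"
      cp -a "$mod" /etc/puppet/modules/
  done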

> This would have to be optional, so we could add something like a
> DeployArtifactsCleanupDirs parameter perhaps?

If we went with the above, sure.

> One more thought which just occurred to me - we could add support for
> a git checkout/pull to the script?

Reiterating our conversation in #tripleo, I think rather than adding a
bunch of specific functionality to the DeployArtifacts feature, it
would make more sense to add the ability to include some sort of
user-defined pre/post tasks, either as shell scripts or as ansible
playbooks or something.

On the other hand, I like your suggestion of just ditching
DeployArtifacts for a new composable service that defines
host_prep_tasks (or re-implementing DeployArtifacts as a composable
service), so I'm going to look at that as a possible alternative to
what I'm currently doing.

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] Puppet debugging help?

2018-06-18 Thread Lars Kellogg-Stedman
On Mon, Jun 18, 2018 at 11:31:08AM -0400, Mohammed Naser wrote:
> Hey Lars,
> 
> Do you have a full job that's running which shows those issues?

I don't. I have a local environment where I'm doing my testing.

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] Puppet debugging help?

2018-06-18 Thread Lars Kellogg-Stedman
Hey folks,

I'm trying to patch puppet-keystone to support multi-valued
configuration options (like trusted_dashboard).  I have a patch that
works, mostly, but I've run into a frustrating problem (frustrating
because it would seem to be orthogonal to my patches, which affect the
keystone_config provider and type).

During the initial deploy, running tripleo::profile::base::keystone
fails with:

  "Error: Could not set 'present' on ensure: undefined method `new'
  for nil:NilClass at
  /etc/puppet/modules/tripleo/manifests/profile/base/keystone.pp:274",
 
The line in question is:

  70: if $step == 3 and $manage_domain {
  71:   if hiera('heat_engine_enabled', false) {
  72:     # create these seperate and don't use ::heat::keystone::domain since
  73:     # that class writes out the configs
  74:     keystone_domain { $heat_admin_domain:
            ensure  => 'present',
            enabled => true
          }

The thing is, despite the error...it creates the keystone domain
*anyway*, and a subsequent run of the module will complete without any
errors.
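
(One generic aid here: re-running the apply with `puppet apply --trace
--debug` prints the full Ruby backtrace instead of the one-line error,
which usually identifies exactly which provider raised the nil.)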

I'm not entirely sure what the error is telling me, since *none* of
the puppet types or providers have a "new" method as far as I can see.
Any pointers you can offer would be appreciated.

Thanks!

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


[openstack-dev] DeployArtifacts considered...complicated?

2018-06-15 Thread Lars Kellogg-Stedman
I've been working on a series of patches to enable support for
keystone federation in tripleo.  I've been making good use of the
DeployArtifacts support for testing puppet modules...until today.

I have some patches that teach puppet-keystone about multi-valued
configuration options (like trusted_dashboard).  They replace the
keystone_config provider (and corresponding type) with ones that work
with the 'openstackconfig' provider (instead of ini_settings).  These
work great when I test them in isolation, but whenever I ran them as
part of an "overcloud deploy" I would get erroneous output.

After digging through the various layers I found myself looking at
docker-puppet.py [1], which ultimately ends up calling puppet like
this:

  puppet apply ... --modulepath=/etc/puppet/modules:/usr/share/openstack-puppet/modules ...

It's that --modulepath argument that's the culprit.  DeployArtifacts
(when using the upload-puppet-modules script) works by replacing the
symlinks in /etc/puppet/modules with the directories from your upload
directory.  Even though the 'keystone' module in /etc/puppet/modules
takes precedence when doing something like 'include ::keystone', *all
the providers and types* in lib/puppet/* in
/usr/share/openstack-puppet/modules will be activated.

So in this case -- in which I've replaced the keystone_config
provider -- we get the old ini_settings provider, and I don't get the
output that I expect.

The quickest workaround is to generate the tarball by hand and map the
modules onto /usr/share/openstack-puppet/modules...

  tar -cz -f patches/puppet-modules.tar.gz \
    --transform "s|patches/puppet-modules|usr/share/openstack-puppet/modules|" \
    patches/puppet-modules

...and then use upload-swift-artifacts:

  upload-swift-artifacts -f patches/puppet-modules.tar.gz

Done this way, I get the output I expect.
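
(It's worth a quick check that the --transform actually rewrote the
paths before uploading, e.g.:

  tar -tzf patches/puppet-modules.tar.gz | head

should list entries under usr/share/openstack-puppet/modules/.)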

[1]: 
https://github.com/openstack/tripleo-heat-templates/blob/master/docker/docker-puppet.py

-- 
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/


Re: [openstack-dev] [all][deployment][kolla][tripleo][osa] Service diagnostics task force

2017-09-19 Thread Lars Kellogg-Stedman
On Wed, Sep 13, 2017 at 7:45 PM, Michał Jastrzębski wrote:

> We would like to ask for a volunteer project team to join us and
> spearhead this effort.
I would certainly be interested in this effort.

-- 
Lars Kellogg-Stedman 


[openstack-dev] [tripleo] fluentd integration cleanup

2017-09-08 Thread Lars Kellogg-Stedman
[tl;dr] shardy, these are the fluentd changes we discussed.

The original fluentd integration in tripleo landed just before
service_config_settings, so it used some janky changes in
common/services.yaml in order to aggregate configuration information from
other services in the service chain.

With the availability of service_config_settings this is no longer
necessary.  I've started the work to clean up the fluentd integration to
use the new mechanism.

The work is being tracked in https://bugs.launchpad.net/tripleo/+bug/1715187
and you can find all the changes at
https://review.openstack.org/#/q/topic:bug/1715187. There are changes to
tripleo-heat-templates and to puppet-tripleo.

Everything is currently marked [WIP]. While the changes Work For Me, I know
that there is ongoing work to support the Pike release and these changes in
particular will conflict with some work that jbadiapa is doing for
containerized fluentd in Pike.  I'd like to make sure that Pike has settled
before landing these.

-- 
Lars Kellogg-Stedman 


Re: [openstack-dev] [tripleo] Logging in containerized services

2017-07-20 Thread Lars Kellogg-Stedman
On Wed, Jul 19, 2017 at 4:53 AM, Mark Goddard  wrote:

> Kolla-ansible went through this process a few years ago, and ended up with
> a solution involving heka pulling logs from files in a shared docker volume
> (kolla_logs)


That's basically the same solution that we're currently using.  I'm
specifically recommending a solution that moves away from tailing log files
and towards a /dev/log based logging interface (and I'm suggesting we use
rsyslog for gathering logs and shipping them to a remote point, because
the distribution packaging for that is very mature and there's a good
chance that it's already running, particularly in tripleo target
environments).

-- 
Lars Kellogg-Stedman 


[openstack-dev] [tripleo] Logging in containerized services

2017-07-18 Thread Lars Kellogg-Stedman
Our current model for logging in a containerized deployment has pretty much
everything logging to files in a directory that has been bind-mounted from
the host.  This has some advantages: primarily, it makes it easy for an
operator on the local system to find logs, particularly if they have had
some previous exposure to non-containerized deployments.

There is strong demand for a centralized logging solution.  We've got one
potential solution right now in the form of the fluentd service introduced
in Newton, but this requires explicit registration of log files for every
service.  I don't think it's an ideal solution, and I would like to explore
some alternatives.

Logging via syslog
==

For the purposes of the following, I'm going to assume that we're deploying
on an EL-variant (RHEL/CentOS/etc), which means (a) journald owns /dev/log
and (b) we're running rsyslog on the host and using the omjournal plugin to
read messages from journald.

If we bind mount /dev/log into containers and configure openstack services
to log via syslog rather than via files, we get the following for free:

- We get message-based rather than line-based logging.  This means that
multiline tracebacks are handled correctly.

- A single point of collection for logs.  If your host has been configured
to ship logs to a centralized collector, logs from all of your services
will be sent there without any additional configuration.

- We get per-service message rate limiting from journald.

- Log messages are annotated by journald with a variety of useful metadata,
including the container id and a high resolution timestamp.

- We can configure the syslog service on the host to continue to write
files into legacy locations, so an operator looking to run grep against
local log files will still have that ability.

- Rsyslog itself can send structured messages directly to an Elastic
instance (see the sketch below), which means that in many deployments we
would not require fluentd and its dependencies.

- This plays well in environments where some services are running in
containers and others are running on the host, because everything simply
logs to /dev/log.
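
As a sketch of the Elasticsearch point above (parameter names from
rsyslog's omelasticsearch module; the server name is made up):

  # /etc/rsyslog.d/elastic.conf
  module(load="omelasticsearch")
  action(type="omelasticsearch"
         server="elastic.example.com"
         serverport="9200"
         bulkmode="on")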

Logging via stdin/stdout
==

A common pattern in the container world is to log everything to
stdout/stderr.  This has some of the advantages of the above:

- We can configure the container orchestration service to send logs to the
journal or to another collector.

- We get a different set of annotations on log messages.

- This solution may play better with frameworks like Kubernetes that tend
to isolate containers from the host a little more than using Docker or
similar tools straight out of the box.

But there are some disadvantages:

- Some services only know how to log via syslog (e.g., swift and haproxy)

- We're back to line-based vs. message-based logging.

- It ends up being more difficult to expose logs at legacy locations.

- The container orchestration layer may not implement the same message rate
limiting we get with fluentd.

Based on the above, I would like to suggest exploring a syslog-based
logging model moving forward. What do people think about this idea? I've
started putting together a spec at https://review.openstack.org/#/c/484922/
and I would welcome your input.

Cheers,

-- 
Lars Kellogg-Stedman 


[openstack-dev] [tripleo] Implementing container healthchecks

2017-07-13 Thread Lars Kellogg-Stedman
We [1] have started work on implementing support for
https://blueprints.launchpad.net/tripleo/+spec/container-healthchecks in
tripleo-common.  I would like to describe the approach we're taking in the
short term, as well as explore some ideas for longer-term implementations.

## Our current approach

We will be implementing service health checks in the 'healthcheck'
directory of tripleo-common.  Once the checks are merged and available in
distribution packages, we will then
modify container-images/tripleo_kolla_template_overrides.j2 to activate
specific checks for containerized services.  A typical modification would
look something like:

  {% block nova_api_footer %}
  RUN install -D -m 755 /usr/share/tripleo-common/healthcheck/nova-api /openstack/healthcheck
  HEALTHCHECK CMD /openstack/healthcheck
  {% endblock %}

That copies the specific healthcheck command to /openstack/healthcheck, and
then configures docker to run that check using the HEALTHCHECK directive.
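
(What a given check does is service-specific.  A minimal sketch for an
API service -- the port below is the nova-api default, but the choice
of endpoint is an assumption, not the actual tripleo-common check:)

  #!/bin/sh
  # exit 0 if the API answers locally, non-zero otherwise
  exec curl -sf http://localhost:8774/ >/dev/null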

This approach has the advantage of keeping all the development work within
tripleo-common for now.

If you are unfamiliar with Docker's HEALTHCHECK feature:

Docker will run this command periodically inside the container, and will
expose the status reported by the script (0 - healthy, 1 - unhealthy) via
the Docker API.  This is visible in the output of 'docker ps', for example:

  $ docker ps
  ... STATUS ...
  Up 8 minutes (healthy)

Details at: https://docs.docker.com/engine/reference/builder/#healthcheck

## Looking to the future

Our initial thought was that moving forward, these checks could be
implemented through the Kolla project.  However, Martin André suggested
(correctly) that these checks would also be of interest outside of Kolla.
The thought right now is that at some point in the future, we would split
the checks out into a separate project to make them more generally
consumable.

## Reviews

You can see the proposed changes here:

- https://review.openstack.org/#/q/topic:bp/container-healthchecks+is:open

Specifically, the initial framework is provided in:

- https://review.openstack.org/#/c/483081/

And an initial set of checks is in:

- https://review.openstack.org/#/c/483104/

Please feel free to review and comment. While we are reasonably happy with the
solution proposed in this email, we are open to improvements.  Thanks for
your input!

[1] Initially, Dan Prince, Ian Main, Martin Mágr, Lars Kellogg-Stedman

-- 
Lars Kellogg-Stedman 


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Lars Kellogg-Stedman
On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:

> There are also some ideas forming around pulling the Ansible playbooks
> and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:

I think the closer we can come to "the operator runs ansible-playbook to
configure the overcloud" the better, but not because I think Ansible is
inherently a great tool: rather, I think the many layers of indirection in
our existing model make error reporting and diagnosis much more complicated
than it needs to be.  Combined with Puppet's "fail as late as possible"
model, this means that (a) operators waste time waiting for a deployment
that is ultimately going to fail but hasn't yet, and (b) when it does fail,
they need relatively intimate knowledge of our deployment tools to
backtrack through logs and find the root cause of the failure.

If we can offer a deployment mode that reduces the number of layers between
the operator and the actions being performed on the hosts I think we would
win on both fronts: faster failures and reporting errors as close as
possible to the actual problem will result in less frustration across the
board.

I do like Steve's suggestion of a split model where Heat is responsible for
instantiating OpenStack resources while Ansible is used to perform host
configuration tasks.  Despite all the work done on Ansible's OpenStack
modules, they feel inflexible and frustrating to work with when compared to
Heat's state-aware, dependency ordered deployments.  A solution that allows
Heat to output configuration that can subsequently be consumed by Ansible
-- either running manually or perhaps via Mistral for
API-driven-deployments -- seems like an excellent goal.  Using Heat as a
"front-end" to the process means that we get to keep the parameter
validation and documentation that is missing in Ansible, while still
following the Unix philosophy of giving you enough rope to hang yourself if
you really want it.

-- 
Lars Kellogg-Stedman 


[openstack-dev] [nova] Providing interface-scoped nameservers in network_data.json

2017-06-05 Thread Lars Kellogg-Stedman
While investigating a bug report against cloud-init ("why don't you put
nameservers in interface configuration files?"), I discovered that Nova
munges the information received from Neutron to take the network-scoped
nameserver entries and move them all into a global "services" section.

It turns out that people may actually want to preserve the information
about which interface is associated with a particular nameserver so that
the system can be configured to manage the resolver configuration as
interfaces are brought up/down.

I've proposed https://review.openstack.org/#/c/467699/ to resolve this
issue, which adds nameserver information to the "network" section.  This
*does not* remove the global "services" key, so existing code that expects
to find nameservers there will continue to operate as it does now.  This
simply exposes the information in an additional location where there is
more context available.
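
For illustration, the shape of the change (addresses made up) is that
each entry under "networks" gains its own services list alongside the
existing global one:

  {
    "networks": [
      {
        "id": "network0",
        "type": "ipv4",
        "services": [{"type": "dns", "address": "192.0.2.10"}]
      }
    ],
    "services": [{"type": "dns", "address": "192.0.2.10"}]
  }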

Thanks for looking,

-- Lars


[openstack-dev] [tripleo][tripleo-quickstart] Multiple parallel deployments on a single virthost

2017-03-03 Thread Lars Kellogg-Stedman
I've just submitted a slew of changes to tripleo-quickstart with the
ultimate goal of being able to spin up multiple openstack deployments
in parallel on the same target virthost.

The meat of the change is an attempt to clearly separate the virthost
from the undercloud; we had several tasks that worked only because the
user name (and working directory) happened to be the same in both
environments.

With these changes in place, I am able to rapidly deploy multiple
tripleo-deployments on a single virthost, each one isolated to a
particular user account.  I'm using a playbook that includes
just the libvirt/setup, undercloud-deploy, and overcloud-* roles.
This is extremely convenient for some of the work that I'm doing now.

This does require some pre-configuration on the virthost (each user
gets their own overcloud bridge) and in the quickstart (each user gets
their own undercloud_external_network_cidr).

- https://review.openstack.org/441559 modify basic test to not require quickstart-extras
- https://review.openstack.org/441560 use a non-default virthost_user for the basic test
- https://review.openstack.org/441561 restore support for multiple deployments on virthost
- https://review.openstack.org/441562 improve generated ssh configuration
- https://review.openstack.org/441563 derive overcloud_public_vip and overcloud_public_vip6
- https://review.openstack.org/441564 yaml cleanup and formatting
- https://review.openstack.org/441565 don't make ssh_config files executable
- https://review.openstack.org/441566 restrict bashate to files in repository
- https://review.openstack.org/441567 restrict pep8 to files in repository
- https://review.openstack.org/441568 fix ansible-lint error ANSIBLE0012
- https://review.openstack.org/439133 define non_root_group explicitly

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [Tripleo] FFE for tripleo collectd integration

2017-02-03 Thread Lars Kellogg-Stedman
On Thu, Feb 02, 2017 at 01:37:20PM -0500, Emilien Macchi wrote:
> Could you patch your THT patch to Depends-On the tripleo-ci patch, so
> we can see if the package gets installed and if there is no blocker we
> might have missed.

I've added the Depends-On to the review.  Will that be sufficient such
that it will use images generated with the dependent patch?
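
(For reference, the footer goes in the commit message, keyed on the
dependency's Change-Id, e.g.:

  Depends-On: I0123456789abcdef0123456789abcdef01234567

where the Change-Id here is obviously made up.)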

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [Tripleo] FFE for tripleo collectd integration

2017-02-02 Thread Lars Kellogg-Stedman
I would like to request a feature freeze exception for the collectd
composable service patch:

  https://blueprints.launchpad.net/tripleo/+spec/tripleo-opstools-performance-monitoring

The gerrit review implementing this is:

  https://review.openstack.org/#/c/411048/

The work on the composable service has been largely complete for several
weeks, but there were some complications in getting tripleo ci to
generate appropriate images.  I believe that a recently submitted patch
will resolve that issue and unblock our ci:

  https://review.openstack.org/#/c/427802/

Once the above patch has merged and new overcloud images are generated I
believe the ci for the collectd integration will pass.



Re: [openstack-dev] [tripleo] Adding a LateServices ResourceChain

2017-01-11 Thread Lars Kellogg-Stedman
> 2. Do the list manipulation in puppet, like we do for firewall rules
> 
> E.g see:
> 
> https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceilometer-api.yaml#L62
> 
> https://github.com/openstack/puppet-tripleo/blob/master/manifests/firewall/service_rules.pp#L32
> 
> This achieves the same logical result as the above, but it does the list
> manipulation in the puppet profile instead of t-h-t.
> 
> I think either approach would be fine, but I've got a slight preference for
> (1) as I think it may be more reusable in a future non-puppet world, e.g
> for container deployments etc where we may not always want to use puppet.
> 
> Open to other suggestions, but would either of the above solve your
> problem?

I went with (2), even though iteration in Puppet is a little funky.
Looking through the firewall rules implementation helped me
understand how the service_config_settings stuff works.
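
(By "funky" I mean Puppet 4's lambda-style iteration; a sketch of the
pattern, with illustrative names:)

  # turn an aggregated hash of rules into one resource per entry
  $service_rules.each |String $name, Hash $rule| {
    firewall { $name:
      * => $rule,
    }
  }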

You can see the updated implementation at:

- https://review.openstack.org/#/c/417509/ (puppet-tripleo)
- https://review.openstack.org/#/c/411048/ (t-h-t)

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [tripleo] Adding a LateServices ResourceChain

2017-01-03 Thread Lars Kellogg-Stedman
On Fri, Dec 23, 2016 at 02:21:00PM +, Steven Hardy wrote:
> I commented on the bug, I'm not sure about this as it seems to overlap with
> our service_config_settings interface, which IIRC landed slightly after
> your previous patches for opstools things, and potentially provides a
> cleaner way to approach this.

I'm not sure I see how to apply that, but let me further describe the
use case and you can perhaps point me in the right direction.

> Perhaps you can point to some examples of this usage, then we can compare
> with the service_config_settings approach?
> 
> I suspect the main difference is you need to append data for each service
> to e.g the collectd configuration?

Let's take the existing Fluentd support as an example.  We want the
ability for every service to provide a logging source configuration for
Fluentd, which will get aggregated into a logging_sources
list and then ultimately used in puppet-tripleo to populate a series
of ::fluentd::config resources.

Currently, the aggregation happens in a combination of
puppet/services/services.yaml (which aggregates the logging_source
attribute from all the services in the service chain) and in
overcloud.j2.yaml (which actually instantiates the hiera data).

With the LateServiceChain I've proposed, this could all be isolated
inside the fluentd composable service: it would not be necessary to
expose any of this logic in either services.yaml or overcloud.j2.yaml,
leading.  Additionally, it would not require the fluentd service to
have any a priori knowledge of what services were in use; it would
simply aggregate any configuration information that is provided in the
primary service chain.  It would also allow us to get rid of the
"LoggingConfiguration" resource, which only exists as a way to expose
certain parameter_defaults inside services.yaml.
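
(Concretely, the per-service data being aggregated looks roughly like
this in a service template's role_data output -- the nova values are
illustrative:)

  outputs:
    role_data:
      value:
        service_name: nova_api
        logging_source:
          tag: openstack.nova.api
          path: /var/log/nova/nova-api.log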

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [tripleo] Adding a LateServices ResourceChain

2016-12-22 Thread Lars Kellogg-Stedman
I've been working along with a few others on some "opstools"
composable services for TripleO: that is, services that provide
things like centralized logging, performance monitoring, or
availability/health monitoring.

We've been running into a persistent problem with TripleO's
architecture: our services in many cases need configuration
information to be provided by other services in the stack, but there's
no way to introspect this data from inside a composable service
template.  This has led to some rather messy and invasive changes to
things like puppet/services/services.yaml.

In https://review.openstack.org/#/c/413748/, I've proposed the
addition of a secondary chain of services called LateServiceChain.
This is, like the existing ServiceChain resource, just a Heat
ResourceChain.  Unlike the existing ServiceChain, it receives an
additional "RoleData" parameter that contains the role_data outputs
from all the services realized in ServiceChain.

This permits composable services in the LateServices chain to access
per-service configuration information provided by the other services,
leading to much cleaner implementations of these auxiliary services.

I am attempting to use this right now for a collectd composable
service implementation, but this model would ultimately allow us to
remove several of the changes made in services.yaml to support Sensu
and Fluentd and put them back into the appropriate composable service
templates.

I'd appreciate your feedback on this idea.  Thanks!

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [tripleo][heat] Hacky validation of get_param

2016-12-22 Thread Lars Kellogg-Stedman
Maybe my fingers are flubbier than most, but I keep running into this
problem:

1. I have a typo in a get_param call somewhere in a composable service
   template.
2. Tripleo validation fails, but produces an error for a higher level
   {role.name}ServiceChain template because it ultimately wasn't able
   to resolve the outputs of the nested composable service template.

Because the error is identified at the wrong place and contains no
useful information about the root cause, it means you lose a bunch of
time manually inspecting templates for errors.

While there is a bug on this
(https://bugs.launchpad.net/heat/+bug/1599114), I wanted something now
that I could at least stick in my pre-commit hooks to identify really
obvious problems.  And here it is:

  https://github.com/larsks/heat-check-params

That will validate all calls to get_param in a template against the
parameter names defined in the 'parameters' section. It's not really
very pretty, but it has at least prevented me from committing some
changes with egregious spelling errors.
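
The core of the idea fits in a few lines of Python (a sketch, not the
actual tool):

  import sys, yaml

  def find_get_params(node, found):
      # walk the parsed template, collecting every get_param target
      if isinstance(node, dict):
          for key, value in node.items():
              if key == 'get_param':
                  # get_param takes a name or [name, key, ...]
                  name = value[0] if isinstance(value, list) else value
                  if isinstance(name, str):
                      found.add(name)
              else:
                  find_get_params(value, found)
      elif isinstance(node, list):
          for item in node:
              find_get_params(item, found)

  template = yaml.safe_load(open(sys.argv[1]))
  declared = set(template.get('parameters') or {})
  used = set()
  find_get_params(template, used)
  for name in sorted(used - declared):
      print('undeclared parameter: %s' % name)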

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [tripleo] [tripleo-quickstart] Tripleo-Quickstart root privileges

2016-12-01 Thread Lars Kellogg-Stedman
On Thu, Dec 01, 2016 at 09:03:30AM -0500, John Trowbridge wrote:
> 1. Doing tasks as root on the virthost makes clean up trickier. With the
> current model, deleting the non-root quickstart user cleans up almost
> everything. By keeping all of the root privilege tasks in the provision
> and environment roles, it is much easier to reason about the few things
> that do not get cleaned up when deleting the quickstart user. If we
> start allowing root privilege tasks in the libvirt role, this will be
> harder.
> 
> 2. Theoretically (I have not actually heard of anyone actually doing
>    this), someone could set up a virthost for use by quickstart, and
>    then...

The particular use case that inspired the current architecture was the
situation in which people did not want a random script from the
internet running with privileges on their system.

The existing model means that you can manually configure a host for
use by quickstart (installing libvirt, creating the necessary bridge
devices and permissions, etc), and then use quickstart exclusively as
a non-root user.

This is really nice for a number of reasons.  For example, I often
have multiple quickstart-provisioned environments on my virt host,
each associated with a particular user.  Being able to run everything
as a non-root user means that it's easy to keep these separate, and
that I won't accidentally break one environment because of a typo or
something (because my "master tripleo" user is not able to modify the
environment of my "rdo release" user).

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [tripleo] fluentd client composable service (request for review)

2016-08-16 Thread Lars Kellogg-Stedman
Howdy,

I'm working on a composable service to install the fluentd client
across the overcloud (and provide appropriate configuration files to
pull in relevant openstack logfiles).

There are two* reviews pending right now:

- in tripleo-heat-templates: https://review.openstack.org/#/c/353506
- in puppet-tripleo: https://review.openstack.org/#/c/353507

I am looking for someone on the tripleo team to take a quick look at how
this is laid out and give a thumbs-up or thumbs-down on the current
design.

Thanks,

* There is also a corresponding spec which should be posted soon.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [TripleO] additional git repo(s) for tripleo-quickstart

2016-08-10 Thread Lars Kellogg-Stedman
On Wed, Aug 10, 2016 at 03:26:18PM -0400, Wesley Hayutin wrote:
> I'm proposing the creation of a repo called tripleo-quickstart-extras that
> would contain some or all of the current third party roles used with
> TripleO-Quickstart.

Which roles in particular would you place in this -extras repository?
One of our goals in moving roles *out* of the quickstart was to move
them into a one-repository-per-role model that makes things easily
composable (install only those roles you need) and that
compartmentalizes related sets of changes.

Is this just a convenience for a bunch of roles that are typically
installed together?

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [TripleO] Proposing Gabriele Cerami for tripleo-quickstart core

2016-07-18 Thread Lars Kellogg-Stedman
On Mon, Jul 18, 2016 at 11:06:53AM -0400, John Trowbridge wrote:
> I would like to propose Gabriele (panda on IRC), for tripleo-quickstart
> core. He has worked on some pretty major features for the project
> (explicit teardown, devmode), and has a good understanding of the code base.

+2 from me.  Gabriele has done some good work and he has put up with
my reviews :).

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] Nova scheduler startup when database is not available

2015-12-23 Thread Lars Kellogg-Stedman
I've been looking into the startup constraints involved when launching
Nova services with systemd using Type=notify (which causes systemd to
wait for an explicit notification from the service before considering
it to be "started").  Some services (e.g., nova-conductor) will happily
"start" even if the backing database is currently unavailable (and
will enter a retry loop waiting for the database).

Other services -- specifically, nova-scheduler -- will block waiting
for the database *before* providing systemd with the necessary
notification.

nova-scheduler blocks because it wants to initialize a list of
available aggregates (in scheduler.host_manager.HostManager.__init__),
which it gets by calling objects.AggregateList.get_all.

Does it make sense to block service startup at this stage?  The
database disappearing during runtime isn't a hard error -- we will
retry and reconnect when it comes back -- so should the same situation
at startup be a hard error?  As an operator, I am more interested in
"did my configuration files parse correctly?" at startup, and would
generally prefer the service to start (and permit any dependent
services to start) even when the database isn't up (because that's
probably a situation of which I am already aware).

It would be relatively easy to have the scheduler lazy-load the list
of aggregates on first reference, rather than at __init__.  I'm not
familiar enough with the nova code to know if there would be any
undesirable implications of this behavior.  We're already punting
initializing the list of instances to an asynchronous task in order to
avoid blocking service startup.
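
A sketch of the lazy-load idea (names approximate; not the actual nova
code):

  class HostManager(object):
      def __init__(self):
          # was: self.aggregates = objects.AggregateList.get_all(ctxt)
          self._aggregates = None

      @property
      def aggregates(self):
          if self._aggregates is None:
              self._aggregates = objects.AggregateList.get_all(self._context)
          return self._aggregates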

Does it make sense to permit nova-scheduler to complete service
startup in the absence of the database (and then retry the connection
in the background)?

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] [Magnum] Continuing with heat-coe-templates

2015-07-15 Thread Lars Kellogg-Stedman
On Sun, Jul 05, 2015 at 12:00:55AM +, Steven Dake (stdake) wrote:
> Lars had a repo he maintains.  Magnum had a repo it maintained.  We
> wanted one source of truth.  The deal was we would merge all the
> things into heat-coe-templates, delete larsks/heat-kubernetes and
> delete the magnum templates.  Then there would be one source of
> truth.

I apologize for being out of the loop for a bit; I was stuck out at a
customer site for a while.

I created the heat-coe-templates project at the request of sdake
because it sounded as if (a) magnum wanted to make use of the
templates and have them in a location where there was a better
workflow for submitting and reviewing patches, and (b) magnum wanted
to take the templates in a different direction (with support for other
scheduling engines, etc).

After creating it, there was no activity on it so I stopped paying
attention for a while.  If folks want to use it, we should set up some
additional maintainers and go for it.

I'm going to continue maintaining my own repository as a
strictly-for-kubernetes tool.  I had to make a number of changes to it
recently in order to support a demo at the recent summit, and I am
happy to contribute some of these upstream.

In conclusion: I have very little skin in this game.  I am happy for
folks to make use of the templates if they are useful, and I am
totally happy to let other folks manage the heat-coe-templates
project and take it in a direction completely different from where
things are now.

I leave the decision about where things are going to someone who has a
more vested interest in the resolution.

Cheers,

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [magnum] heat-kubernetes is dead, long live heat-coe-templates

2015-03-31 Thread Lars Kellogg-Stedman
Hello folks,

Late last week we completed the migration of
https://github.com/larsks/heat-kubernetes into stackforge, where you
can now access it as:

  https://github.com/stackforge/heat-coe-templates/

Bug reports can be filed in launchpad:

  https://bugs.launchpad.net/heat-coe-templates/+filebug

GitHub pull requests against the original repository will no longer be
accepted; all changes now go through the Gerrit review process
that we all know and love.

The only check implemented right now is a basic YAML linting process
to ensure that files are syntactically correct.

Cheers,

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] Is yaml-devel still needed for Devstack

2015-03-27 Thread Lars Kellogg-Stedman
On Fri, Mar 27, 2015 at 09:01:12AM -0400, Adam Young wrote:
> I recently got Devstack to run on RHEL.  In doing so, I had to hack around
> the dependency on yaml-devel (I just removed it from devstack's required
> packages)
> 
> There is no yaml-devel in EPEL or the main repos for RHEL7.1/Centos7.

Fedora and CentOS (7) both have libyaml and libyaml-devel.  I wonder
if this is just a package naming issue in devstack?  libyaml-devel is
used by PyYAML to build C extensions, although PyYAML will also
operate without it.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





Re: [openstack-dev] Openstack Heat - OS::Heat::MultipartMime cannot be used as user_data for OS::Nova::Server

2015-01-23 Thread Lars Kellogg-Stedman
On Thu, Jan 22, 2015 at 04:09:09PM -0700, Vignesh Kumar wrote:
> I am new to heat orchestration and am trying to create a coreOS cluster
> with it. I have a OS::Heat::SoftwareConfig resource and a
> OS::Heat::CloudConfig resource and I have joined them both in a
> OS::Heat::MultipartMime resource which is then used as a user_data for a
> OS::Nova::Server resource. Unfortunately I am not able to see the
> configurations happening in my server resource using cloud-init...

If I take your template and use it to boot a Fedora system instead of
a CoreOS system, it works as intended.  Note that CoreOS does *not*
use the same cloud-init that everyone else uses, and it is entirely
possible that the CoreOS cloud-init does not support multipart MIME
user-data.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/





[openstack-dev] [kolla] Refactored heat-kubernetes templates

2015-01-02 Thread Lars Kellogg-Stedman
Hello Kolla folks (et al),

I've refactored the heat-kubernetes templates at
https://github.com/larsks/heat-kubernetes to work with Centos Atomic
Host and Fedora 21 Atomic, and to replace the homegrown overlay
network solution with Flannel.

These changes are available on the "master" branch.

The previous version of the templates, which worked with F20 and
included some Kolla-specific networking logic, is available in the
"kolla" branch:

  https://github.com/larsks/heat-kubernetes/tree/kolla

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] making Daneyon Hansen core

2014-10-22 Thread Lars Kellogg-Stedman
On Wed, Oct 22, 2014 at 08:04:24AM -0700, Steven Dake wrote:
> A few weeks ago in IRC we discussed the criteria for joining the core team
> in Kolla.  I believe Daneyon has met all of these requirements by reviewing
> patches along with the rest of the core team and providing valuable
> comments, as well as implementing neutron and helping get nova-networking
> implementation rolling.
> 
> Please vote +1 or -1 if your kolla core.  Recall a -1 is a veto.  It takes 3
> votes.  This email counts as one vote ;)

+1

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-16 Thread Lars Kellogg-Stedman
On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote:
> You just need to find the pid of a process in the container (perhaps using 
> docker inspect to go from container name -> pid) and then:
>  nsenter -t $pid -m -u -i -n -p -w

Note also that the 1.3 release of Docker ("any day now") will sport a
shiny new "docker exec" command that will provide you with the ability
to run commands inside the container via the docker client without
having to involve nsenter (or nsinit).

It looks like:

    docker exec <container> ps -fe

Or:

    docker exec -it <container> bash

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-15 Thread Lars Kellogg-Stedman
On Wed, Oct 15, 2014 at 01:50:08PM -0400, David Vossel wrote:
> Something like LSB init scripts except tailored towards the container use 
> case.
> The primary difference would be that the 'start' action of this new standard
> wouldn't fork. Instead 'start' would be pid 1. The 'status' could be checked
> externally by calling the exact same entry point script to invoke the 'status'
> function.

With the 1.3 release, which introduces "docker exec", you could just
about get there.  Rather than attempting to introspect the container
to find the entrypoint script -- which might not even exist -- I would
say standardize on some top level paths (e.g., '/status') that can be
used to run a status check, and leave the implementation of those
paths up to the image (maybe they're scripts, maybe they're binaries,
just as long as they are executable).

Then your check would boil down to:

  docker exec <container> /status

The reason why I am trying to avoid assuming some specially
constructed entrypoint script is that many images will simply not have
one -- they simply provide an initial command via CMD. Adding a
/status script or similar in this case is very simple:

   FROM original_image
   ADD status /status
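
The /status script itself can be trivial.  For a mysql image, it might
be no more than this (a sketch; use whatever health command fits the
service):

    #!/bin/sh
    # exit 0 if the server answers, non-zero otherwise
    exec mysqladmin ping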

Doing this via an entrypoint script would be a bit more complicated:

- You would have to determine whether or not the original image had an
  existing entrypoint script.
- If so you would need to wrap it or replicate the functionality.
- Some images may have entrypoint scripts that already provide
  "subcommand" like functionality (docker run someimage ,
  where  is parsed by the entrypoint script) and might not be
  compatible with an entrypoint-based status check.

Otherwise, I think establishing a "best practice" mechanism for executing
in-container checks is a great idea.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-15 Thread Lars Kellogg-Stedman
On Wed, Oct 15, 2014 at 07:52:56AM -0700, Vishvananda Ishaya wrote:
> There must be a standard way
> to do this stuff or people will continue to build fat containers with
> all of their pet tools inside. This means containers will just be
> another incarnation of virtualization.

I wouldn't spend time worrying about that.  "docker exec" will be the
standard way as soon as it lands in a release version, which I think
will be happening imminently with 1.3.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-14 Thread Lars Kellogg-Stedman
On Tue, Oct 14, 2014 at 04:06:22PM -0400, Jay Pipes wrote:
> I understand that general feeling, but system administration tasks like
> debugging networking issues or determining and grepping log file locations
> or diagnosing packaging issues for OpenStack services or performing database
> logfile maintenance and backups don't just go away because you're using
> containers, right?

They don't go away, but they're not necessarily things that you would
do inside your container.

Any state (e.g., database tables) that has a lifetime different from
that of your container should be stored outside of the container
proper.  In docker, this would be a "volume" (in a cloud environment,
this would be something like EBS or a Cinder volume).

Ideally, your container-optimized application logs to stdout/stderr.
If you have multiple processes, they each run in a separate container.

Backups take advantage of the data volumes you've associated with your
container.  E.g., spawning a new container using the docker
"--volumes-from" option to access that data for backup purposes.

If you really need to get inside a container for diagnostic purposes,
then you use something like "nsenter", "nsinit", or the forthcoming
"docker exec".


> they very much seem to be developed from the point of view of application
> developers, and not so much from the point of view of operators who need to
> maintain and support those applications.

I think it's entirely accurate to say that they are
application-centric, much like services such as Heroku, OpenShift,
etc.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-14 Thread Lars Kellogg-Stedman
On Tue, Oct 14, 2014 at 03:25:56PM -0400, Jay Pipes wrote:
> I think the above strategy is spot on. Unfortunately, that's not how the
> Docker ecosystem works.

I'm not sure I agree here, but again nobody is forcing you to use this
tool.

> operating system that the image is built for. I see you didn't respond to my
> point that in your openstack-containers environment, you end up with Debian
> *and* Fedora images, since you use the "official" MySQL dockerhub image. And
> therefore you will end up needing to know sysadmin specifics (such as how
> network interfaces are set up) on multiple operating system distributions.

I missed that part, but ideally you don't *care* about the
distribution in use.  All you care about is the application.  Your
container environment (docker itself, or maybe a higher level
abstraction) sets up networking for you, and away you go.

If you have to perform system administration tasks inside your
containers, my general feeling is that something is wrong.

> Sure, Docker isn't any more limiting than using a VM or bare hardware, but
> if you use the "official" Docker images, it is more limiting, no?

No more so than grabbing a virtual appliance rather than building a
system yourself.  

In other words: sure, it's less flexible, but possibly it's faster to
get started, which is especially useful if your primary goal is not
"be a database administrator" but is actually "write an application
that uses a database backend".

I think there are uses cases for both "official" and customized
images.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-14 Thread Lars Kellogg-Stedman
On Tue, Oct 14, 2014 at 02:45:30PM -0400, Jay Pipes wrote:
> With Docker, you are limited to the operating system of whatever the image
> uses.

See, that's the part I disagree with.  What I was saying about ansible
and puppet in my email is that I think the right thing to do is take
advantage of those tools:

  FROM ubuntu

  RUN apt-get update && apt-get -y install ansible
  COPY my_ansible_config.yaml /my_ansible_config.yaml
  RUN ansible-playbook -i localhost, -c local /my_ansible_config.yaml

Or:

  FROM fedora

  RUN yum -y install ansible
  COPY my_ansible_config.yaml /my_ansible_config.yaml
  RUN ansible-playbook -i localhost, -c local /my_ansible_config.yaml

Put the minimal instructions in your dockerfile to bootstrap your
preferred configuration management tool. This is exactly what you
would do when booting, say, a Nova instance into an openstack
environment: you can provide a shell script to cloud-init that would
install whatever packages are required to run your config management
tool, and then run that tool.

Once you have bootstrapped your cm environment you can take advantage
of all those distribution-agnostic cm tools.

In other words, using docker is no more limiting than using a vm or
bare hardware that has been installed with your distribution of
choice.

> [1] Is there an official MySQL docker image? I found 553 Dockerhub
> repositories for MySQL images...

Yes, it's called "mysql".  It is in fact one of the official images
highlighted on https://registry.hub.docker.com/.

> >I have looked into using Puppet as part of both the build and runtime
> >configuration process, but I haven't spent much time on it yet.
> 
> Oh, I don't think Puppet is any better than Ansible for these things.

I think it's pretty clear that I was not suggesting it was better than
ansible.  That is hardly relevant to this discussion.  I was only
saying that is what *I* have looked at, and I was agreeing that *any*
configuration management system is probably better than writing shell
scripts.

> How would I go about essentially transferring the ownership of the RPC
> exchanges that the original nova-conductor container managed to the new
> nova-conductor container? Would it be as simple as shutting down the old
> container and starting up the new nova-conductor container using things like
> --link rabbitmq:rabbitmq in the startup docker line?

I think that you would not necessarily rely on --link for this sort of
thing.  Under kubernetes, you would use a "service" definition, in
which kubernetes maintains a proxy that directs traffic to the
appropriate place as containers are created and destroyed.

Outside of kubernetes, you would use some other service discovery
mechanism; there are many available (etcd, consul, serf, etc).
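
With etcd, for example, the pattern is roughly this (hypothetical keys
and addresses):

    # the rabbitmq container (or a sidekick watching it) registers itself
    etcdctl set /services/rabbitmq 10.0.0.5:5672

    # and nova-conductor looks up the current address at startup
    etcdctl get /services/rabbitmq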

But this isn't particularly a docker problem.  This is the same
problem you would face running the same software on top of a cloud
environment in which you cannot predict things like ip addresses a
priori.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-14 Thread Lars Kellogg-Stedman
On Tue, Oct 14, 2014 at 12:33:42PM -0400, Jay Pipes wrote:
> Can I use your Dockerfiles to build Ubuntu/Debian images instead of only
> Fedora images?

Not easily, no.

> Seems to me that the image-based Docker system makes the
> resulting container quite brittle -- since a) you can't use configuration
> management systems like Ansible to choose which operating system or package
> management tools you wish to use...

While that's true, it seems like a non-goal.  You're not starting with
a virtual machine and a blank disk here, you're starting from an
existing filesystem.

I'm not sure I understand your use case enough to give you a more
useful reply.

> So... what am I missing with this? What makes Docker images more ideal than
> straight up LXC containers and using Ansible to control upgrades/changes to
> configuration of the software on those containers?

I think that, in general, Docker images are more shareable, and
the layered model makes building components on top of a base image
both easy and reasonably efficient in terms of time and storage.

I think that Ansible makes a great tool for managing configuration
inside Docker containers, and you could easily use it as part of the
image build process.  Right now, people using Docker are basically
writing shell scripts to perform system configuration, which is like a
20 year step back in time.  Using a more structured mechanism for
doing this is a great idea, and one that lots of people are pursuing.
I have looked into using Puppet as part of both the build and runtime
configuration process, but I haven't spent much time on it yet.

A key goal for Docker images is generally that images are "immutable",
or at least "stateless".  You don't "yum upgrade" or "apt-get upgrade"
in a container; you generate a new image with new packages/code/etc.
This makes it trivial to revert to a previous version of a deployment,
and clearly separates the "build the image" process from the "run the
application" process.

I like this model.
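
For example, reverting a bad deployment is just a matter of running the
previous image again (a sketch with hypothetical image tags):

    docker build -t myapp:v2 .
    docker rm -f myapp
    docker run -d --name myapp myapp:v2

    # if v2 misbehaves, roll back to the old image
    docker rm -f myapp
    docker run -d --name myapp myapp:v1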

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [kolla] Heat templates for kubernetes + docker

2014-10-14 Thread Lars Kellogg-Stedman
This came up briefly on the meeting yesterday, but I wanted to bring
it to a wider audience.

I know some folks out there are using the Heat templates I put
together for setting up a simple kubernetes environment.  I have
recently added support for the Gluster shared filesystem; you'll find
it in the "feature/gluster" branch:

  https://github.com/larsks/heat-kubernetes/tree/feature/gluster

Once everything is booted, you can create a volume:

  # gluster volume create mariadb replica 2 \
192.168.113.5:/bricks/mariadb 192.168.113.4:/bricks/mariadb
  volume create: mariadb: success: please start the volume to access data
  # gluster volume start mariadb
  volume start: mariadb: success

And then immediately access that volume under the "/gluster" autofs
mountpoint (e.g., "/gluster/mariadb").  You can use this in
combination with Kubernetes volumes to allocate storage to containers
that will be available on all of the minions.  For example, you could
use a pod configuration like this:

  desiredState:
    manifest:
      volumes:
        - name: mariadb-data
          source:
            hostDir:
              path: /gluster/mariadb
      containers:
        - env:
            - name: DB_ROOT_PASSWORD
              value: password
          image: kollaglue/fedora-rdo-mariadb
          name: mariadb
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: mariadb-data
              mountPath: /var/lib/mysql
      id: mariadb-1
      version: v1beta1
  id: mariadb
  labels:
    name: mariadb

With this configuration, you could kill the mariadb container, have it
created on another minion, and you would still have access to all the
data.

This is meant simply as a way to experiment with storage and
kubernetes.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-14 Thread Lars Kellogg-Stedman
On Tue, Oct 14, 2014 at 02:51:15PM +1100, Angus Lees wrote:
> 1. It would be good if the "interesting" code came from python sdist/bdists 
> rather than rpms.

I agree in principal, although starting from packages right now lets
us ignore a whole host of issues.  Possibly we'll hit that change down
the road.

> 2. I think we should separate out "run the server" from "do once-off setup".
> 
> Currently the containers run a start.sh that typically sets up the database, 
> runs the servers, creates keystone users and sets up the keystone catalog.  
> In 
> something like k8s, the container will almost certainly be run multiple times 
> in parallel and restarted numerous times, so all those other steps go against 
> the service-oriented k8s ideal and are at-best wasted.

All the existing containers [*] are designed to be idempotent, which I
think is not a bad model.  Even if we move initial configuration out
of the service containers I think that is a goal we want to preserve.

I pursued exactly the model you suggest on my own when working on an
ansible-driven workflow for setting things up:

  https://github.com/larsks/openstack-containers

Ansible made it easy to support one-off "batch" containers which, as
you say, aren't exactly supported in Kubernetes.  I like your
(ab?)use of restartPolicy; I think that's worth pursuing.

[*] That work, which includes rabbitmq, mariadb, keystone, and glance.

> I'm open to whether we want to make these as lightweight/independent as 
> possible (every daemon in an individual container), or limit it to one per 
> project (eg: run nova-api, nova-conductor, nova-scheduler, etc all in one 
> container).

My goal is one-service-per-container, because that generally makes the
question of process supervision and log collection a *host* problem
rather than a *container* problem. It also makes it easier to scale an
individual service, if that becomes necessary.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] kolla, keystone, and endpoints (oh my!)

2014-10-07 Thread Lars Kellogg-Stedman
On Tue, Oct 07, 2014 at 06:14:22PM +1100, Angus Lees wrote:
> What you haven't stated here is whether the catalog endpoints should be 
> reachable outside the kubernetes minions or not.

I thought I had been clear about that, but reading over the email it
looks like the part where I made that clear was actually already in
my head.

This solution was meant primarily to facilitate pod-to-pod
communication, particularly in cases where a service ends up moving
from one minion to another.

I agree that
https://github.com/GoogleCloudPlatform/kubernetes/issues/1161 needs to
land for a kubernetes-only solution to external access.  Without that,
public urls will need to come through some sort of load balancer
solution or something.  I haven't really thought about that in any
detail at this point.

> Perhaps we could even use this mysterious(*) keystone publicURL/internalURL 
> division to publish different external and kubernetes-only versions, since we 
> can presumably always do more efficient communication pod<->pod.

That is pretty much exactly what I am doing.

> 1.  Fixed hostname
> 
> Add something like this to the start.sh wrapper script:
>  echo $SERVICE_HOST proxy >> /etc/hosts
> and then use http://proxy:$port/... etc as the endpoint in keystone catalog.

This was one of my first thoughts, but according to the
#google_containers folks, SERVICE_HOST is going away real soon now:

  https://github.com/GoogleCloudPlatform/kubernetes/pull/1402

There will be a per-service ip available, so we could still do
something similar.

> Create a regular OpenStack loadbalancer and configure this (possibly publicly 
> available) IP in keystone catalog.
> 
> I _think_ this could even be a loadbalancer controlled by the neutron we just 
> set up, assuming the the loadbalancer HA works without help and the nova<-
> >neutron "bootstrap" layer was setup using regular k8s service env vars and 
> not the loadbalancer IPs.

There's no guarantee that we're running Kubernetes on top of
openstack, and I don't think we could use Neutron deployed inside
kubernetes because we'd want the LB in place for basic services like
keystone and the database server.

> In case it needs to be said, I think we should watch discussions like 
> https://github.com/GoogleCloudPlatform/kubernetes/issues/1161 and try to 
> follow the "standard" kubernetes approaches as they emerge.

Yup, I think that is definitely the way to go.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [kolla] kolla, keystone, and endpoints (oh my!)

2014-10-06 Thread Lars Kellogg-Stedman
Hello all,

I wanted to expand on a discussion we had in #tripleo on Friday
regarding Kubernetes and the Keystone service catalog.

## The problem

The essential problem is that Kubernetes and Keystone both provide
service discovery mechanisms, and they are not aware of each other.

Kubernetes provides information about available services through
Docker environment variables, whereas Keystone maintains a list of API
endpoints in a database.

When you configure the keystone endpoints, it is tempting to simply
use the service environment variables provided by Kubernetes, but this
is problematic: the variables point at the host-local kube-proxy
instance.  This is a problem right now because if that host were to
go down, that endpoint would no longer be available.  That will be a
problem in the near future because the proxy address will soon be
pod-local, and thus inaccessible from other pods (even on the same
host).

One could instrument container start scripts to replace endpoints in
keystone when the container boots, but if a service has cached
information from the catalog it may not update in a timely fashion.

## The (?) solution

I spent some time this weekend experimenting with setting up a
pod-local proxy that takes all the service information provided by
Kubernetes and generates an haproxy configuration (and starts haproxy
with that configuration):

- https://github.com/larsks/kolla/tree/larsks/hautoproxy/docker/hautoproxy

This greatly simplifies the configuration of openstack service
containers: in all cases, the "remote address" of another service will
be at http://127.0.0.1/, so you can simply configure that address into
the keystone catalog.
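
The generator itself is simple; its heart is something along these
lines (a sketch, not the actual implementation):

    # emit an haproxy listener on 127.0.0.1 for every
    # FOO_SERVICE_HOST/FOO_SERVICE_PORT pair in the environment
    env | sed -n 's/_SERVICE_HOST=.*//p' | while read svc; do
        host=$(printenv ${svc}_SERVICE_HOST)
        port=$(printenv ${svc}_SERVICE_PORT)
        printf 'listen %s 127.0.0.1:%s\n    server %s %s:%s\n' \
            "$svc" "$port" "$svc" "$host" "$port"
    done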

It requires minimal configuration: you simply add the "hautoproxy"
container to your pod.

This seems to do the right thing in all situations: if a pod is
rescheduled on another host, the haproxy configuration will pick up
the appropriate service environment variables for that host, and
services inside the pod will continue to use 127.0.0.1 as the "remote"
address.

If you use the .json files from
https://github.com/larsks/kolla/tree/larsks/hautoproxy, you can see
this in action.  Specifically, if you start the services for mariadb,
keystone, and glance, and then start the corresponding pods, you will
end up with functional keystone and glance services.

Here's a short script that will do just that:

#!/bin/sh

for x in glance/glance-registry-service.json \
glance/glance-api-service.json \
keystone/keystone-public-service.json \
keystone/keystone-admin-service.json \
mariadb/mariadb-service.json; do
  kubecfg -c $x create services
done

for x in mariadb/mariadb.json \
keystone/keystone.json \
glance/glance.json; do
  kubecfg -c $x create pods
done

With this configuration running, you can kill the keystone pod and allow
Kubernetes to reschedule it and glance will continue to operate correctly.

You cannot kill either the glance or mariadb pods because we do not
yet have a solution for persistent storage.

I will be cleaning up these changes and submitting them for review...but
probably not today due to an all-day meeting.

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [kolla] Diagnosing problems with containers

2014-09-26 Thread Lars Kellogg-Stedman
As people are starting to look at Kubernetes and Docker, there have
been a number of questions regarding how to diagnose problems with
containers.  Here are a few useful hints:

## Docker logs

Docker captures stdout/stderr from the main process and makes this
available via the `docker logs` command.  That is, if your Dockerfile
looks like this:

FROM fedora
RUN yum -y install mariadb; yum clean all
CMD mysql --this-is-a-terrible-idea

And if you were to have Kubernetes launch an image built from that file
and wanted to diagnose why it wasn't working, you could run `docker
logs <container>` and see:

mysql: unknown option '--this-is-a-terrible-idea'

## Using 'nsenter'

The `nsenter` command is available in recent util-linux packages.  It
allows you to run commands inside existing namespaces.  

A useful shortcut is to place the following inside a script and call
it "docker-enter":

#!/bin/sh
# enter the mount, uts, ipc, network, and pid namespaces of the
# container named on the command line, then run the given command
# (or an interactive shell if none is given)
CONTAINER=$1; shift
nsenter -t $(docker inspect \
  --format '{{ .State.Pid }}' $CONTAINER) \
  -m -u -i -n -p -w "$@"

Now you can run `docker-enter <container>` to start a shell
inside the specified container.

Once inside the container, you may want to see the environment
variables passed to PID 1 to ensure that service discovery is
operating correctly.  You can do that via:

tr '\000' '\012' < /proc/1/environ

You can of course inspect anything on the filesystem, although ideally
your application is logging to stdout/stderr and not to local files.
If you know your container's ENTRYPOINT and CMD entries, you can run
those by hand to see exactly what is happening.
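
You can recover those entries with docker inspect:

    docker inspect --format '{{ .Config.Entrypoint }} {{ .Config.Cmd }}' \
        $CONTAINER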

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Heat] Should docker plugin remove containers on delete?

2014-09-01 Thread Lars Kellogg-Stedman
Hello all,

I recently submitted this change:

  https://review.openstack.org/#/c/118190/

This causes the Docker plugin to *remove* containers on delete,
rather than simply *stopping* them.  When creating named containers,
the "stop but do not remove" behavior would cause conflicts when try
to re-create the stack.
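
You can demonstrate the problem with nothing but the docker CLI (a
sketch with a made-up container name):

    docker run -d --name mydb fedora sleep 1d
    docker stop mydb
    docker run -d --name mydb fedora sleep 1d   # fails: name still in use
    docker rm mydb                              # removing frees the name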

Do folks have an opinion on which behavior is correct?

-- 
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] avahi-autoipd vs. nova networking (cloud-init)

2014-03-31 Thread Lars Kellogg-Stedman
On Sat, Mar 29, 2014 at 11:53:13AM -0400, Mike Spreitzer wrote:
> I run into trouble in Ubuntu VMs when avahi-autoipd is installed. 
> After avahi-autoipd is installed, there is an extra route (number 2 in the 
> [...]
> Of course, avahi-autoipd thinks it is doing me a favor.  Nova thinks it is 
> doing me harm.  Which is right, and how do we achieve harmony?

Why are you installing avahi-autoipd in your cloud instance?  The
autoipd tool is used for configuring network interfaces in the absence
of either a static configuration or a functioning dhcp
environment...and because you're running in a cloud environment,
you're pretty much guaranteed the latter.

If you really want zeroconf networking to be functional inside your
instances while at the same time maintaining access to the OpenStack
metadata service, you could add an explicit route to the metadata
address via your default gateway.  For example, given:

# ip route
default via 10.0.0.1 dev eth0  metric 100 
10.0.0.0/24 dev eth0  proto kernel  scope link  src 10.0.0.4 
169.254.0.0/16 dev eth0  scope link  metric 1000 

I would add:

  ip route add 169.254.169.254 via 10.0.0.1

And this restores access to the metadata service.  This forces the
kernel to pass traffic to 169.254.169.254 to the gateway, rather than
assuming it's accessible via a local network.

-- 
Lars Kellogg-Stedman  | larsks @ irc
Cloud Engineering / OpenStack  | "   "  @ twitter



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat] Keystone errors trying to signal a WaitCondition

2014-01-30 Thread Lars Kellogg-Stedman
On Thu, Jan 30, 2014 at 09:46:09PM -0500, Lars Kellogg-Stedman wrote:
> I'm getting an error from Keystone whenever I try to signal a Heat
> WaitCondition...

After poking through the code it turned out to be an error in Keystone's
contrib/ec2/controllers.py...which was fixed in 949a2cdc.  Thanks,
shardy.

Cheers,

-- 
Lars Kellogg-Stedman  | larsks @ irc
Cloud Engineering / OpenStack  | "   "  @ twitter



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [heat] Keystone errors trying to signal a WaitCondition

2014-01-30 Thread Lars Kellogg-Stedman
I'm getting an error from Keystone whenever I try to signal a Heat
WaitCondition.  I'm trying to figure out if this is a bug or me simply
not understanding how everything is supposed to communicate.

When trying to signal a WaitConditionHandle,  I see in api-cfn.log
that Heat gets as far as:

DEBUG urllib3.connectionpool [-] "POST /v2.0/ec2tokens HTTP/1.1" 500 183 _make_request /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:330

At this point, Keystone logs the following:

ERROR keystone.common.wsgi [-] 'unicode' object has no attribute 'get'
TRACE keystone.common.wsgi Traceback (most recent call last):
TRACE keystone.common.wsgi   File "/usr/lib/python2.7/site-packages/keystone/common/wsgi.py", line 238, in __call__
TRACE keystone.common.wsgi     result = method(context, **params)
TRACE keystone.common.wsgi   File "/usr/lib/python2.7/site-packages/keystone/contrib/ec2/controllers.py", line 96, in authenticate
TRACE keystone.common.wsgi     creds_ref = self._get_credentials(credentials['access'])
TRACE keystone.common.wsgi   File "/usr/lib/python2.7/site-packages/keystone/contrib/ec2/controllers.py", line 237, in _get_credentials
TRACE keystone.common.wsgi     return self._convert_v3_to_ec2_credential(creds)
TRACE keystone.common.wsgi   File "/usr/lib/python2.7/site-packages/keystone/contrib/ec2/controllers.py", line 222, in _convert_v3_to_ec2_credential
TRACE keystone.common.wsgi     'access': blob.get('access'),
TRACE keystone.common.wsgi AttributeError: 'unicode' object has no attribute 'get'

The HOT template looks (partially) like this:

wait0_handle:
  type: AWS::CloudFormation::WaitConditionHandle

wait0:
  type: AWS::CloudFormation::WaitCondition
  properties:
    Handle: {get_resource: wait0_handle}
    Timeout: 1800

instance0:
  type: OS::Nova::Server
  properties:
    flavor: {get_param: flavor}
    image: {get_param: image}
    key_name: {get_param: key_name}
    networks:
      - port: {get_resource: instance0_eth0}
    user_data:
      str_replace:
        template: |
          #!/bin/sh

          cat > /root/wait-url.txt <<EOF
          wait_url
          EOF
        params:
          wait_url: {get_resource: wait0_handle}

The signed URL that Heat generates for the wait condition handle looks
like this:

http://192.168.200.1:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A28a490a259974817b88ce490a74df8d2%3Astacks%2Fs0%2F36b013ca-1e46-4340-bf0e-44a609ae6758%2Fresources%2Fwait0_handle?Timestamp=2014-01-31T02%3A09%3A16Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=cd6fbf4ebaed4ea1886ead9f98451f5a&SignatureVersion=2&Signature=9c2bvEYoedkm3uQwVOAcIA5xxy3x9q%2BO1KncY8Eeo%2BQ%3D

I've tried signaling this both using cfn-signal and using the
generated curl commandline directly.  I'm using a recent (sometime
this past week) Heat master, and Keystone 2013.2.1.

-- 
Lars Kellogg-Stedman  | larsks @ irc
Cloud Engineering / OpenStack  | "   "  @ twitter



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unified Guest Agent proposal

2013-12-16 Thread Lars Kellogg-Stedman
On Fri, Dec 13, 2013 at 11:32:01AM -0800, Fox, Kevin M wrote:
> I hadn't thought about that use case, but that does sound like it
> would be a problem.

That, at least, is not much of a problem, because you can block access
to the metadata via a blackhole route or similar after you complete
your initial configuration:

  ip route add blackhole 169.254.169.254 

This prevents access to the metadata unless someone already has root
access on the instance.
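
You can verify the route is doing its job (the curl timeout is
arbitrary):

    ip route show | grep 169.254.169.254
    curl -m 5 http://169.254.169.254/latest/meta-data/   # now fails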

-- 
Lars Kellogg-Stedman  | larsks @ irc
Cloud Engineering / OpenStack  | "   "  @ twitter



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] RFC - Icehouse logging harmonization

2013-10-24 Thread Lars Kellogg-Stedman
On Thu, Oct 24, 2013 at 07:05:19AM -0700, Dan Smith wrote:
> Some of them are not useful to me (but might be to others), like the
> amqp channel lines. However, everything else has been pretty crucial at
> one point or another when debugging issues that span between the two
> tightly-coupled services.

I am completely unfamiliar with the code, so I apologize if these are
dumb questions:

- Is everything making use of Python's logging module?
- Would this be a good use-case for that module's support of
  file-based configuration
  (http://docs.python.org/2/howto/logging.html#configuring-logging)

This would let a cloud deployer have much more granular control
over what log messages show up (and even what log messages go where).
For example, maybe I don't care about messages from
quantum.openstack.common.rpc.impl_qpid, and I generally only want to
log WARN and above, but I want to see DEBUG messages for
quantum.plugins.openvswitch.agent.ovs_quantum_agent.
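
That policy would look something like this in logging's fileConfig
format (a sketch; handler and formatter sections trimmed to the
minimum):

    [loggers]
    keys=root,ovs_agent

    [handlers]
    keys=console

    [formatters]
    keys=simple

    [logger_root]
    level=WARN
    handlers=console

    [logger_ovs_agent]
    level=DEBUG
    handlers=console
    qualname=quantum.plugins.openvswitch.agent.ovs_quantum_agent

    [handler_console]
    class=StreamHandler
    formatter=simple
    args=(sys.stderr,)

    [formatter_simple]
    format=%(asctime)s %(name)s %(levelname)s %(message)s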

Or is that Too Much Flexibility?

-- 
Lars Kellogg-Stedman 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Should packstack configure host network interfaces?

2013-10-10 Thread Lars Kellogg-Stedman
If you deploy OpenStack with Neutron using Packstack and do something
like this...

packstack ...  --neutron-ovs-bridge-interfaces=br-eth1:eth1

...packstack will happily add interface eth1 to br-eth1, but will
neither (a) ensure that it is up right now nor (b) ensure that it is
up after a reboot.  This is in contrast to Nova networking, which in
general takes care of bringing up the necessary interfaces at runtime.
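
Right now an operator has to do something like this by hand (a sketch
using RHEL-style ifcfg files and the openvswitch initscripts):

    cat > /etc/sysconfig/network-scripts/ifcfg-eth1 <<EOF
    DEVICE=eth1
    DEVICETYPE=ovs
    TYPE=OVSPort
    OVS_BRIDGE=br-eth1
    ONBOOT=yes
    BOOTPROTO=none
    EOF
    ifup eth1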

Should packstack set up the necessary host configuration to ensure
that the interfaces are up?  Or is this the responsibility of the
local administrator?

Thanks,

-- 
Lars Kellogg-Stedman 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev