I have a use case which is a bit different to most, but Ansible seems to do 
a pretty good job.
I'm trying to figure out the best way to leverage Ansible's inherent 
concurrency across multiple hosts when deploying to serverless 
infrastructure, which has no hosts. (Instead, it's just tasks making API 
calls from localhost.)

# The Task

I'm building, testing and deploying a project using Ansible.

My target infrastructure is all serverless. I'm deploying code to AWS 
Lambda functions, with CloudFormation. So there are no 'hosts' for Ansible 
to connect to.

I have something currently which works, but is slow and a bit messy.

I've got a big playbook, all connecting to one host, which is 'localhost' 
(connection=local).
I've got a YAML file with a list of my Lambda functions and their relevant 
config (e.g. timeout, name, comment, environment variables).


I've got one role for all my Lambda function stuff.

   1. Use `include_vars` to load the aforementioned variable file
   2. Using `with_dict` over the variable from that file, `pip install` the 
   modules each function depends on
   3. Copy the code for each Lambda function to where those dependencies 
   were installed, also with `with_dict`
   4. Run a unit test for each (`shell: python main.py`), also with `with_dict`
   5. Zip up each folder, also with `with_dict`
   6. Upload each zip to S3, using `with_nested` (for each Lambda function, 
   for each region)
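
For illustration, the upload step looks roughly like this (variable names 
like `lambda_functions` and `regions`, and the bucket name, are placeholders 
from my setup):

```yaml
# Sketch of step 6: upload each zip, for each function, for each region.
- name: Upload each Lambda zip to S3, per region
  aws_s3:
    bucket: "my-artifacts-{{ item.1 }}"          # placeholder bucket name
    object: "{{ item.0.value.name }}.zip"
    src: "{{ build_dir }}/{{ item.0.value.name }}.zip"
    mode: put
    region: "{{ item.1 }}"
  with_nested:
    - "{{ lambda_functions | dict2items }}"      # my per-function config dict
    - "{{ regions }}"
```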
   
In particular the uploading is slow, because it's in series. I'd like to do 
it in parallel.

Then I have other roles for other things which are not per-lambda. (e.g. 
deploying cloudformation templates, configuring a firewall etc.)

# The problem

Inside that role for Lambda stuff, almost every task has `with_dict` or 
`with_items`, or sometimes `with_nested` (e.g. upload each Lambda zip for each 
of multiple regions), in addition to extensive `when` conditions. (I made 
it so I can pass `-e only_lambda=x` to Ansible, to skip tasks for 
Lambdas I haven't changed.)
This ends up quite messy.
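
e.g. the zip step ends up looking something like this (the `archive` module 
and names here are just illustrative):

```yaml
# Sketch of one looped task plus the only_lambda skip condition.
- name: Zip up each function's folder
  archive:
    path: "{{ build_dir }}/{{ item.key }}"
    dest: "{{ build_dir }}/{{ item.key }}.zip"
    format: zip
  with_dict: "{{ lambda_functions }}"
  when: only_lambda is not defined or only_lambda == item.key
```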

Overall the whole thing is quite slow too, because it does everything in 
series, not in parallel. Some things are I/O bound and could be sped up a 
lot by doing them concurrently; e.g. I used `async` to do the upload in 
parallel.
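
That async pattern looks roughly like this (module and variable names are 
placeholders):

```yaml
# Kick off each upload in the background, then wait for them all.
- name: Start each upload without waiting
  aws_s3:
    bucket: my-artifacts                          # placeholder
    object: "{{ item.key }}.zip"
    src: "{{ build_dir }}/{{ item.key }}.zip"
    mode: put
  async: 300      # allow up to 5 minutes per upload
  poll: 0         # fire and forget
  register: upload_jobs
  with_dict: "{{ lambda_functions }}"

- name: Wait for all uploads to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 60
  delay: 5
  with_items: "{{ upload_jobs.results }}"
```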

Another thing that gets messy is that I want to keep that variable file 
with the Lambda config minimal. I don't want to copy-paste boilerplate for 
each one. (It's already too huge.)
So each Lambda has a field called 'Name'.
I want to add another field like: `local_zip_fname: "{{ build_dir }}/{{ 
self.cf_name }}.zip"`.
Modifying a dict in Ansible is surprisingly messy. Currently I do this 
using `set_fact`. 
Although it turns out that if you load the variables from a file with `-e 
@file.yaml`, changes made with `set_fact` don't take effect.
So I have to use an `include_vars` task instead of loading a var file with 
command-line arguments.
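
The `set_fact` approach I currently use looks roughly like this (the file 
name and field names are from my setup, trimmed down):

```yaml
# Load the config with include_vars (set_fact changes don't stick if the
# file was loaded via -e @file.yaml), then merge in a derived field.
- name: Load the Lambda config
  include_vars: lambda_functions.yaml

- name: Add a derived zip path to each function's entry
  set_fact:
    lambda_functions: >-
      {{ lambda_functions
         | combine({ item.key:
                     item.value
                     | combine({'local_zip_fname':
                                build_dir ~ '/' ~ item.value.cf_name ~ '.zip'}) }) }}
  with_dict: "{{ lambda_functions }}"
```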

# Attempted solution

I'm trying to add each 'lambda function' to the static inventory.
Each one is localhost, but has unique host variables equal to what I 
had in that YAML file.
This way I can leverage all the work the Ansible devs have already done to 
do things concurrently.

It turns out you can omit the IP address, and it defaults to localhost 
anyway, and Ansible will happily work with multiple hosts that are actually 
the exact same host.
It all just works. Yay! 
I can get rid of all the `with_items`, `with_dict` etc, and just run the 
role against the group of hosts.
And I can reduce the number of tasks by moving set_fact definitions into 
host definitions.

Then for other tasks which I only need to do once, I run those roles against 
a different host group: a single Ansible host connecting to localhost.
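
Concretely, the inventory I'm building looks something like this (hostnames 
and host vars are placeholders):

```ini
; One inventory "host" per Lambda function, all actually localhost.
[lambdas]
func-one  cf_name=FuncOne  timeout=30
func-two  cf_name=FuncTwo  timeout=60

[lambdas:vars]
ansible_connection=local

; Single host for the run-once roles (CloudFormation etc.).
[control]
deployer ansible_connection=local
```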

This host approach also makes it nice to set things like `local_zip_fname` 
(mentioned above). I can set that variable once for the whole group, and 
lazy variable evaluation means it's evaluated differently for each host. 
This means I don't have to worry about dependency issues about which 
variables are defined first, and there's one place for all the default 
variables. 
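
e.g. in `group_vars/lambdas.yml` (sketch; `build_dir` and `cf_name` come 
from my setup):

```yaml
# Defined once for the whole group; lazy evaluation means each host's own
# cf_name is substituted only when the variable is actually used.
local_zip_fname: "{{ build_dir }}/{{ cf_name }}.zip"
```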

But there are two issues.

1. I want to reference that `local_zip_fname` variable (defined using 
Jinja2 templating for each Lambda host) from the other host (the single 
localhost one). But `local_zip_fname` is not defined for that other host. So 
I tried the trick where you dig into `hostvars['other_host']['my_var']` to 
get it. But now lazy variable evaluation bites back. The variable I want 
(`local_zip_fname`) includes Jinja2 templating for another variable 
(`cf_name`), which is present on those other hosts (one per Lambda), but not 
on this host.
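
i.e. from the single-host play, something like this fails, because the 
nested template can't render in this host's context:

```yaml
# Attempt to read the per-Lambda zip path from the control host.
- name: Collect each function's zip path
  debug:
    msg: "{{ hostvars[item]['local_zip_fname'] }}"
  with_items: "{{ groups['lambdas'] }}"
  # Fails: local_zip_fname expands to "{{ build_dir }}/{{ cf_name }}.zip",
  # and cf_name is undefined on the host evaluating the template.
```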

Question: How can I access variables for other hosts, forcing Jinja2 
evaluation based on that other host's environment?
The only solution I can think of is to save the variables to disk for each 
of those other hosts (using | to_yaml to force Jinja evaluation), and then 
loading that file from the host I want to read the variables. Is there a 
better way?

The other issue is that the speed improvement is nowhere near as good as I 
hoped. If I deploy from a beefy machine with 8 CPUs instead of my usual 2, 
the relevant tasks speed up by a factor of 2 (not by a factor of 8/2). If I 
deploy with my usual 2 CPUs, there's no noticeable difference in speed.
It seems there is a lot of per-host overhead, which eats into the speed 
increase from converting sequential `with_items` to concurrent hosts.

I've tried to optimise things with:
- turning off fact gathering (`gather_facts: no`), or using fact caching,
- `forks = 50` (more than the number of hosts),
- pipelining (which probably does nothing for connection=local).
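
For reference, the relevant bits of my ansible.cfg (I believe `gathering = 
explicit` is the config-file equivalent of setting `gather_facts: no` 
everywhere; the cache path is a placeholder):

```ini
[defaults]
forks = 50
gathering = explicit
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts

[ssh_connection]
pipelining = True
```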

Is there any other way to optimise this? Is there a way to tell Ansible 
that these 30 hosts are actually all the same machine, so it only needs to 
do some things once, not 30 times?
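
For a single task, `run_once` seems close to what I mean, but it doesn't 
remove the per-host overhead of all the other tasks (stack name and 
template path here are placeholders):

```yaml
# Runs only on the first host in the play, not once per inventory host.
- name: Deploy the CloudFormation stack once for the whole group
  cloudformation:
    stack_name: my-stack
    template: templates/stack.yaml
  run_once: true
```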

Or is there some other approach to solve my problem? Is this the XY problem 
<http://xyproblem.info/>?


Thanks,
Matt

(GH: matt-telstra and mlda065)
