Hey Nick,

A particular phrase you used caught my attention: "Elasticsearch holds the 
Hiera config for a number of nodes."

Putting the words "elasticsearch" and "hiera backend" together can be 
scary if it's done wrong. That said, I have seen backends built to solve 
the "config for individual nodes" problem in a way that complements 
Hiera's default yaml backend without noticeably sacrificing performance. 
The key is making only a carefully limited number of calls to the external 
backend per catalog compile. Most generalized data that doesn't need to 
change frequently or programmatically is still stored in yaml files 
alongside the code.

When that's done, the implementing hiera.yaml file may look something like 
this:

hierarchy:
  - name: 'Per-node data'
    data_hash: elasticsearch_data
    uri: 'http://localhost:9200'
    path: "%{trusted.certname}"
  - name: 'Yaml data'
    data_hash: yaml_data
    paths:
      - "role/%{trusted.extensions.pp_role}"
      - "datacenter/%{trusted.extensions.pp_datacenter}"
      - "common"


The most important bit showcased here is that for performance, the 
*data_hash* backend type is used. Hiera can make thousands of lookup calls 
per catalog compile, so something like lookup_key can get expensive over an 
API. data_hash front-loads all the work, returning a batch of data from one 
operation which is then cached and consulted for the numerous lookups 
that'll come from automatic parameter lookup.
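
To make the difference concrete, here is a minimal plain-Ruby sketch that 
counts round trips for each style. It is not Puppet's actual API: 
CountingBackend, NODE_DATA, and both resolve_* helpers are hypothetical 
names made up for illustration, and the "remote store" is just an 
in-memory hash.

```ruby
# Hypothetical stand-in for a remote store; it only counts round trips.
class CountingBackend
  attr_reader :calls

  def initialize(data)
    @data = data
    @calls = 0
  end

  # One round trip per key, as a lookup_key-style backend would make.
  def fetch_key(key)
    @calls += 1
    @data[key]
  end

  # One round trip for the whole document, as a data_hash-style backend would make.
  def fetch_all
    @calls += 1
    @data.dup
  end
end

# Made-up sample data for a single node.
NODE_DATA = {
  'ntp::servers'     => ['0.pool.ntp.org'],
  'ssh::permit_root' => false,
  'motd::content'    => 'managed by puppet'
}.freeze

# lookup_key style: every lookup goes over the wire.
def resolve_per_key(backend, keys)
  keys.map { |k| backend.fetch_key(k) }
end

# data_hash style: fetch once, then answer every lookup from the cached hash.
def resolve_from_hash(backend, keys)
  cache = backend.fetch_all
  keys.map { |k| cache[k] }
end

keys = NODE_DATA.keys * 1000 # simulate thousands of lookups in one compile

per_key = CountingBackend.new(NODE_DATA)
resolve_per_key(per_key, keys)

batched = CountingBackend.new(NODE_DATA)
resolve_from_hash(batched, keys)

puts "lookup_key style: #{per_key.calls} calls"
puts "data_hash style:  #{batched.calls} calls"
```

Both styles return the same values; only the number of trips to the 
backend differs, which is what matters once each trip is an HTTP request.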

There's an example of how to do that 
in https://github.com/uphillian/http_data_hash.
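
For orientation, the core of such a backend might look roughly like the 
plain-Ruby sketch below. It is not taken from that repository and it 
leaves out the Puppet function wrapper entirely. The 'hiera' index name, 
the '_doc' URL layout, and the injectable http_get parameter are all 
assumptions for illustration; the 'uri' and 'path' options are the ones 
from the per-node hierarchy level in the hiera.yaml example above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the document URL from the hierarchy level's options.
# The 'hiera' index and '_doc' endpoint are assumed; adjust to your cluster.
def node_document_url(options, index: 'hiera')
  base = options.fetch('uri').chomp('/')
  "#{base}/#{index}/_doc/#{options.fetch('path')}"
end

# Fetch all of a node's data in one request and return it as a hash.
# http_get is injectable so the function can be exercised without a live cluster.
def elasticsearch_data(options, http_get: ->(url) { Net::HTTP.get(URI(url)) })
  body = http_get.call(node_document_url(options))
  doc = JSON.parse(body)
  # Elasticsearch wraps the stored document in "_source"; a response
  # without "_source" (document not found) falls back to an empty hash.
  doc.fetch('_source', {})
end
```

The point of the shape is that one request returns the node's whole hash, 
so every subsequent key lookup for that hierarchy level is served from 
the returned data rather than from another HTTP call.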

To John's point, I wouldn't hesitate to run your use case by an expert if 
you have the option.

Cheers,
~Reid

On Monday, April 2, 2018 at 7:47:37 AM UTC-7, John Bollinger wrote:
>
>
>
> On Saturday, March 31, 2018 at 5:59:12 AM UTC-5, nick....@countersight.co 
> wrote:
>>
>> Thanks for your response John, 
>>
>> I appreciate you taking a quick look around to see if anyone else has 
>> already done this. I had come to the same conclusion, that if someone has 
>> already, they most likely haven't shared it. 
>>
>> You raise valid points about ES being generally pretty unsuitable as a 
>> Hiera backend. However, the project I am working on already has an 
>> Elasticsearch instance running in it, so there would be next to no 
>> performance overhead for me. It uses a web interface to write out YAML 
>> files that are fed into Hiera for a 'puppet apply' run which configures 
>> various aspects of the system. By using Elastic instead of YAML files, I 
>> can eliminate some of the issues surrounding concurrent access. It also 
>> means backups are simplified, as I'd just need to back up ES.
>>
>
>
> With an ES instance already running, I agree that you have negligible 
> additional *memory* overhead to consider, but that doesn't do anything 
> about *performance* overhead.  Nevertheless, the (speculative) 
> performance impact is not necessarily big; you might well find it entirely 
> tolerable, especially for the kind of usage you describe.  It will depend 
> in part on how, exactly, you implement the details.
>
>
>> Is writing a proof-of-concept Hiera backend something that someone with 
>> reasonable coding skills would be able to knock out in a few hours? 
>>
>>
> It depends on what degree of integration you want to achieve.  If you 
> start with the existing YAML back end, and simply hack it to retrieve its 
> target YAML objects from ES instead of from the file system, then yes, I 
> think that could be done in a few hours.  It would mean ES offering up 
> relatively few, relatively large chunks of YAML, which I am supposing would 
> be stored as whole objects in the database.  I think that would meet your 
> concurrency and backup objectives.
>
> If you want a deeper integration, such as having your back end performing 
> individual key lookups in ES, then you might hack up an initial 
> implementation in a few hours, but I would want a lot longer to test it 
> out. I would want someone with detailed knowledge of Hiera and its 
> capabilities to oversee the testing, too, or at least to review it.  Even 
> more so to whatever extent you have in mind to implement Hiera 
> prioritization, merging behavior, interpolations, and / or other operations 
> affecting what data Hiera presents to callers.  If there is an actual 
> budget for this then I believe Puppet, Inc. offers consulting services, or 
> I'm sure you could find a third-party consultant if you prefer.
>
>
> John
>
>
