On Fri, Jul 3, 2015 at 10:13 AM, Erik Dalén <erik.gustav.da...@gmail.com>
wrote:

>
>
> On Fri, 3 Jul 2015 at 11:05 Chris Price <ch...@puppetlabs.com> wrote:
>
>> On Fri, Jul 3, 2015 at 9:50 AM, Erik Dalén <erik.gustav.da...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, 3 Jul 2015 at 09:25 Chris Price <ch...@puppetlabs.com> wrote:
>>>
>>>>
>>>>
>>>> On Monday, June 29, 2015 at 6:02:17 PM UTC+1, Luke Kanies wrote:
>>>>>
>>>>> On Jun 29, 2015, at 8:43 AM, Raphaël Pinson <raphael...@camptocamp.com>
>>>>> wrote:
>>>>> >
>>>>> > Hello,
>>>>> >
>>>>> >
>>>>> > I've activated caching on our Puppetservers, using the admin API to
>>>>> invalidate the cache upon deploying new environments. However, this only
>>>>> caches manifests, and catalogs still need to be compiled for every request.
>>>>> >
>>>>> > I'm thinking (at least in our case) it wouldn't be totally crazy to
>>>>> cache catalogs on the master so long as:
>>>>> >
>>>>> > * manifests are not changed (this is taken care of by the r10k hook
>>>>> + admin API)
>>>>> > * data do not change (same, since we deploy hiera data with r10k)
>>>>> > * facts do not change.
>>>>> >
>>>>> >
>>>>> > Obviously, *some* facts always change (uptime, memoryfree, swapfree,
>>>>> etc.), but most of them don't. So the idea would be to add a parameter in
>>>>> puppet.conf with the name of these facts that should be used as a basis for
>>>>> invalidating the catalog, and use the other facts to decide when a catalog
>>>>> should be recompiled.
>>>>> >
>>>>> > Is there already some kind of code doing that, or any
>>>>> opinion/feedback on this idea?
>>>>>
>>>>> This is something that our team at Puppet Labs has been working on a
>>>>> ton.  It’s beneficial in the short term, for the kind of performance and
>>>>> other benefits you describe, but it’s also key in a bunch of other cool
>>>>> stuff we’re doing.  The short answer is that in some ways it’s quite easy,
>>>>> but it also requires some changes to the core that aren’t necessarily as
>>>>> easy.
>>>>>
>>>>> Eric Sorenson is lead on the work (code-named Direct Puppet), so
>>>>> hopefully he’ll chime in with more details.  The basic idea, though, is
>>>>> that we do a few things, all together (note that this is from memory, and
>>>>> I’m sure I’m missing pieces or getting some of them wrong):
>>>>>
>>>>
>>>> This is indeed something we've been putting a lot of thought and effort
>>>> into lately.
>>>>
>>>> I have a question / thought experiment related to this, and would
>>>> really love to hear some feedback from the community:
>>>>
>>>> What would you think about a setup where your master never saw any of
>>>> your code changes at all, until you ran a specific command (e.g. 'puppet
>>>> deploy')?  In other words, you hack away on the modules / manifests / hiera
>>>> data in your code tree as much as you like but your master keeps compiling
>>>> catalogs from the 'last known good' setup, until you run this 'deploy'
>>>> command?  At that point, all of your current code becomes the new 'last
>>>> known good' and that is what your master compiles off of until you do
>>>> another deploy.
>>>>
>>>
>>> Keeps compiling or keeps serving a cached copy?
>>>
>>
>> Well, both.  :)  In cases where the catalog didn't need to be
>> re-compiled, we wouldn't, but in cases where we do need to do a compile
>> (say, brand new node checks in or something), we'd do the compile based on
>> the 'last known good' code rather than the current contents of the code
>> tree.
>>
>>
>>>
>>>> We could also provide an HTTP endpoint to accomplish the same
>>>> behavior.  And we could theoretically make this new behavior entirely
>>>> opt-in, but, by opting-in to it, you'd get access to new features similar
>>>> to what Raphaël and Luke were hinting at.
>>>>
>>>> Again, this is just a thought experiment at the moment.  Curious how
>>>> this would impact people's workflows.
>>>>
>>>>
>>> Well, it would be useful to be able to atomically switch to a new
>>> version of manifests. At the moment the best you can do is to checkout the
>>> new version somewhere else and move/relink it into place, so you get all of
>>> the new environment at the same time but there might still be ongoing
>>> compiles that get half of the old environment and half of the new.
>>>
>>
>> Yep; atomicity would be one of the major goals.
>>
>>
>>> But it would really have to be per environment (and optionally all of
>>> them).
>>>
>>
>> That makes sense and seems doable.
>>
>>> For consistency this would be good. When it comes to speed improvements I
>>> think there's other areas that need more focus. In my experience catalog
>>> application (even with no changes applied) takes about five times longer
>>> than catalog compilation (Puppet 4.2 improved this somewhat though).
>>>
>>
>> Fair!  Thanks for the feedback.
>>
>> Hopefully we've got enough developers now to where we can be working on
>> client and server optimizations in parallel, though, so for this thread I'm
>> most interested in teasing out the feasibility of introducing a 'deploy'
>> step on the server side; it'd give us some atomicity and open the door for
>> a lot of future features and optimizations, but I don't have a great
>> understanding of whether or not it might break some workflows that people
>> rely on today.  Erik, it sounds like in your case, it wouldn't cause you
>> any issues w/rt workflow?
>>
>
> Well, do you have any plans for how to handle queries to external systems
> and updates in them? For example, a new node checks in an exported resource
> that some other node should collect, but that node already has a cached
> catalog. Would you require a cache invalidation to be triggered each time
> you update external systems?
> It might be tricky with some external systems to do that, possibly better
> to be able to flag functions that have side effects and always recompile
> catalogs that call on such functions.
>

Yeah, dealing with side-effect inputs to catalogs is a tricky issue.  We're
still batting around ideas on that.
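
To make the problem a bit more concrete, here is a rough Python sketch of
the kind of cache being discussed. It is only an illustration, not Puppet
Server code: the names (VOLATILE_FACTS, catalog_cache_key, CatalogCache) and
the idea that the compiler reports a "has side effects" flag are all
assumptions made up for this example. The key combines the deployed code
version with the facts that are not expected to change between runs, and a
catalog that called a side-effect function is never stored, so it is
recompiled on every request.

```python
import hashlib
import json

# Hypothetical list of facts that change on every run and therefore should
# not invalidate a cached catalog.
VOLATILE_FACTS = frozenset({"uptime", "memoryfree", "swapfree"})


def catalog_cache_key(code_version, facts, volatile=VOLATILE_FACTS):
    """Build a stable key from the code version and the non-volatile facts."""
    stable = {k: v for k, v in facts.items() if k not in volatile}
    payload = json.dumps({"code": code_version, "facts": stable}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CatalogCache:
    """In-memory cache; compile_fn is a stand-in for the real compiler."""

    def __init__(self, compile_fn):
        # compile_fn(node, facts) -> (catalog, has_side_effects)
        self._compile = compile_fn
        self._store = {}

    def get(self, node, code_version, facts):
        key = (node, catalog_cache_key(code_version, facts))
        if key in self._store:
            return self._store[key]
        catalog, has_side_effects = self._compile(node, facts)
        # Catalogs that depend on external state (e.g. collected exported
        # resources) are not cached, so they are rebuilt on every request.
        if not has_side_effects:
            self._store[key] = catalog
        return catalog
```

A new deploy changes code_version and so misses the cache naturally. What
this sketch does not solve is the multi-master case, since the store lives
in one process.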


>
> Also will PuppetDB be used as the catalog cache so it would work with
> multiple puppet masters behind a load balancer or SRV records?
>

That conversation is still ongoing as well.  Storing catalogs in PuppetDB
is definitely an option that has been discussed.  In any case, a solid
multi-master story will definitely be considered a pre-requisite to any
final implementation choices.

Putting aside catalog caching for the moment, though... if we added a
mechanism for atomically deploying new code (even if we still did a full
catalog compile on every agent checkin), there are still a lot of other
kinds of optimizations we could build on top of this for Puppet Server
(e.g., it could render the 'environment_timeout' setting irrelevant).  But
before we get too far down the path of mapping out those kinds of
optimizations, we've got to sort out whether or not the introduction of the
extra step to 'deploy' code would cause problems for people.
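
For anyone who wants to picture what a 'deploy' step could look like on
disk, here is a minimal Python sketch of the "check out somewhere else and
relink" approach Erik describes, done per environment. Everything in it is
hypothetical (the deploy and pin_code_dir names, the releases directory
layout); it is not how Puppet Server works today. It assumes a POSIX
filesystem where renaming a symlink over another one is atomic.

```python
import os


def deploy(releases_dir, release_name, current_link):
    """Atomically repoint current_link at releases_dir/release_name."""
    target = os.path.join(releases_dir, release_name)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    link_dir = os.path.dirname(os.path.abspath(current_link))
    # Create the new symlink under a temporary name in the same directory,
    # then rename it over the old link. Readers see either the old target
    # or the new one, never a missing link.
    tmp_link = os.path.join(link_dir, ".deploy-tmp-%d" % os.getpid())
    os.symlink(target, tmp_link)
    os.replace(tmp_link, current_link)
    return target


def pin_code_dir(current_link):
    """Resolve the link once so a whole compile reads from one release."""
    return os.path.realpath(current_link)
```

The second function matters as much as the first: a compile that resolves
the link once at the start and reads everything from the resolved path will
not end up with half of the old environment and half of the new, even if a
deploy lands in the middle of it.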

As I'm typing this I'm realizing that I've kind of hijacked this thread,
since, at least at first, I'm more interested in talking about workflows /
atomic code deployment than about the actual details around caching.  Maybe
I should break this off into a new thread?  Sorry about that!
