On 2014-04-22 1:09, Jeff Bachtel wrote:
I don't understand how inotify(7)+epoll(7) watching all environment
subdirectories for changes (like CONFCHANGE, but pointing at all
directories rather than at individual files) would not be more
performant than SCAN.
As far as I know, inotify and epoll are not available on all platforms
that Puppet supports, and we currently have no plans to add such support
to those platforms.
Also, while it is probably more performant than individually watching
every file, it is not without a performance penalty. There is also
substantial complexity in using inotify - see the "Limitations and
caveats" section of the inotify(7) documentation, which states:
"Limitations and caveats
Inotify monitoring of directories is not recursive: to monitor
subdirectories under a directory, additional watches must be created.
This can take a significant amount of time for large directory trees.
The inotify API provides no information about the user or process that
triggered the inotify event. In particular, there is no easy way for a
process that is monitoring events via inotify to distinguish events that
it triggers itself from those that are triggered by other processes.
Note that the event queue can overflow. In this case, events are lost.
Robust applications should handle the possibility of lost events gracefully.
The inotify API identifies affected files by filename. However, by the
time an application processes an inotify event, the filename may already
have been deleted or renamed.
If monitoring an entire directory subtree, and a new subdirectory is
created in that tree, be aware that by the time you create a watch for
the new subdirectory, new files may already have been created in the
subdirectory. Therefore, you may want to scan the contents of the
subdirectory immediately after adding the watch."
Puppet is pretty much exposed to all of those caveats - and the only
remedy is to... scan. Which is the very thing we want to avoid.
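To make that cost concrete, the "scan" remedy amounts to walking the whole tree and stat-ing every entry on every check. A minimal Python sketch (illustrative only, with hypothetical names - not Puppet's implementation):

```python
import os

def scan_mtimes(root):
    """Walk a directory tree and record the mtime of every file.

    This costs O(number of files) stat calls on every invocation -
    exactly the overhead the proposed strategies try to avoid."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime
            except FileNotFoundError:
                # File deleted between listing and stat - one of the
                # races the inotify caveats above also describe.
                pass
    return mtimes

def stale(old, new):
    """A cache is stale if any file changed, appeared, or disappeared."""
    return old != new
```

Note that added and deleted files are detected (the dictionaries differ), which per-file mtime watching alone cannot do - but only at the price of a full rescan.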
The primary problem is that there are no transaction boundaries around
changes in the file system. Puppet tries to run on top of a potentially
changing set of files - in fact, bad things can happen when files are
changed while Puppet is compiling a catalog that depends on them. We do
plan to address those issues at some point during Puppet 4x. That is,
such problems persist even if changes are detected more efficiently
(say, using inotify), because of the asynchronous nature of watching /
notifying.
Meanwhile, we do believe that the three proposed strategies NONE,
REBOOT, and TIMEOUT are sufficient to handle the usage scenarios we
have identified; that they are at least as safe as the current
implementation; that they are more accurate and easier to understand;
and, most importantly, that they do not have to be based on scanning
anything.
I hope that explains our reasoning.
Regards
- henrik
Jeff
On 04/21/2014 05:29 PM, Henrik Lindberg wrote:
Hi,
We have been looking into environment caching and have some thoughts
and ideas about how this can be done. Love to get your input on those
ideas, and your thoughts about their usefulness.
There is a google document that has the long story - it is open for
commenting. It is not required reading as the essence is outlined here.
The doc is here:
https://docs.google.com/a/puppetlabs.com/document/d/1G-4Z6vi6Tv5xZtzVh7aT2zNWbOxJ3BGfJu31pAHxS7g/edit?disco=AAAAAGtMYOI#heading=h.rpgaxghcfqol
The current state of caching environments
---
A legacy environment caches the result of parsing manifests and
loading functions / types, and reacts to changed files. It does this
by recording the mtime of each file as it is parsed / read. Later, if
the same file would be parsed again, the already cached result is
reused instead. If the file is stale, the entire cache is cleared and
parsing starts from scratch.
It does not, however, react to added files. Nor does it recognize
changes in files evaluated as a consequence of evaluating Ruby logic
(i.e. if a function, type, etc. requires something, that is not
recorded).
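The legacy scheme described above can be sketched roughly as follows (hypothetical names and a pluggable `parse` function; Puppet's real implementation differs):

```python
import os

class LegacyEnvironmentCache:
    """Sketch of the legacy scheme: record each file's mtime as it is
    parsed; if a file is later found stale, drop the entire cache -
    not just that file's entry - and start from scratch."""

    def __init__(self, parse):
        self._parse = parse    # function: path -> parse result
        self._results = {}     # path -> cached parse result
        self._mtimes = {}      # path -> mtime recorded at parse time

    def get(self, path):
        mtime = os.stat(path).st_mtime
        if path in self._results:
            if self._mtimes[path] == mtime:
                return self._results[path]   # cache hit
            # Stale file: the whole cache is cleared.
            self._results.clear()
            self._mtimes.clear()
        result = self._parse(path)
        self._results[path] = result
        self._mtimes[path] = mtime
        return result
```

Note how the two gaps mentioned above fall out of this design: a file nobody asks for again is never re-checked, and files read as a side effect of Ruby logic never enter `_mtimes` at all.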
The new directory-based environments do not support caching, and we
now want to address this.
The problem with caching
---
The problem with caching is that it can be quite costly to compute,
and we found that different scenarios benefit from different caching
strategies.
In an environment where only a small fraction of the modules/manifests
present in the environment are actually used per individual node,
checking the cache can be slower than starting with a clean slate
every time.
Proposed Strategies
---
We think there is a core set of strategies that a user should be able
to select. These should cover the typical usage scenarios.
* NONE - no caching; each catalog production starts with a clean slate.
This is the current state of directory based environments, and it
could also be made to apply to legacy environments. This is a good fit
for a very dynamic / development environment, or one with a low
"signal/noise" ratio.
* REBOOT - (the opposite of NONE) - cache everything, never check for
changes. A reboot of the master is required for it to react to
changes.
This is good for a static configuration, and where the organization
always takes down the master for other reasons when there are changes.
This strategy avoids scanning, and is thus a speed improvement for
configurations with a large set of files.
* TIMEOUT - cache all environments with a 'time to live' (TTL). When a
request is made for an environment whose TTL has expired, that
environment starts over with a clean slate.
This is a compromise - it will pick up all changes (even additions),
but it may take up to one "TTL" before they are picked up (say 5
minutes; configurable).
These three schemes are believed to cover the different usage
scenarios. They all have the benefit that they do not require watching
any files (thereby drastically reducing the number of stat calls).
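Of the three, TIMEOUT is the only one needing any machinery at all: it is essentially a per-environment TTL cache. A minimal sketch with hypothetical names (the injectable clock is just a testing convenience, not part of any proposal):

```python
import time

class TTLEnvironmentCache:
    """Cache one compiled state per environment; once `ttl` seconds have
    passed, the next request for that environment starts from a clean
    slate. No file is ever watched or stat-ed."""

    def __init__(self, build, ttl=300.0, clock=time.monotonic):
        self._build = build      # function: env_name -> fresh state
        self._ttl = ttl
        self._clock = clock      # injectable for testing
        self._entries = {}       # env_name -> (state, created_at)

    def get(self, env_name):
        now = self._clock()
        entry = self._entries.get(env_name)
        if entry is not None:
            state, created = entry
            if now - created < self._ttl:
                return state     # still within the TTL
        # Expired or absent: rebuild from a clean slate.
        state = self._build(env_name)
        self._entries[env_name] = (state, now)
        return state
```

Any change in the environment - modified, added, or removed files alike - is therefore picked up within at most one TTL, without a single stat call in between.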
Strategy that is probably not needed:
* ENVDIRCHANGE - watches the directory that represents
the environment. Reloads if the directory itself is stale (using the
filetimeout setting to cap the number of times it checks). Thus, it
will react to changes to the environment root only (which typically
does not happen when changing content in the environment, but is
triggered if the environment's configuration file is added or removed).
To pick up any other changes, the user would need to touch the
directory.
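For comparison, ENVDIRCHANGE would cost exactly one stat per check, capped in frequency. A sketch under the same assumptions as the previous examples (hypothetical names; `filetimeout` here is just a plain parameter mirroring the Puppet setting of that name):

```python
import os
import time

class EnvDirChangeChecker:
    """Sketch of ENVDIRCHANGE: stat only the environment's root
    directory, at most once per `filetimeout` seconds, and report
    whether a reload is needed."""

    def __init__(self, env_root, filetimeout=15.0, clock=time.monotonic):
        self._root = env_root
        self._filetimeout = filetimeout
        self._clock = clock
        self._last_check = None
        self._last_mtime = os.stat(env_root).st_mtime

    def stale(self):
        now = self._clock()
        if (self._last_check is not None
                and now - self._last_check < self._filetimeout):
            return False    # checked recently; assume unchanged
        self._last_check = now
        mtime = os.stat(self._root).st_mtime
        if mtime != self._last_mtime:
            self._last_mtime = mtime
            return True
        return False
```

Since a directory's mtime changes only when entries are added or removed directly inside it, editing a file deeper in the tree goes unnoticed - hence the "touch the directory" workaround above.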
Strategies we think are not needed:
* SCAN - like today where every file is watched.
* CONFCHANGE - watch/scan all configuration files.
Feedback ?
---
Here are a couple of questions to start with...
* What do you think of the proposed strategies?
* If you like the scanning strategy, what use cases do you see it
benefiting that the proposed strategies do not handle?
* Any other ideas?
* Any use cases you feel strongly about? Scenarios we need to consider...
Regards
- henrik