+1 to all of this. It all makes sense and I think it'll solve all of the
use cases that I can think of.

On Wed, Dec 17, 2014 at 8:34 PM, Michael Smith <michael.sm...@puppetlabs.com> wrote:
>
> Ok, after some discussions with Josh and Andy (Andy's below), I came up
> with a proposal for how one might write a stash for re-using data. Just for
> clarification, in what sense do you mean a 'queueing' mechanism?
>
> Create a Stash class of some sort, probably in Puppet::Util, that's a
> simple key/value store. That class can be instantiated in specific
> resources where it's needed, assuming the resource is a class with a
> sufficiently long lifetime. We can also instantiate a global stash, which
> is created in lib/puppet/configurer.rb as part of push_context when we're
> setting up a run. The Stash class could have a static member that's queried
> to get the global version in push_context (if it's available); the parsed
> data from /proc/mounts can be added to the context instance of the Stash.
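>
> Roughly, here's a minimal sketch of the Stash class I have in mind
> (untested; the names and the context wiring are assumptions, not an
> existing API):
>
>     module Puppet::Util
>       # A simple in-memory key/value store with no persistence.
>       class Stash
>         def initialize
>           @data = {}
>         end
>
>         def [](key)
>           @data[key]
>         end
>
>         def []=(key, value)
>           @data[key] = value
>         end
>
>         # Compute-if-absent, e.g. for parsing /proc/mounts only once.
>         def fetch(key)
>           @data.fetch(key) { |k| @data[k] = yield(k) }
>         end
>
>         def clear
>           @data.clear
>         end
>
>         # The run-global stash, if the configurer pushed one onto the
>         # context via Puppet.push_context; nil otherwise.
>         def self.global
>           Puppet.lookup(:stash) { nil }
>         end
>       end
>     end
>
> A provider would then do something like
> Stash.global.fetch('mounts') { parse_proc_mounts } (the helper name is
> hypothetical).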
>
> My discussion with Andy on #puppet-dev today:
>
>> [16:43:15] *<MichaelSmith>* *+zaphod42*: There's a mailing list thread
>> on PUP-3116 that tries to cache the result of reading /proc/mounts
>> [16:44:06] *<MichaelSmith>* I'm trying to explore whether there are any
>> existing patterns for caching data we re-use during a catalog run.
>> [16:45:05] *<MichaelSmith>* Puppet::Util::Storage kind of covers that,
>> with the added benefit of logging the cached data, but also the cost of
>> writing to PuppetDB.
>> [16:46:02] *<MichaelSmith>* And also doesn't work with puppet apply, so
>> that's problematic.
>> [16:46:51] *<+zaphod42>* Puppet::Util::Storage writes to puppetdb? I
>> thought it just wrote to a local file
>> [16:47:40] *<+zaphod42>* I think henrik's concern about memory leaks
>> really just is about the problems we encounter when the cache is never
>> flushed
>> [16:47:58] *<+zaphod42>* the data really just needs to have a clear
>> lifetime
>> [16:48:09] *<MichaelSmith>* Oh, I may be confused about
>> Puppet::Util::Storage then.
>> [16:48:31] *<+zaphod42>* and based on what I'm seeing, is this really a
>> cache? or is it really just about having some "stash" where providers can
>> store data during a run?
>> [16:49:28] *<MichaelSmith>* It would potentially be refreshed if the
>> /proc/mounts gets updated, but that's up to the provider. So just a stash
>> makes sense.
>> [16:49:37] *<+zaphod42>* MichaelSmith: yeah, Storage just writes to a
>> local file
>> https://github.com/puppetlabs/puppet/blob/master/lib/puppet/util/storage.rb#L86
>> [16:50:36] *<MichaelSmith>* Is using Storage to stash data used during a
>> run something that's been discouraged in the past?
>> [16:50:44] *<+zaphod42>* MichaelSmith: in which case, I would think
>> about it as providing a "stash" method for providers. A very simple thing
>> would be it just returns a hash that can be manipulated by the provider
>> [16:50:55] *<+zaphod42>* the hash needs to be stored somewhere
>> [16:51:15] *<+zaphod42>* that can be handled by the Transaction and it
>> can just throw all of the contents away at the end of a run
>> [16:51:54] *<MichaelSmith>* Yeah, sounds like a reasonable API to write.
>> Puppet::Util::Stash, that's cleared after a run and only stored in-memory.
>> [16:51:57] *<+zaphod42>* there is also the question about what is the
>> scope of the data. Does just one resource get to see its own data, is it
>> shared across all resources of the same provider, all of the same type, or
>> all of the same run?
>> [16:52:45] *<MichaelSmith>* Do you have ideas how to enforce those types
>> of restrictions?
>> [16:53:43] *<+zaphod42>* Have different stashes for each set? So for
>> every resource it has its own stash, the type has a stash, and the
>> transaction has a stash and they are all accessed independently
>> [16:54:14] *<+zaphod42>* the biggest problem is threading it through the
>> APIs. Ideally they would be something that fits in nicely, but I have a
>> feeling it will just be another global somewhere
>> [16:54:52] *<MichaelSmith>* I think the tricky part becomes how to clear
>> them when we have many isolated stashes.
>> [16:54:59] *<MichaelSmith>* So they have to register themselves globally
>> somewhere.
>> [16:56:05] *<+zaphod42>* or they live as instance variables on some
>> objects that get thrown away
>> [16:56:18] *<+zaphod42>* so the resource stash is just an instance
>> variable on a resource
>> [16:56:26] *<+zaphod42>* provider stash is on a provider
>> [16:56:41] *<+zaphod42>* (there is a problem there that every resource
>> is an instance of a provider)
>> [16:56:52] *<+zaphod42>* there isn't a shared provider instance across
>> the resources
>> [16:58:13] *<+zaphod42>* so one way to do it is have a Stashes object
>> that is pushed into the context by the transaction and popped when the
>> transaction is done
>> [16:58:32] *<MichaelSmith>* This particular example is being used in a
>> type, and I don't yet see where it creates a persistent instance object.
>> The lifetime might be too short to be useful.
>> [16:58:39] *<+zaphod42>* the stashes object holds all of the stashes for
>> all of the resources, types, etc (whatever scopes are deemed correct)
>> [16:59:18] *<+zaphod42>* in a type... Types are tricky because they are
>> shared between the master and the agent
>> [17:01:44] *<MichaelSmith>* I'm not quite sure of the implications of
>> that. I guess that means lifetime on the master is different.
>> [17:05:37] *<+zaphod42>* yeah, how types are used on the master versus
>> the agent is different. I can't ever remember all of the details though
>> [17:06:40] *<+zaphod42>* but if you put all of the stashes in a Stashes
>> instance and put that instance in the Context and then use push_context (or,
>> better, Puppet.override), then it should be fine and not have a memory leak
>> [17:07:15] *<+zaphod42>* however, it will end up holding onto data
>> during a transaction longer than it may need to, thus increasing memory
>> usage
>> [17:07:23] *<+zaphod42>* but I'm not sure how much of a problem that
>> would be
>> [17:07:37] *<+zaphod42>* so long as there is some point at which the
>> objects will be cleaned up
>> [17:08:01] *<MichaelSmith>* Is there any advantage of having a Stashes
>> instance that's added via push_context, vs just pushing your hash directly
>> to it?
>> [17:08:22] *<MichaelSmith>* I guess the ability to add arbitrary keys
>> after starting.
>> [17:08:44] *<+zaphod42>* push_context would just be where some
>> collection of stashes would be held and other things can get to (a global,
>> but with more control)
>> [17:09:12] *<+zaphod42>* you should still provide an API on the
>> resources to get to the stashes, instead of having authors go directly to
>> Puppet.lookup
>> [17:09:29] *<MichaelSmith>* Yeah, makes sense.
>> [17:09:55] *<+zaphod42>* and the other part of the context is that it
>> controls the lifetime of the stashes
>> [17:10:16] *<+zaphod42>* once the context is popped, the stashes
>> disappear
>> [17:10:51] *<+zaphod42>* I'd much rather have instances of resources and
>> such hold onto their own stashes, but it might be difficult
>> [17:11:28] *<+zaphod42>* however, I think you should look into that.
>> Only use the context system if there isn't a more local way of controlling
>> it
>> [17:11:33] *<MichaelSmith>* Yeah... not everything seems to have an
>> instance.
>> [17:12:13] *<+zaphod42>* which is the sad making part :(
>
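> To make zaphod42's lifetime point concrete, a rough sketch of the wiring
> (assumed code; the real transaction integration may look different):
>
>     # Sketch only: the transaction (or configurer) scopes the stash to
>     # the run; once the block exits the context is popped and the data
>     # becomes unreachable.
>     stash = Puppet::Util::Stash.new
>     Puppet.override({ :stash => stash }, "stash for this catalog run") do
>       catalog.apply
>     end
>
>     # Hypothetical accessor so type/provider authors don't have to call
>     # Puppet.lookup themselves:
>     class Puppet::Type
>       def stash
>         Puppet.lookup(:stash) { nil }
>       end
>     end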
>
> On Wed, Dec 17, 2014 at 3:53 PM, Michael Smith
> <michael.sm...@puppetlabs.com> wrote:
>>
>> I'm doing my own digging to figure out what seems to make sense.
>>
>> Josh had mentioned Puppet.push_context, set in the configurer. We push
>> and pop context for each apply run; however, that's a private API that
>> doesn't seem to be meant for general use. Piggybacking on it looks like it
>> would get messy.
>>
>> There's also Puppet::Util::Storage, which superficially looks appropriate
>> for this kind of caching (
>> http://www.rubydoc.info/gems/puppet/Puppet/Util/Storage). I'm still
>> trying to wrap my head around what side-effects might occur.
>>
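>> From my reading of storage.rb, typical usage is roughly the following
>> (details unverified, so treat it as a sketch):
>>
>>     Puppet::Util::Storage.load                    # read the statefile
>>     state = Puppet::Util::Storage.cache(:mounts)  # hash stored under :mounts
>>     state[:data] ||= File.read('/proc/mounts')
>>     Puppet::Util::Storage.store                   # write the statefile back
>>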
>>
>> On Tue, Dec 16, 2014 at 6:27 PM, Trevor Vaughan <tvaug...@onyxpoint.com>
>> wrote:
>>>
>>> Another part of my heartburn with using a file hit me again as I
>>> recalled the original extdata function implementation.
>>>
>>> In the case of extdata, one large extdata file + a lot of extlookups =
>>> massive catalog compile times on the server.
>>>
>>> So, every time I want to call the cache, across potentially large
>>> numbers of providers and/or other things requiring state, I *really* don't
>>> want to read a file, particularly when I don't know what's going to be in
>>> it.
>>>
>>> In this case, we would have to contend with slower client run times and
>>> more CPU overhead as well as disk I/O requirements. Telling people to
>>> change the way their OS is configured, such as by using tmpfs, when they
>>> may not have that choice does not seem ideal - unless, of course, it ships
>>> with Puppet and doesn't require a system reboot. If, for some reason, I
>>> have 50 providers that want to use this, that's 50 file reads and writes
>>> that could be avoided.
>>>
>>> Giving people the choice of disk versus memory overhead would be ideal
>>> if you want both for some reason.
>>>
>>> I'm honestly not seeing what would be so bad about scope.cache, where
>>> cache is some top-level Puppet::Cache object that holds hashes that expire
>>> at the end of a run. You would have to do things very politely in terms of
>>> namespacing, but you have to do that anyway.
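>>>
>>> Something like this is what I'm picturing (scope.cache and Puppet::Cache
>>> are hypothetical here, not existing APIs):
>>>
>>>     # Hypothetical: scope.cache returns a per-run Puppet::Cache that is
>>>     # thrown away when the run finishes; namespace politely by module.
>>>     ns = scope.cache['mymodule'] ||= {}
>>>     ns['proc_mounts'] ||= File.read('/proc/mounts')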
>>>
>>> I am, of course, not opposed to saving cache state to disk for debugging
>>> purposes, and think that should be an option when the --debug flag is used.
>>>
>>> Trevor
>>>
>>> On Tue, Dec 16, 2014 at 7:37 PM, Felix Frank
>>> <felix.fr...@alumni.tu-berlin.de> wrote:
>>>>
>>>> Hey,
>>>>
>>>> good points - state retention at whatever granular level would be a
>>>> good general-purpose tool to have. If it's built in a pervasive fashion
>>>> (i.e., any provider might use the cache for whatever it deems appropriate),
>>>> it gains added visibility and becomes more opaque to the user - which is a
>>>> good thing, and addresses one of the major concerns I'm having with this.
>>>> The other being that it needs to be tunable for the user in some fashion.
>>>>
>>>> I have no qualms about disk I/O - after all, the user can choose
>>>> whatever block backend they want. Users who depend on low latency or need
>>>> to save IOPS can employ a tmpfs, for example.
>>>>
>>>> Cheers,
>>>> Felix
>>>>
>>>> On 12/17/2014 12:56 AM, Trevor Vaughan wrote:
>>>>
>>>> I'm happy with catalog lifetime.
>>>>
>>>> I'm really not happy with doing anything that involves disk I/O.
>>>>
>>>> This would be key to getting providers to be able to save state in a
>>>> non-hacky way as well.
>>>>
>>>> Trevor
>>>>
>>>> On Tue, Dec 16, 2014 at 6:45 PM, Michael Smith
>>>> <michael.sm...@puppetlabs.com> wrote:
>>>>>
>>>>> I don't like any of the ideas I raised, but this will take some
>>>>> digging. We need to determine what lifetime the cache should have, and
>>>>> what interface. I'm leaning towards either a cached read API in the
>>>>> FileSystem utilities, or a cache tied to the catalog lifetime.
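>>>>>
>>>>> For the first option, something like this is what I'm picturing (a
>>>>> hypothetical method, not an existing API; the cache itself would be
>>>>> cleared at the end of the run):
>>>>>
>>>>>     # Hypothetical cached read layered on Puppet::FileSystem.
>>>>>     def self.read_cached(path, cache)
>>>>>       cache[path] ||= Puppet::FileSystem.read(path)
>>>>>     end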
>>>>>
>>>>
>>>
>>>
>>> --
>>> Trevor Vaughan
>>> Vice President, Onyx Point, Inc
>>> (410) 541-6699
>>> tvaug...@onyxpoint.com
>>>
>>> -- This account not approved for unencrypted proprietary information --
>>>
>>>
>


-- 
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvaug...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --
