[Puppet Users] Re: [Puppet-dev] Draft for new type and provider API

Trevor Vaughan Thu, 02 Feb 2017 07:39:06 -0800

I think I'm slowly talking myself into this.

The issue that I've had is that the current Data Types aren't composable.
There's a ticket open about it and hopefully that will fix a lot of my
issues.


Unfortunately, it looks like my heap/stack discussion was with Henrik on
Slack, so that's not super helpful.

Basically, the idea is that, instead of using a pure object model, objects
would be created *as necessary* and the linear execution model would be
kept in a simple array (stack). The data space (heap) would contain the
attributes that are required for each object to execute.

This means that the data structure would be extremely efficient and there
would be almost no cost to building and manipulating an object graph.

The benefit of the object graph is to keep the object model. However, the
object model is simply not necessary and has proven not to scale to 10k+
resources. The Heap/Stack model should scale based on the amount of data
that you have and would be *extremely* portable to other languages.

The biggest downside (for me) is that I manipulate the catalog graph at
run time for various reasons. There would probably be an easy way to work
this into the API though, since it would just be an array insertion.
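
A minimal Ruby sketch of the heap/stack idea described above may help. It is
purely illustrative: `heap`, `stack`, and `apply_all` are names made up for
this example and are not part of Puppet, and the resources are invented.

```ruby
# Hypothetical sketch of the heap/stack model; none of these names exist in Puppet.

# Heap: a flat data space holding the attributes each resource needs.
heap = {
  'package[ntp]'        => { ensure: 'installed' },
  'file[/etc/ntp.conf]' => { ensure: 'file', mode: '0644' },
  'service[ntpd]'       => { ensure: 'running' }
}

# Stack: the linear execution order, kept as a plain array of heap keys.
stack = ['package[ntp]', 'file[/etc/ntp.conf]', 'service[ntpd]']

# Run-time catalog manipulation becomes a heap entry plus an array insertion.
heap['file[/etc/ntp.keys]'] = { ensure: 'file', mode: '0600' }
stack.insert(stack.index('service[ntpd]'), 'file[/etc/ntp.keys]')

# Walk the stack in order, looking up each resource's data only when needed.
def apply_all(stack, heap)
  stack.map do |ref|
    attributes = heap.fetch(ref)
    "#{ref}: ensure=#{attributes[:ensure]}"
  end
end

applied = apply_all(stack, heap)
puts applied
```

Nothing here builds a graph: ordering lives in the array and everything else
is a hash lookup.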

In terms of the validate/munge question, munging can *absolutely* be done
client side. But do you even want to pass a catalog to the client if you
have invalid data that might be used by other resources? Also, I do quite a
bit of checking based on the values of other parameters/properties. If this
could be maintained reasonably, that would be fine.

I do think that having a global data space would be extremely important to
making this feasible. There are a couple of tickets open for this as well but
I don't know the numbers offhand. And, of course, if we use the heap/stack
model, the data is all in place (but may change based on resource
processing).

Hopefully this all makes sense.

Thanks,

Trevor

On Thu, Feb 2, 2017 at 8:42 AM, David Schmitt <david.schm...@puppet.com>
wrote:

>
>
> On 2 February 2017 at 02:50, Trevor Vaughan <tvaug...@onyxpoint.com>
> wrote:
>
>> Hi David,
>>
>> Most of this looks fine (I still need to digest some of it) but I don't
>> know that I'm a huge fan of no munge or validate functionality. This seems
>> to fly in the face of a 'fail fast' mentality.
>>
>
> With puppet 4 data types much of the trivial stuff munge and validate do
> can be handled much more easily and consistently than before. Many other
> checks will never be possible on the server, because they depend on values
> only available on the agent (credential validity, max tablename length
> dependent on mysql flavour), or would be error-prone duplication of things
> the target API does check for us (validity of file mode bit combinations,
> validity of AMI references when talking to EC2 instances, etc). There is a
> bit of stuff in-between those two (e.g. content&source specified on a file),
> which we'd trade off for a much simpler server side. Failing on the agent
> also means that independent parts of the catalog can move forward, instead
> of failing the whole compilation.
>
>
>> Also, if everything happens on the client, some of the catalog munging
>> that I do on the fly isn't going to work any longer.
>>
>
> Those resources will not look anything different than existing types
> (which will stay as they are for a considerable while). Since I don't know
> what catalog munging you're doing I'm having a hard time following here.
>
>> I *do* like the idea of types being pure data but will this bloat the
>> catalog even further?
>>
>
> The resources in the catalog won't look any different than today. In fact,
> the first version will most likely just call Puppet::Type.newtype() under
> the hood somewhere.
>
>
>> I still think that moving toward a heap/stack model would serve the
>> Puppet clients best and get past the resource count limitations.
>>
>
> Do you have those ideas written up somewhere?
>
>
>> Fundamentally, I want my clients to do as little work as possible since
>> they're meant to be running business processes.
>>
>
> Same.
>
>
>>
>> I'll definitely be keeping an eye on this conversation though.
>>
>> Thanks,
>>
>
> Thanks for your time and work you're putting in here.
>
>
> David
>
>
>>
>> Trevor
>>
>> On Tue, Jan 31, 2017 at 11:04 AM, David Schmitt <david.schm...@puppet.com>
>> wrote:
>>
>>> Hi *,
>>>
>>> The type and provider API has been the bane of my existence since I
>>> [started writing native resources](https://github.com/DavidS/puppet-mysql-old/commit/d33c7aa10e3a4bd9e97e947c471ee3ed36e9d1e2).
>>> Now, finally, we'll do something about it. I'm currently working on
>>> designing a nicer API for types and providers. My primary goal is to
>>> provide a smooth and simple ruby developer experience for both scripters
>>> and coders. Secondary goals are to eliminate server side code, and make
>>> puppet 4 data types available. Currently this is completely aspirational
>>> (i.e. no real code has been written), but early private feedback was
>>> encouraging.
>>>
>>> To showcase my vision, this
>>> [gist](https://gist.github.com/DavidS/430330ae43ba4b51fe34bd27ddbe4bc7)
>>> has the
>>> [apt_key type](https://github.com/puppetlabs/puppetlabs-apt/blob/master/lib/puppet/type/apt_key.rb)
>>> and
>>> [provider](https://github.com/puppetlabs/puppetlabs-apt/blob/master/lib/puppet/provider/apt_key/apt_key.rb)
>>> ported over to my proposal. The second example there
>>> is a more long-term teaser on what would become possible with such an API.
>>>
>>> The new API, like the existing, has two parts: the implementation that
>>> interacts with the actual resources, a.k.a. the provider, and information
>>> about what the implementation is all about. Due to the different usage
>>> patterns of the two parts, they need to be passed to puppet in two
>>> different calls:
>>>
>>> The `Puppet::SimpleResource.implement()` call receives the
>>> `current_state = get()` and `set(current_state, target_state, noop)`
>>> methods. `get` returns a list of discovered resources, while `set` takes
>>> the target state and enforces those goals on the subject. There is only a
>>> single (ruby) object throughout an agent run, which can easily do caching
>>> and whatever else the provider needs to function well. The
>>> state descriptions passed around are simple lists of key/value hashes
>>> describing resources. This will allow the implementation wide latitude in
>>> how to organise itself for simplicity and efficiency.
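
To make the `get`/`set` contract above concrete, here is a minimal sketch in
plain Ruby. It does not use the proposed `Puppet::SimpleResource` API (which
the message describes as aspirational); `InMemoryKeyProvider`, its in-memory
store, and the returned list of change messages are all assumptions made for
illustration.

```ruby
# Hypothetical provider-shaped object; plain Ruby, no Puppet classes involved.
class InMemoryKeyProvider
  def initialize(keys)
    # Stand-in for real system state; a real provider would query the system.
    @keys = keys.map { |key| [key[:name], key.dup] }.to_h
  end

  # Returns the discovered resources as a list of key/value hashes.
  def get
    @keys.values.map(&:dup)
  end

  # Enforces target_state and returns a list of change messages (an assumption;
  # the proposal does not say what `set` returns).
  def set(current_state, target_state, noop = false)
    current = current_state.map { |resource| [resource[:name], resource] }.to_h
    changes = []
    target_state.each do |target|
      existing = current[target[:name]]
      wanted_absent = target[:ensure] == 'absent'
      # Skip resources that are already in the desired state.
      next if wanted_absent ? existing.nil? : existing == target

      action = wanted_absent ? 'remove' : (existing ? 'update' : 'create')
      changes << "#{target[:name]}: #{action}"
      next if noop # report the change, but do not touch the store

      if wanted_absent
        @keys.delete(target[:name])
      else
        @keys[target[:name]] = target.dup
      end
    end
    changes
  end
end

provider = InMemoryKeyProvider.new([{ name: 'A1', ensure: 'present' }])
target = [{ name: 'A1', ensure: 'present' }, { name: 'B2', ensure: 'present' }]
dry_run = provider.set(provider.get, target, true)
real_run = provider.set(provider.get, target, false)
puts dry_run, real_run
```

The whole list of resources arrives in one call, which is what would leave
room for batching inside `set`.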
>>>
>>> The `Puppet::SimpleResource.define()` call provides a data-only
>>> description of the Type. This is all that is needed on the server side to
>>> compile a manifest. Thanks to puppet 4 data type checking, this will
>>> already be much more strict (with less effort) than possible with the
>>> current APIs, while providing more automatically readable documentation
>>> about the meaning of the attributes.
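
A data-only description of this kind could be as simple as a nested hash. The
sketch below is an assumption about the shape, not the proposed format: the
key names (`attributes`, `type`, `docs`, `namevar`, `read_only`, `default`)
and the `example_key` type are invented, and the type expressions are kept as
plain strings rather than being parsed by Puppet.

```ruby
# Hypothetical, data-only type description; no code needs to run to read it.
COMMON_ATTRIBUTES = {
  ensure: { type: 'Enum[present, absent]',
            docs: 'Whether the resource should exist.',
            default: 'present' }
}.freeze

# Related types can share attributes by merging the attribute hashes.
EXAMPLE_KEY_TYPE = {
  name: 'example_key',
  docs: 'Illustrative type description; not the real apt_key definition.',
  attributes: COMMON_ATTRIBUTES.merge(
    id:      { type: 'String', docs: 'Key identifier.', namevar: true },
    created: { type: 'String', docs: 'Creation date reported by the system.',
               read_only: true }
  )
}.freeze

# Tooling can reuse the same data, e.g. to list read-only attributes ...
def read_only_attributes(type_definition)
  type_definition[:attributes].select { |_name, spec| spec[:read_only] }.keys
end

# ... or to render attribute documentation.
def documentation_lines(type_definition)
  type_definition[:attributes].map do |name, spec|
    "#{name} (#{spec[:type]}): #{spec[:docs]}"
  end
end

puts documentation_lines(EXAMPLE_KEY_TYPE)
```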
>>>
>>>
>>> Details in no particular order:
>>>
>>> * All of this should fit on any unmodified puppet4 installation. It is
>>> completely additive and optional. Currently.
>>>
>>> * The Type definition
>>>   * It is data-only.
>>>   * Refers to puppet data types.
>>>   * No code runs on the server.
>>>   * This information can be re-used in all tooling around
>>> displaying/working with types (e.g. puppet-strings, console, ENC, etc.).
>>>   * autorelations are restricted to unmodified attribute values and
>>> constant values.
>>>   * No more `validate` or `munge`! For the edge cases not covered by
>>> data types, runtime checking can happen in the implementation on the agent.
>>> There it can use local system state (e.g. different mysql versions have
>>> different max table length constraints), and it will only fail the part of
>>> the resource tree that depends on this error. There is already ample
>>> precedent for runtime validation, as most remote resources do not try to
>>> replicate the validation their target is already doing anyway.
>>>   * It maps 1:1 to the capabilities of PCore, and is similar to the
>>> libral interface description (see
>>> [libral#1](https://github.com/puppetlabs/libral/pull/2)). This ensures
>>> future interoperability between the different parts of the ecosystem.
>>>   * Related types can share common attributes by sharing/merging the
>>> attribute hashes.
>>>   * `defaults`, `read_only`, and similar data about attributes in the
>>> definition are mostly aesthetic at the current point in time, but will make
>>> for better documentation, and allow more intelligence built on top of this
>>> later.
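
The runtime-validation point in the list above can be sketched as follows. The
flavour names and length limits are invented placeholders (real limits would
come from the local database server), and `enforce` is a hypothetical helper,
not part of any Puppet API.

```ruby
# Hypothetical limit lookup; a real provider would ask the local server.
# The flavour names and numbers below are made up for this example.
def max_table_name_length(server_flavour)
  { 'flavour_a' => 16, 'flavour_b' => 8 }.fetch(server_flavour)
end

# Validates each resource against local state at enforcement time.
# A failing resource is reported on its own; the others still proceed.
def enforce(target_state, server_flavour)
  limit = max_table_name_length(server_flavour)
  target_state.map do |resource|
    if resource[:name].length > limit
      { name: resource[:name], status: :failed,
        message: "name exceeds #{limit} characters" }
    else
      { name: resource[:name], status: :ok }
    end
  end
end

results = enforce([{ name: 'short' }, { name: 'a_very_long_table_name' }], 'flavour_b')
results.each { |result| puts result.inspect }
```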
>>>
>>> * The implementation consists of two simple functions,
>>> `current_state = get()` and `set(current_state, target_state, noop)`.
>>>   * `get` on its own is already useful for many things, like puppet
>>> resource.
>>>   * `set` receives the current state from `get`. While this is necessary
>>> for proper operation, there is a certain race condition there, if the
>>> system state changes between the calls. This is no different than what
>>> current implementations face, and they are well-equipped to deal with this.
>>>   * `set` is called with a list of resources, and can do batching if it
>>> is beneficial. This is not yet supported by the agent.
>>>   * the `current_state` and `target_state` values are lists of simple
>>> data structures built up of primitives like strings, numbers, hashes and
>>> arrays. They match the schema defined in the type.
>>>   * Calling `r.set(r.get, r.get)` would ensure the current state. This
>>> should run without any changes, proving the idempotency of the
>>> implementation.
>>>   * The ruby instance hosting the `get` and `set` functions is only
>>> alive for the duration of an agent transaction. An implementation can
>>> provide an `initialize` method to read credentials from the system, and
>>> set up other things as required. The single instance is used for all
>>> instances of the resource.
>>>   * There is no direct dependency on puppet core libraries in the
>>> implementation.
>>>     * While implementations can use utility functions, they are
>>> completely optional.
>>>     * The dependencies on the `logger`, `commands`, and similar
>>> utilities can be supplied by a small utility library (TBD).
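
The `r.set(r.get, r.get)` idempotency check mentioned in the list above could
be exercised roughly like this. `CountingProvider` and its `writes` counter
are stand-ins invented for the example so the check has something observable
to assert on; a real implementation would need its own way to detect changes.

```ruby
# Hypothetical stand-in provider that counts how often it changes anything.
class CountingProvider
  attr_reader :writes

  def initialize(state)
    @state = state.map(&:dup)
    @writes = 0
  end

  def get
    @state.map(&:dup)
  end

  def set(current_state, target_state, noop = false)
    current = current_state.map { |resource| [resource[:name], resource] }.to_h
    target_state.each do |target|
      next if current[target[:name]] == target # already in the desired state
      next if noop

      @writes += 1
      @state.reject! { |resource| resource[:name] == target[:name] }
      @state << target.dup
    end
  end
end

# Feeding the current state back in as the target should change nothing.
def idempotent?(provider)
  state_before = provider.get
  writes_before = provider.writes
  provider.set(provider.get, provider.get)
  provider.get == state_before && provider.writes == writes_before
end

provider = CountingProvider.new([{ name: 'a', ensure: 'present' }])
puts idempotent?(provider)
```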
>>>
>>> * Having a well-defined small API makes remoting, stacking, proxying,
>>> batching, interactive use, and other shenanigans possible, which will make
>>> for an interesting time ahead.
>>>
>>> * The logging of updates to the transaction is only a sketch. See the
>>> usage of `logger` throughout the example. I've tried different styles for
>>> fit.
>>>   * the `logger` is the primary way of reporting back information to the
>>> log, and the report.
>>>   * results can be streamed for immediate feedback
>>>   * block-based constructs allow detailed logging with little code
>>> ("Started X", "X: Doing Something", "X: Success|Failure", with one or two
>>> calls, and only one reference to X)
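
One way the block-based logging described above might look is sketched below.
`SketchLogger` and its `processing` method are hypothetical names, not the
`logger` from the gist, and swallowing the exception after logging it is a
simplification chosen for the example.

```ruby
# Hypothetical block-based logger; collects lines in memory for illustration.
class SketchLogger
  attr_reader :lines

  def initialize
    @lines = []
  end

  # Wraps a unit of work: logs the start, then success or failure.
  # The subject is named once; the block receives a callable for progress notes.
  def processing(subject)
    @lines << "Started #{subject}"
    yield ->(message) { @lines << "#{subject}: #{message}" }
    @lines << "#{subject}: Success"
  rescue StandardError => e
    # Simplification: record the failure and carry on with the next resource.
    @lines << "#{subject}: Failure (#{e.message})"
  end
end

logger = SketchLogger.new
logger.processing('key A1') { |log| log.call('Doing something') }
logger.processing('key B2') { |_log| raise 'boom' }
puts logger.lines
```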
>>>
>>> * Obviously this is not sufficient to cover everything existing types
>>> and providers are able to do. For the first iteration we are choosing
>>> simplicity over functionality.
>>>   * Generating more resource instances for the catalog during
>>> compilation (e.g. file#recurse or concat) becomes impossible with a pure
>>> data-driven Type. There is still space in the API to add server-side code.
>>>   * Some resources (e.g. file, ssh_authorized_keys, concat) cannot or
>>> should not be prefetched. While it might not be convenient, a provider
>>> could always return nothing on the `get()` and do a more customized enforce
>>> motion in the `set()`.
>>>   * With current puppet versions, only "native" data types will be
>>> supported, as type aliases do not get pluginsynced. Yet.
>>>   * With current puppet versions, `puppet resource` can't load the data
>>> types, and therefore will not be able to take full advantage of this. Yet.
>>>
>>> * There is some "convenient" infrastructure (e.g. parsedfile) that needs
>>> porting over to this model.
>>>
>>> * Testing becomes possible on a completely new level. The test library
>>> can know how data is transformed outside the API, and - using the shape of
>>> the type - start generating test cases, and checking the actions of the
>>> implementation. This will require developer help to isolate the
>>> implementation from real systems, but it should go a long way towards
>>> reducing the tedium in writing tests.
>>>
>>>
>>> What do you think about this?
>>>
>>>
>>> Cheers, David
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Puppet Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to puppet-dev+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/ms
>>> gid/puppet-dev/CALF7fHaJdvPrkqRQEMqEgLSUvOy-O4DuL-iNsLrPt74H
>>> Y7djvw%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/puppet-dev/CALF7fHaJdvPrkqRQEMqEgLSUvOy-O4DuL-iNsLrPt74HY7djvw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Trevor Vaughan
>> Vice President, Onyx Point, Inc
>> (410) 541-6699 x788
>>
>> -- This account not approved for unencrypted proprietary information --
>>
>>
>
>



-- 
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699 x788

-- This account not approved for unencrypted proprietary information --

-- 
You received this message because you are subscribed to the Google Groups 
"Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to puppet-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/puppet-users/CANs%2BFoVZZ6Wr2iW04vOTGpQCyfSwdNTo64prpw6MbN3_feJKxw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
