Re: [Puppet-dev] RFC - A specification for module schemas

Corey Osman Sat, 30 Jan 2016 10:47:23 -0800


On Friday, January 29, 2016 at 10:47:48 PM UTC-8, R.I. Pienaar wrote:
>
>
>
> ----- Original Message ----- 
> > From: "Corey Osman" <co...@logicminds.biz>
> > To: "puppet-dev" <puppe...@googlegroups.com>
> > Sent: Saturday, January 30, 2016 5:45:05 AM 
> > Subject: [Puppet-dev] RFC - A specification for module schemas 
>
> > Hi, 
> > 
> > I wanted to bring up a conversation in hopes that we as a community can
> > create a specification for something I am calling module schemas. Before I
> > get into that I want to provide a little background info.
> > 
> > This all started a few years ago when hiera first came out. Data separation
> > in the form of parameters and automatic hiera lookups quickly became the
> > norm, and reusable modules exploded into what the forge is today. Because of
> > the popularity of hiera, though, data validation is now a major problem.
> > Without good data, excellent modules become useless.
> > 
> > Puppet 4 and stdlib brought many new functions and ways to validate incoming
> > data, and I now consider puppet 4 to be a loosely typed language. Hell,
> > there was even this a long time ago:
> > https://github.com/puppetlabs/puppetlabs-kwalify
> > But puppet only does so much, and while having validation reside in code
> > might make troubleshooting a snap, there is still a delay in the feedback
> > loop when the code is tightly coupled with an external “database” of data,
> > data that is inserted by non-puppet developers who don’t know YAML or data
> > structures.
> > 
> > So with that said I want to introduce something new to puppet module
> > development, called module schemas. A module schema is a specification that
> > details the inner workings of a module. For right now this means a detailed
> > specification of all the parameters for classes and definitions used inside
> > a module, whose goal is to make it impossible to insert a bad data
> > structure. But ideally, we can specify so much more (functions, types,
> > providers, templates), even hiera calls in weird places like templates and
> > functions, which are usually things that do not get documented, are hard to
> > reference, and usually require looking at source code.
> > 
> > What does such a schema look like? 
> > 
> > Here is an example schema for the apache module, which contains 446
> > parameters!
> >
> > https://github.com/logicminds/puppet_module_schemas/blob/master/apache_schema.yaml
>
> This in general is something I've wanted for a long time, and I think we're
> almost getting it for free now in Puppet 4.
>
> In Puppet 4 you can do: 
>
>    class x(String $y) { }
>
> or
>
>    class x(String[1,10] $y) { }
>
> or
>
>    class x(Pattern[/\A[a-z].*/] $y) { }
>
> or
>
>    class x(Enum["stopped", "running"] $y) { }
>
> and many more, including very complex matchers. This is a lot more featureful
> AND maps 1:1 to the capabilities puppet has natively.
>
 
This is one drawback of using an external schema parser: puppet has far more
useful types to check against. Of course Puppet 3 only has the basics (bool,
string, array, hash). I have thought about forking the kwalify parser and
adding more data types so that it would be aware of some puppet data types
(absolute path, cert_type, ...). I could go down that route, but I would
probably be the only maintainer.
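
As a rough illustration of what a more puppet-aware external validator might
look like, here is a minimal Python sketch. It is not kwalify and does not use
the kwalify schema format: the schema layout, the CHECKERS table, the
validate() helper and the sample data are all assumptions made up for the
example. It only shows the idea of adding a Puppet-flavoured type
(absolute_path) next to the basic ones.

```python
import posixpath

# Hypothetical type checkers: each maps a schema type name to a predicate.
CHECKERS = {
    "bool": lambda v: isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "array": lambda v: isinstance(v, list),
    "hash": lambda v: isinstance(v, dict),
    # A Puppet-flavoured extra type: a string that is an absolute POSIX path.
    "absolute_path": lambda v: isinstance(v, str) and posixpath.isabs(v),
}


def validate(schema, data):
    """Return a list of error strings; an empty list means the data passed."""
    errors = []
    for key, value in data.items():
        rule = schema.get(key)
        if rule is None:
            errors.append(f"{key}: not a known parameter")
            continue
        if not CHECKERS[rule["type"]](value):
            errors.append(f"{key}: expected {rule['type']}, got {value!r}")
    for key, rule in schema.items():
        if rule.get("required") and key not in data:
            errors.append(f"{key}: required parameter is missing")
    return errors


# Made-up schema and hiera-style data, keyed by fully qualified parameter name.
schema = {
    "apache::docroot": {"type": "absolute_path", "required": True},
    "apache::default_vhost": {"type": "bool"},
}
data = {"apache::docroot": "var/www", "apache::default_vhost": "yes"}

for line in validate(schema, data):
    print(line)
```

Both sample values fail here: the docroot is a relative path and the vhost
flag is a string rather than a boolean. Adding another type such as cert_type
would just mean another entry in the table.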



 

>
> I think there are ways now to introspect the classes and extract this
> metadata automagically. If not, then I think *that* is the feature we should
> get added to Puppet, and from there build the external validation,
> introspection and testing for data, as that will give a solution that
> progresses as Puppet does and gives a lot more "real" results than trying to
> map this stuff externally to what Puppet supports.
>
> The puppet lookup or similar CLI can be extended to include validation.
>

While having this built into puppet would be ideal, there are still people
on 2.7, and many more on 3.x, so it might take some time to migrate them to
4.3.x. Not to mention that almost all forge modules don't include type
checking for fear that they will discriminate against 3.x users. (At least
that's how I feel. Internal private modules are a different story.)

Having a tool external to puppet means that it is version independent. You
don't have to upgrade to puppet 4.x to get validation. I think this alone
is a very good use case. I also believe there is room for an internal
puppet tool as well, which would eventually replace the external tool.
Furthermore, having an external schema also means that when you do upgrade
to puppet 4.x you can map your external schema to puppet data types, and a
tool could retrofit your 3.x code with those data types automatically.
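
To make the retrofit idea a bit more concrete, here is a small Python sketch
that turns schema entries into a typed Puppet 4 class header. No such tool
exists in this thread: the PUPPET_TYPES mapping, the "enum" and "required"
keys and both helper functions are assumptions invented for the example, and a
real tool would need to rewrite existing manifests rather than just print a
header.

```python
# Assumed mapping from basic schema type names to Puppet 4 data types.
PUPPET_TYPES = {
    "bool": "Boolean",
    "string": "String",
    "array": "Array",
    "hash": "Hash",
    "int": "Integer",
}


def puppet_type(rule):
    """Translate one (hypothetical) schema rule into a Puppet 4 type."""
    if "enum" in rule:
        choices = ", ".join(f"'{c}'" for c in rule["enum"])
        base = f"Enum[{choices}]"
    else:
        base = PUPPET_TYPES[rule["type"]]
    # Parameters that are not required may be undef, hence Optional[...].
    return base if rule.get("required") else f"Optional[{base}]"


def class_signature(name, params):
    """Render a typed class header for the given parameter rules."""
    lines = [f"class {name} ("]
    for param, rule in params.items():
        default = "" if rule.get("required") else " = undef"
        lines.append(f"  {puppet_type(rule)} ${param}{default},")
    lines.append(") { }")
    return "\n".join(lines)


# Made-up parameters, loosely following the class x examples quoted above.
params = {
    "ensure": {"type": "string", "enum": ["stopped", "running"], "required": True},
    "servername": {"type": "string"},
}
print(class_signature("x", params))
```

For the sample input this prints a class x header with an Enum-typed $ensure
and an Optional[String] $servername defaulting to undef.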

 

>
>
> > 
> > The most immediate use case for such a schema is hiera validation, as I have
> > outlined here:
> > http://logicminds.github.io/blog/2016/01/16/testing-hiera-data
> > Which works AWESOME! We are validating hiera data, not YAML, and doing it
> > in under 500 ms for every commit on every single file.
> >
> > As a community we need a solution for validating hiera data. It's my belief
> > that schemas are the way to go. After all, hiera data is now in modules with
> > no easy way to validate it.
> > 
> > Other use cases that come to mind:
> >
> >  - generating documentation (Many modules on the forge usually contain a
> >    static map of parameters used inside the module). If a schema was
> >    present, we could just generate that same map automatically.
> >
> >  - useful for other 3rd party tools like puppet strings
> >
> >  Parameter specification lookup
> >  - Imagine a face that shows internal puppet module specifications. I am
> >    not talking about puppet-strings; this would detail the parameters given
> >    a class, or an example parameter value given a parameter name.
> >
> >    Scenario:
> >      - puppet module puppetlabs/apache (outputs all the parameters and
> >        classes for that module in a specified format, json or yaml)
> >      - puppet module puppetlabs-apache::class_name (outputs all the
> >        parameters for the class in a specified format, json or yaml)
> >      - puppet module puppetlabs-apache::class_name::param1 (outputs an
> >        example value for that parameter, as well as the default value, in a
> >        specified format, json or yaml)
> > 
> > Foreman and Puppet Console need this level of detail as well. Currently,
> > both of these solutions spend quite a bit of time parsing code to show
> > parameters for UI display. It would be much easier if a schema was available
> > that detailed this level of data. Think of the speed improvements that could
> > be had if this information was “cached” in a file. These solutions currently
> > load or intelligently scan all the puppet code for every puppet environment
> > to get the parameters and defaults.
> > 
> > Here is how we can create a schema:
> > http://logicminds.github.io/blog/2016/01/15/how-to-build-a-module-schema/
> > (which I even automated with puppet-retrospec,
> > https://github.com/nwops/puppet-retrospec.git)
> > 
> > However, we all need to agree on something before schemas can ever be a
> > “thing”. We need a schema for module schemas. This is important because as
> > soon as 3rd party tools or scripts start to use schemas and we later decide
> > the schema needs changing, everything breaks. Tools need a specification to
> > work from.
> > 
> > So with this in mind and an example schema here:
> > https://github.com/logicminds/puppet_module_schemas/blob/master/apache_schema.yaml
> >
> > How can this be improved? What should we add?
> >
> > About the only change I was pondering was adding another object for the
> > types themselves:
> > https://github.com/logicminds/puppet_module_schemas/blob/master/specification_with_types.yaml
>
> > 
> > What are your thoughts? What steps do we need to take to make this a
> > supported specification? What would you desire in a module schema?
> >
> > Am I the only one that thinks this is a killer solution?
> > 
> > 
> > Corey Osman 
> > 
> > -- 
> > You received this message because you are subscribed to the Google Groups
> > "Puppet Developers" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to puppet-dev+...@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/puppet-dev/27236109-21A1-461F-B02D-10ACAB9D3118%40nwops.io.
> > For more options, visit https://groups.google.com/d/optout.
>

