On Fri, Oct 24, 2014 at 10:48 AM, Luke Kanies <[email protected]> wrote:

> On Oct 24, 2014, at 9:59 AM, Andy Parker <[email protected]> wrote:
>
> On Fri, Oct 24, 2014 at 2:47 AM, Erik Dalén <[email protected]>
> wrote:
>
>> On 24 October 2014 03:24, Henrik Lindberg <[email protected]> wrote:
>>
>>> On 2014-24-10 2:04, Andy Parker wrote:
>>>
>>>> A while ago we removed support for puppet to *send* YAML on the network.
>>>> At the same time we converted to using safe_yaml for receiving YAML in
>>>> order to keep compatibility with existing agents. Instead of YAML all of
>>>> the communication was done with PSON, which is a variant of JSON that
>>>> has been in use in puppet since at least 2010. As far as I understand,
>>>> PSON started out as simply a vendored version of json_pure. The name
>>>> PSON was apparently chosen because Rails would try to patch anything
>>>> named JSON, so it needed a different name to stop that from happening
>>>> (that is all hearsay, so I don't know how accurate it is).
>>>>
>>>> Over time PSON started to evolve. Little changes were made to it here
>>>> and there. The largest change came about because of
>>>> http://projects.puppetlabs.com/issues/5261. The changes for that ticket
>>>> removed the restriction that only valid UTF-8 could be sent in PSON,
>>>> which opened the door to a) binary data as file contents and b)
>>>> absolutely no control over what encodings puppet was using. Over time
>>>> there have been a large number of issues related to not keeping track
>>>> of what encoding puppet is dealing with.
>>>>
>>>> I'd like to move us away from PSON and onto a standard format. YAML is
>>>> out of the question because it is either slow and unsafe (all of the
>>>> YAML vulnerabilities) or extremely slow and safe (safe_yaml).
>>>> MessagePack might be nice: it is pretty well specified and has a fairly
>>>> large number of libraries written for it, but it doesn't do much to help
>>>> us solve the wild west of encoding in puppet. MessagePack doesn't really
>>>> enforce string encodings; everything is treated as an array of bytes.
>>>>
>>>> In order to keep consistency across various puppet projects we'll be
>>>> going with JSON. JSON requires that everything is valid UTF-8, which
>>>> forces us to be deliberate about how we handle data. JSON is pretty fast
>>>> (not as fast as MessagePack) and there are plenty of alternative
>>>> libraries if it turns out that the built-in json isn't fast enough
>>>> (puppet-server could use jrjackson, for instance).
>>>>
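
To illustrate the deliberateness I mean, here is a minimal Ruby sketch (the
exact exception depends on the JSON library in use, so treat the details as
an assumption rather than a spec):

    require 'json'

    utf8 = "präsent".encode(Encoding::UTF_8)
    blob = "\xC0\xFF".force_encoding(Encoding::BINARY)  # not valid UTF-8

    JSON.generate("content" => utf8)  # fine: the output is valid UTF-8
    JSON.generate("content" => blob)  # raises (e.g. JSON::GeneratorError),
                                      # so bad data is rejected at the boundary
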
>>>> So what all would be changing?
>>>>
>>>>    1. Network communication that is using PSON would move to JSON.
>>>>    2. YAML files that the master and agent write would move to JSON
>>>>       (node, facts, last_run_summary, state, etc.).
>>>>    3. A new exec node terminus would be written to handle JSON, or the
>>>>       existing one would be updated (check the first byte for '{'; see
>>>>       the sketch just below this list).
>>>>
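
To make item 3 concrete, the check I have in mind is roughly the following
(the method name is made up for illustration, not the real terminus API):

    require 'json'
    require 'yaml'

    # Sketch: decide how to parse ENC output by peeking at the first
    # non-whitespace character. '{' means JSON; anything else falls back
    # to YAML so existing ENCs keep working.
    def parse_node_data(output)
      if output.lstrip.start_with?('{')
        JSON.parse(output)
      else
        YAML.safe_load(output)
      end
    end
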
>>>> That is just some of the changes that will need to happen. There will be
>>>> a ripple of other changes based on the fact that JSON has to be UTF-8.
>>>>
>>>>    1. A new "encoding" parameter on File and a base64() function (a
>>>>       rough sketch follows below). This will allow transferring
>>>>       non-UTF-8 data as file content until we can get a new catalog
>>>>       structure that allows tracking data types, and more changes to
>>>>       the language to differentiate Strings from Blobs.
>>>>
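
As a rough illustration of that item (the parameter and function names are
placeholders, not a final API): in the manifest the content would be wrapped
with base64() and encoding => 'base64' set on the File resource, and the
mechanics on either end are just:

    require 'base64'

    # Master side: wrap arbitrary bytes so the catalog stays valid UTF-8 JSON.
    wrapped = Base64.strict_encode64(File.binread('/tmp/logo.png'))

    # Agent side: when the (hypothetical) encoding parameter is 'base64',
    # decode before writing the file to disk.
    raw = Base64.strict_decode64(wrapped)
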
>>>
>>> I would like us to add a Binary datatype upfront instead of doing the
>>> base64 encoding in the puppet code. Instead, it is the serialization
>>> format's responsibility to transform it into a form that can be
>>> transported: JSON, being a text format, can do the base64 encoding, while
>>> a binary format such as MsgPack can carry the binary directly.
>>>
>>> Even if our first cut of this always performs a base64 encoding, the user
>>> logic does not have to change.
>>>
>>> Thus, instead of calling base64(content) and setting the encoding in the
>>> File resource, a Binary is created directly with a binary(encoding,
>>> content) function.
>>>
>>
>> How do you differentiate between an encoded binary string and a regular
>> string in the JSON though?
>> You would need some sort of annotation, and if that annotation lives
>> inside the string (which it already does in the content parameter of
>> files, btw) you might need a way to escape it so that a regular string
>> can still contain the annotation text.
>>
>
> I talked to Henrik about this and his idea is that we make file content a
> special case. We write a binary() function that takes a String and produces
> a hash of { "encoding" => ..., "data" => ... } (or something like that) in
> the serialized form. Then the file content parameter is changed to allow
> either a string or a hash of that structure. We could even implement this
> as a type in the puppet language and update the serializer to do that.
> Perhaps we
> should also create a new binary_file() function so that non-UTF-8 values
> don't leak in via file().
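
A rough sketch of what I mean (the hash shape is a placeholder, not a
settled format):

    require 'base64'
    require 'json'

    # Hypothetical binary() function: wrap a possibly non-UTF-8 string in a
    # structure that serializes cleanly as UTF-8 JSON.
    def binary(string)
      { "encoding" => "base64", "data" => Base64.strict_encode64(string) }
    end

    # The content parameter would accept either a plain String or that hash.
    JSON.generate("content" => binary(File.binread('/tmp/blob')))

    # Deserializing side: a Hash with an "encoding" key is decoded back to
    # bytes, a plain String is used as-is.
    def decode_content(value)
      value.is_a?(Hash) ? Base64.strict_decode64(value["data"]) : value
    end
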
>
>
> Can’t we switch file serving to just do raw downloads?  Why do they even
> need encoding at all?
>
>
File serving is already done that way. We switched file buckets to that
system a few releases ago as well, IIRC. The problem isn't the file server
or the file bucket, but file resources in manifests that have a "content"
parameter with non-UTF-8 data.


> Especially if we focus on getting the static catalog to work, all file
> serving turns into a plain HTTP get, and it should skip all of the Puppet
> transfer, encoding, etc.
>
>
The static compiler deals with the source parameter, not the content
parameter (although it could, I suppose). The current implementation also
has the problem that it takes over the content parameter for another
meaning, which has caught several people out (try saving a file whose
literal content is "{md5}abdefabcdef").
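
Roughly, the ambiguity looks like this (a sketch of the behaviour, not the
actual agent code):

    # Anything that looks like a checksum reference is treated as one, so a
    # file whose literal content starts with "{md5}" cannot be expressed.
    def checksum_reference?(content)
      content.is_a?(String) && !!(content =~ /\A\{\w+\}/)
    end

    checksum_reference?("hello world")        # => false, written literally
    checksum_reference?("{md5}abdefabcdef")   # => true, treated as a pointer
                                              #    into the file bucket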


> --
> http://puppetlabs.com/ | http://about.me/lak | @puppetmasterd
>



-- 
Andrew Parker
[email protected]
Freenode: zaphod42
Twitter: @aparker42
Software Developer

*Join us at **PuppetConf 2015, October 5-9 in Portland, OR - *
http://2015.puppetconf.com
*Register early to save 40%!*
