On 2014-24-10 2:04, Andy Parker wrote:
A while ago we removed support for puppet to *send* YAML on the network.
At the same time we converted to using safe_yaml for receiving YAML in
order to keep compatibility with existing agents. Instead of YAML all of
the communication was done with PSON, which is a variant of JSON that
has been in use in puppet since at least 2010. As far as I understand
PSON started out as simply a vendored version of json_pure. The name
PSON was apparently because rails would try to patch anything named
JSON, and so they needed to name it something different to stop that
from happening (that is all hearsay, so I don't know how truthful it is).

Over time PSON started to evolve. Little changes were made to it here
and there. The largest change came about because of
http://projects.puppetlabs.com/issues/5261. The changes for that ticket
removed the restriction that only valid UTF-8 could be sent in PSON,
which opened the door to a) binary data as file contents and b)
absolutely no control over what encodings puppet was using. Over time
there have been a large number of issues that have been related to not
keeping track of what encoding puppet is dealing with.

I'd like to move us away from PSON and onto a standard format. YAML is
out of the question because it is either slow and unsafe (all of the
YAML vulnerabilities) or extremely slow and safe (safe_yaml).
MessagePack might be nice. It is pretty well specified, has a fairly
large number of libraries written for it, but it doesn't do much to help
us solve the wild west of encoding in puppet. In MessagePack there
aren't really any enforcements of string encodings and everything is
treated as an array of bytes.

In order to keep consistency across various puppet projects we'll be
going with JSON. JSON requires that everything is valid UTF-8, which
gives us a nice deliberateness to handling data. JSON is pretty fast
(not as fast as MessagePack) and there are a lot of libraries if it
turns out that the built in json isn't fast enough (puppet-server could
use jrjackson, for instance).

So what all would be changing?

   1. Network communication that is using PSON would move to JSON
   2. YAML files that the master and agent write would move to JSON
(node, facts, last_run_summary, state, etc.).
   3. A new exec node terminus would be written to handle JSON, or the
existing one would be updated (check the first byte for '{').
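
As a rough illustration of the "check the first byte" idea in item 3, here is a minimal Ruby sketch. The helper name `parse_exec_node_output` and the sample node data are hypothetical and not part of puppet's exec terminus; the sketch only assumes that JSON output from a node script is always an object and therefore starts with '{'.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper (not puppet's actual terminus code): decide how to
# parse the output of an exec node script by looking at its first byte.
def parse_exec_node_output(output)
  if output.start_with?('{')
    # A leading '{' is taken to mean the script emitted a JSON object.
    JSON.parse(output)
  else
    # Anything else is treated as YAML from an older script; safe_load
    # refuses to instantiate arbitrary Ruby objects.
    YAML.safe_load(output)
  end
end

json_out = '{"classes": ["ntp"], "environment": "production"}'
yaml_out = "---\nclasses:\n  - ntp\nenvironment: production\n"

json_node = parse_exec_node_output(json_out)
yaml_node = parse_exec_node_output(yaml_out)
```

Both calls produce the same hash here, so existing YAML-emitting scripts would keep working while new ones switch to JSON. A YAML document written in flow style can also begin with '{', so a real implementation would need to decide how to handle that case.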

Those are just some of the changes that will need to happen. There will be
a ripple of other changes based on the fact that JSON has to be UTF-8.

   1. A new "encoding" parameter on File and a base64() function. This
will allow transferring non-UTF-8 data as file content until we can get
a new catalog structure that allows tracking data types and more changes
to the language to differentiate Strings from Blobs.
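
To make the base64 route concrete, here is a small Ruby sketch of moving non-UTF-8 file content through JSON. The hash keys `content` and `encoding` are an assumed shape chosen for illustration, not an agreed catalog format; base64 is done with `pack('m0')`, which is what Ruby's `Base64.strict_encode64` uses internally.

```ruby
require 'json'

# Bytes that are not valid UTF-8, e.g. the start of a binary file.
raw = "\xFF\xFE\x00binary".b

# Interpreted as UTF-8, these bytes are invalid, so they cannot be
# placed into a JSON string as they are.
valid_as_utf8 = raw.dup.force_encoding('UTF-8').valid_encoding?

# Base64 turns the bytes into plain ASCII, which is always valid UTF-8.
encoded = [raw].pack('m0')
payload = JSON.generate('content' => encoded, 'encoding' => 'base64')

# The receiving side reverses the steps to get the original bytes back.
parsed  = JSON.parse(payload)
decoded = parsed['content'].unpack1('m0')
```

The cost of this approach is that the manifest author has to know about the encoding, which is what the `encoding` parameter and `base64()` function would expose.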

I would like us to add a Binary datatype upfront instead of doing the base64 encoding in the puppet code. It is then the serialization format's responsibility to transform it into a form that can be transported. JSON in text form can then do the base64 encoding. A MsgPack / JSON can instead use the binary directly.

Even if our first cut of this always performs a base64 encoding, the user logic does not have to change.

Thus, instead of calling base64(content) and setting the encoding in the File resource, a Binary is created directly with a binary(encoding, content) function.
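
A rough Ruby sketch of that split, to show where the responsibility would sit. Everything here is hypothetical: the `Binary` class, the `__binary` key, and the two serializer functions are stand-ins made up for illustration, not an existing puppet API.

```ruby
require 'json'

# Hypothetical Binary value: user code only wraps the bytes and never
# deals with transport encodings.
class Binary
  attr_reader :bytes

  def initialize(bytes)
    @bytes = bytes.b # keep a copy tagged as raw bytes (ASCII-8BIT)
  end

  def to_base64
    [@bytes].pack('m0') # strict base64, no line breaks
  end

  def self.from_base64(text)
    new(text.unpack1('m0'))
  end
end

# Text serializer: JSON must be valid UTF-8, so it base64-encodes Binary.
def to_json_wire(value)
  data = value.is_a?(Binary) ? { '__binary' => value.to_base64 } : value
  JSON.generate(data)
end

# Stand-in for a binary-capable serializer: it can carry the bytes as-is.
def to_binary_wire(value)
  value.is_a?(Binary) ? value.bytes : value.to_s.b
end

blob      = Binary.new("\xFF\x00".b)
json_wire = to_json_wire(blob)
raw_wire  = to_binary_wire(blob)
restored  = Binary.from_base64(JSON.parse(json_wire)['__binary'])
```

The user-facing value is the same in both cases; only the serializer changes, which is why a first cut that always base64-encodes could later be swapped out without touching manifests.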

- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/
