On Fri, Oct 24, 2014 at 10:48 AM, Luke Kanies <[email protected]> wrote:
> On Oct 24, 2014, at 9:59 AM, Andy Parker <[email protected]> wrote:
>
> On Fri, Oct 24, 2014 at 2:47 AM, Erik Dalén <[email protected]> wrote:
>
>> On 24 October 2014 03:24, Henrik Lindberg <[email protected]> wrote:
>>
>>> On 2014-24-10 2:04, Andy Parker wrote:
>>>
>>>> A while ago we removed support for puppet to *send* YAML on the network.
>>>> At the same time we converted to using safe_yaml for receiving YAML in
>>>> order to keep compatibility with existing agents. Instead of YAML all of
>>>> the communication was done with PSON, which is a variant of JSON that
>>>> has been in use in puppet since at least 2010. As far as I understand
>>>> PSON started out as simply a vendored version of json_pure. The name
>>>> PSON was apparently because rails would try to patch anything named
>>>> JSON, and so they needed to name it something different to stop that
>>>> from happening (that is all hearsay, so I don't know how truthful it
>>>> is).
>>>>
>>>> Over time PSON started to evolve. Little changes were made to it here
>>>> and there. The largest change came about because of
>>>> http://projects.puppetlabs.com/issues/5261. The changes for that ticket
>>>> removed the restriction that only valid UTF-8 could be sent in PSON,
>>>> which opened the door to a) binary data as file contents and b)
>>>> absolutely no control over what encodings puppet was using. Over time
>>>> there have been a large number of issues that have been related to not
>>>> keeping track of what encoding puppet is dealing with.
>>>>
>>>> I'd like to move us away from PSON and onto a standard format. YAML is
>>>> out of the question because it is either slow and unsafe (all of the
>>>> YAML vulnerabilities) or extremely slow and safe (safe_yaml).
>>>> MessagePack might be nice. It is pretty well specified, has a fairly
>>>> large number of libraries written for it, but it doesn't do much to help
>>>> us solve the wild west of encoding in puppet. In MessagePack there
>>>> aren't really any enforcements of string encodings and everything is
>>>> treated as an array of bytes.
>>>>
>>>> In order to keep consistency across various puppet projects we'll be
>>>> going with JSON. JSON requires that everything is valid UTF-8, which
>>>> gives us a nice deliberateness to handling data. JSON is pretty fast
>>>> (not as fast as MessagePack) and there are a lot of libraries if it
>>>> turns out that the built in json isn't fast enough (puppet-server could
>>>> use jrjackson, for instance).
>>>>
>>>> So what all would be changing?
>>>>
>>>> 1. Network communication that is using PSON would move to JSON
>>>> 2. YAML files that the master and agent write would move to JSON
>>>>    (node, facts, last_run_summary, state, etc.).
>>>> 3. A new exec node terminus would be written to handle JSON, or the
>>>>    existing one would be updated (check the first byte for '{').
>>>>
>>>> That is just some of the changes that will need to happen. There will be
>>>> a ripple of other changes based on the fact that JSON has to be UTF-8.
>>>>
>>>> 1. A new "encoding" parameter on File and a base64() function.. This
>>>>    will allow transferring non-UTF-8 data as file content until we can get
>>>>    a new catalog structure that allows tracking data types and more changes
>>>>    to the language to differentiate Strings from Blobs.
>>>>
>>>
>>> I would like us to add a Binary datatype upfront instead of doing the
>>> base64 encoding in the puppet code. Instead, it is the serialization
>>> formats responsibility to transform it into a form that can be transported.
>>> A JSON in text form can then do the base64 encoding. A MsgPack / JSON can
>>> instead use the binary directly.
>>>
>>> Even if our first cut of this always performs a base64 encoding the user
>>> logic does not have to change.
>>>
>>> Thus, instead of calling base64(content) and setting the encoding in the
>>> File resource, a Binary is created directly with a binary(encoding,
>>> content) function.
>>>
>>
>> How do you differentiate between an encoded binary string and a regular
>> string in the JSON though?
>> You would need some sort of annotation, and if that is inside the string
>> (which it is in the content parameter of files already btw) you might need
>> a way to escape it to be able to have a regular string that contains that
>> annotation stuff.
>>
>
> I talked to Henrik about this and his idea is that we make file content a
> special case. We write a binary() function that takes a String and produces
> a hash of { "encoding" => ..., "data" => ... } (or something like that) in
> the serialized form. Then the file content is written to allow either a
> string or a hash of that structure. We could even implement this as a type
> in the puppet language and update the serializer to do that. Perhaps we
> should also create a new binary_file() function so that non-UTF-8 values
> don't leak in via file().
>
>
> Can’t we switch file serving to just do raw downloads? Why do they even
> need encoding at all?
>

File serving is already done that way. We switched file buckets to that system a few releases ago as well, IIRC. The problem isn't the file server or the file bucket, but file resources in manifests that have a "content" parameter with non-UTF-8 data.

> Especially if we focus on getting the static catalog to work, all file
> serving turns into a plain HTTP get, and it should skip all of the Puppet
> transfer, encoding, etc.
>

The static compiler deals with the source parameter, not the content parameter (although it could I suppose). The current implementation also has the problem that it takes over the content parameter for another meaning, which has caught out several people (try to save a file that has content => "{md5}abdefabcdef").
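
For illustration only, here is a minimal sketch of the string-or-hash idea for file content discussed above. It is written in Python rather than Puppet's Ruby, and the function names (binary, serialize_content, deserialize_content), the "content" key, and the "base64" encoding label are all assumptions made for this example, not anything that exists in puppet. The point it tries to show is that a plain string and a binary value have different JSON types, so no in-string annotation or escaping is needed to tell them apart.

```python
import base64
import json


def binary(raw: bytes) -> dict:
    """Wrap arbitrary bytes in a JSON-safe hash (hypothetical shape)."""
    return {"encoding": "base64", "data": base64.b64encode(raw).decode("ascii")}


def serialize_content(content) -> str:
    """Serialize file content that is either a str or a binary() hash."""
    if isinstance(content, bytes):
        # Raw bytes may not be valid UTF-8, so they must be wrapped explicitly.
        raise TypeError("raw bytes must be wrapped with binary() first")
    return json.dumps({"content": content})


def deserialize_content(text: str) -> bytes:
    """Return the bytes to write to disk for either form of content."""
    content = json.loads(text)["content"]
    if isinstance(content, str):
        # A plain string is always text, even if it looks like "{md5}...".
        return content.encode("utf-8")
    if isinstance(content, dict) and content.get("encoding") == "base64":
        return base64.b64decode(content["data"])
    raise ValueError("unsupported content structure")
```

Under these assumptions, deserialize_content(serialize_content(binary(raw))) gives back the original bytes, while a string such as "{md5}abdefabcdef" round-trips as ordinary text because it never leaves the string type.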
> --
> http://puppetlabs.com/ | http://about.me/lak | @puppetmasterd
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/puppet-dev/FC42257B-5129-4E2D-9DED-4D5AE6888740%40puppetlabs.com.
>
> For more options, visit https://groups.google.com/d/optout.
>

--
Andrew Parker
[email protected]
Freenode: zaphod42
Twitter: @aparker42

Software Developer

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/CANhgQXt9LpaSZXas5q6kWTzUgmAS4Qpdv-sWbqHvDx6TvADRBA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
