On Mon, Feb 27, 2017 at 3:59 PM, Eric Sorenson <eric.soren...@puppet.com> wrote:
> Hi all - we're nearing the end of the Puppet 4.x series feature
> development. It's been almost two years since Puppet 4.0 dropped and it
> seems like an opportune time to start thinking about the next semver major.
>
> There was some discussion last year[0], but the development work is truly
> rolling forward now, so I wanted to restart the conversation about Puppet 5
> to elicit feedback and make sure to incorporate the community's needs into
> the plan.
>
> The headline here is that the core open-source "Puppet Platform"
> (puppet-agent, puppet-server, puppetdb) are moving to a more coordinated
> release model, with compatibility guarantees and consistent versioning
> among the components. The first release of this "Puppet Platform 5",
> currently targeted at May, will bring these components' major versions
> together and provide some nice features without a huge
> backwards-incompatible break.
>
> A couple of FAQs, or rather questions I imagine will be frequently asked:
>
> Q: Puppet 5, what the hell eric0?! I just spent a month updating my code
> to run under Puppet 4.
> A: No Puppet code that works under Puppet 4 needs changing[1] to work
> under 5. This is a semver major to release some backwards-incompatible
> changes that have stacked up, plus some additional feature work, but does
> not affect the language. Puppet 4 won't be EOL any time soon (and we're
> guaranteeing commercial customer support until 2018) but we've got to keep
> the platform moving forward. Plus, it seems like a good opportunity to
> eliminate the confusion caused by "Puppet 4" being delivered in packages,
> split between puppet-agent-1.x and puppet-server-2.x ....
>
> Q: So what *is* in it? Why should I upgrade?
> A: Lots of good stuff. Hiera 5 with eyaml is built-in; it's UTF-8 clean;
> network comms are pure, sweet, fast JSON.

For Puppet 5, we want to make JSON the default serialization format for communication between puppet agent <-> server and server <-> puppetdb, while providing a migration path so older agents (v3/4) can continue communicating with Puppet 5 masters using PSON. This should improve performance for compile masters, provide better internationalization support, and ensure JSON interoperability.

Some background. In Puppet 3.2.2, we switched from YAML to PSON as the default serialization format for network communication due to security issues with YAML <https://puppet.com/security/cve/cve-2013-3567>. PSON is a 7+ year old version <https://github.com/puppetlabs/puppet/commit/bca3b70> of pure_json plus puppet patches <https://github.com/puppetlabs/puppet/commits/master/lib/puppet/external/pson>. This results in a number of problems:

1. PSON is slow - The PSON parser and generator are implemented in "pure" ruby code, and pure_json benchmarks <https://github.com/flori/json#speed-comparisons> show parsing in native code is 26.9 times faster than in ruby, and generation in native code is 12.2 times faster than in ruby.

2. PSON doesn't conform to RFC7159 <https://tools.ietf.org/html/rfc7159> - Puppet added patches that diverge from the specification, e.g. see commit 3c56705a <https://github.com/puppetlabs/puppet/commit/3c56705a>. Also, the JSON specification has evolved since RFC4627 <https://tools.ietf.org/html/rfc7159#appendix-A>.

3. Incomplete Unicode support - pure_json 1.1.9 was released at a time when ruby barely supported string encodings (1.8.6 and 1.9.1 had just been released). Since then, the upstream pure_json library evolved and added Unicode support. We backported some Unicode fixes to our vendored implementation, e.g. see commit 8306c5 <https://github.com/puppetlabs/puppet/commit/8306c5>, but I don't know that it's the complete set of changes necessary for internationalization.

4. Lossy conversions - due to our non-compliant implementation, puppetdb sometimes receives invalid UTF-8 content. Puppetdb will coerce <https://github.com/puppetlabs/puppetdb/blob/90babd07e222836d2e12cecc186aebc56697f1ea/puppet/lib/puppet/util/puppetdb/char_encoding.rb#L132-L135> the data using the Unicode replacement character <http://www.fileformat.info/info/unicode/char/fffd/index.htm>, but it is a lossy conversion.

5. Duplicated code - Ruby 1.9.3 and up vendors pure_json with native libraries!

So somewhat ironically, puppet is using a slow, outdated, and non-compliant JSON implementation, when the better replacement is already in the ruby bundled in our AIO packages.

*Proposal*

1. Puppet 5 agents should accept and prefer JSON content, identified by the "application/json" content type. Agents should continue to accept PSON when talking to older masters.

2. Puppetserver 5 should accept requests with "application/json" and "pson" content types, and return responses in the appropriate format. Puppetserver needs to continue accepting PSON so that older agents (v3/4) can communicate.

3. Puppetdb terminus and puppetdb 5 should use JSON instead of PSON.

4. It should be possible to configure a Puppet 5 agent to use PSON when talking to an older puppetmaster, most likely using the existing "preferred_serialization_format" setting. This is primarily needed when sending reports, similar to what we did when switching from YAML to PSON in Redmine 21427 <https://projects.puppetlabs.com/issues/21427> and PR 1869 <https://github.com/puppetlabs/puppet/pull/1869>.

5. The agent currently PSON-encodes facts, CGI-escapes them in the body of the catalog request, and sets the content-type to application/x-www-form-urlencoded. Puppet 5 agents should inline the facts as is, set `facts_format => identity`, and generate the catalog request body as JSON with content-type application/json.

6. Modify "console" format to use JSON instead of PSON, but preserve existing pretty-print formatting behavior.

7. In a future major release (6 or later), remove PSON. Alternatively, alias PSON as JSON so that any modules relying on PSON directly don't break.

*Alternatives*

We considered making MessagePack the default. However, MessagePack is a binary protocol, which could be an issue for interoperability, e.g. curl. And if MessagePack can't be used, then we have to fall back to PSON, with all of its issues. Also, MessagePack only understands bytes, not characters, so it's easier for non-compliant clients to send invalid UTF-8 data. Finally, MessagePack is known for being "space-efficient <https://gist.github.com/frsyuki/2908191>", e.g. for storing data in memcached, which isn't a problem we're trying to optimize for. Most likely JSON combined with gzip compression on the wire will provide sufficient performance <https://news.ycombinator.com/item?id=4090831>.

*What We're Not Doing (Yet)*

Years ago, we talked about switching everything in puppet from YAML to JSON <https://tickets.puppetlabs.com/browse/PUP-3524>. While it's attractive from a simplicity/consistency perspective, we don't want to break compatibility. So for now, we're going to continue using YAML for files that the agent stores locally on disk, e.g. last_run_report.

We're not removing PSON any time soon, as we'll need to support old agents talking to newer masters for "awhile".

Supporting JSON won't solve the "binary data in the catalog <https://tickets.puppetlabs.com/browse/PUP-3600>" problem. However, there are two current options: enable MessagePack or use the Binary type <https://tickets.puppetlabs.com/browse/PUP-5831> recently added to the Puppet language. In the future, puppet's "rich data" feature will allow transferring binary data in the catalog.

> Our current Ruby versions are EOL'ed, so we're moving to MRI Ruby 2.4 on
> the agent and jruby9k on the server.
> The PE-only puppet-server metrics
> service is getting some enhancements and will be open-sourced.
>
> Q: How's it going to be delivered? Are Puppet Collections still a thing?
> A: Funny you should ask. As we kicked around a couple of months ago[3],
> it's been two years and the collections idea just hasn't worked out in
> practice, so it seems wise to iterate and keep evolving. The current plan
> is to create a new repo, parallel with the existing PC1 repos, simply named
> 'puppet'. The platform components will roll into it and future
> semver-majors will be coordinated across the components, hopefully leading
> to smaller, easily digestible chunks of change.
>
> You can see the complete list of changes (which will evolve as we gather
> feedback and adjust scope) at this JIRA query[2]. If there's anything on
> the roster that looks like it'll break your world - or, conversely, if you
> want to nominate a change that's important to you but isn't currently on
> the list - this thread is the place to do that.
>
> --eric0
>
> [0] https://groups.google.com/d/msg/puppet-dev/RHa2tMPRTx4/sA8RX_gS1ogJ
> [1] I'm reserving a tiny, tiny asterisk for some Ruby extensions that use
> internal APIs that may change, like pre-Puppet 4.9 lookup extensions.
> [2] https://tickets.puppetlabs.com/issues/?filter=12940
> [3] https://groups.google.com/d/topic/puppet-dev/3-HSUz5OnHg/discussion
>
> Eric Sorenson - eric.soren...@puppet.com
> director of product, ecosystem and platform
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to puppet-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/6FF107F6-088A-43E2-BECF-772D8C153C49%40puppet.com.
> For more options, visit https://groups.google.com/d/optout.
>

--
Josh Cooper
Developer, Puppet
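
To make proposal items 1 and 2 above a little more concrete, here is a minimal sketch of how a server might pick a response format from a request's Accept header, preferring JSON and falling back to PSON for older agents. This is only an illustration under stated assumptions: `negotiate_format` and `SUPPORTED_FORMATS` are hypothetical names, not Puppet's or Puppetserver's actual API, and the only format names used are the two mentioned in the proposal ("application/json" and "pson").

```ruby
# Hypothetical sketch, not Puppet's real implementation.
# Maps the format names from the proposal to internal symbols.
SUPPORTED_FORMATS = {
  'application/json' => :json,
  'pson'             => :pson # assumed to be what older (v3/4) agents send
}.freeze

# Returns :json, :pson, or nil when nothing the client accepts is supported.
def negotiate_format(accept_header)
  # Split "a, b;q=0.9" into bare, lowercased type names.
  requested = accept_header.to_s.split(',').map do |entry|
    entry.split(';').first.to_s.strip.downcase
  end

  # Prefer JSON whenever the client lists it at all.
  return :json if requested.include?('application/json')

  # Otherwise fall back to the first supported legacy format.
  requested.each do |type|
    format = SUPPORTED_FORMATS[type]
    return format if format
  end

  nil
end

negotiate_format('application/json, pson') # => :json
negotiate_format('pson, yaml')             # => :pson
```

The sketch deliberately ignores q-values and always prefers JSON when it is offered; a real implementation would likely need to honor the agent's "preferred_serialization_format" setting as well.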