Hi all,

As I alluded to in a different thread earlier today, after over 2 years of development, Heka is finally closing in on what we're going to call a 1.0 release. I thought now would be a good time to explain what that means, exactly, and to point out a few important items that we have on the roadmap between now and then. Apologies in advance for the wall of text; I'm erring on the side of completeness here.

First, what does 1.0 mean? Things won't radically change. Heka will still see updates, bug fixes, and improvements, although those of us on the core Heka team will probably start to spend more of our time *using* Heka, and a bit less of our time developing it. The biggest change has to do with our backwards compatibility guarantees.

So far, we've been using a modified semantic versioning scheme. For patch versions (e.g. from 0.7 to 0.7.1 to 0.7.2, etc) we only do bug fixes. We don't introduce new features, much less breaking changes. But for minor versions (0.7 to 0.8, for instance), we've reserved the right to introduce backwards incompatibilities. These could be small things like changing the name of certain config settings, or bigger issues like changing APIs such that plugin code needs to be updated to continue working.

Once we hit 1.0, we're going to put the brakes on our backwards incompatible changes. Patch versions will still only contain bug fixes. Minor versions will contain new features, but will not break any existing features. Breakage will only happen when we bump major versions (e.g. 1.x.x to 2.0), and we will make a point of deprecating settings and/or features, so there's at least one release of overlap between an older and newer way of doing things, to give users time to adjust to any changes that are introduced.

Now that the preliminaries are out of the way, we can get to the *real* reason I'm bringing all of this up now. Because we want to slow down the rate-of-breaking-changes once we hit 1.0, that means we want to get all of the breaking changes that we already have on our radar out of the way *before* that happens. I want to let you all know what we have in mind, so you can know what to expect, and also to provide feedback. Currently there are 4 significant changes we want to make, each with an open issue, conveniently tagged as "breaking change":

https://github.com/mozilla-services/heka/labels/breaking%20change

Here's an overview of what each one means, and what impact it will have:

#424, Abstract out parser registry (aka Introduce "Splitter" plugins)

When we first released Heka, there were 4 plugin types: inputs, decoders, filters, and outputs. After a while we decided we needed an inverse to decoders, and encoders became the 5th. For quite some time now, we've known we want to introduce a 6th plugin type, called "splitters". Splitters, like decoders, will be tightly coupled with specific input plugins, and they will be responsible for looking at the raw data in an input stream, finding the record boundaries of that stream, and extracting a single record's bytes to be passed on to the decoder for more thorough parsing.

Splitters actually sort of already exist... many of our input plugins support config options called `parser_type`, `delimiter`, and `delimiter_location`, which perform this function. But currently each input has to implement this separately, there's a lot of code duplication, and introducing new ways to find record boundaries is a lot harder than it should be. By abstracting them out as their own plugin type, it will be much easier to make them automatically available to every new input. It will also be possible to implement new message framing schemes and make them immediately available to everybody.

The first splitters we introduce will exactly match the current `parser_type` options. For most of you, this will just mean updating your config to use splitters instead of parser types, but for anyone who may have written their own input plugins there may be some small changes you need to make to play well with the new behaviour.
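
To make the idea a bit more concrete, here's a rough Go sketch of what a splitter boils down to. The `Splitter` interface, the `TokenSplitter` type, and the `SplitAll` helper are all names made up for illustration, not the API that #424 will end up with, and the sketch assumes the simplest case of a single delimiter byte at the end of each record.

```go
package main

import (
	"bytes"
	"fmt"
)

// Splitter is a hypothetical interface, not the final Heka API. Given a
// buffer of raw input, it reports how many bytes were consumed and returns
// the bytes of the first complete record, or nil if none is available yet.
type Splitter interface {
	FindRecord(buf []byte) (consumed int, record []byte)
}

// TokenSplitter splits on a single delimiter byte that ends each record.
type TokenSplitter struct {
	Delimiter byte
}

func (s TokenSplitter) FindRecord(buf []byte) (int, []byte) {
	i := bytes.IndexByte(buf, s.Delimiter)
	if i < 0 {
		return 0, nil // no complete record yet; wait for more data
	}
	return i + 1, buf[:i+1] // the record keeps its trailing delimiter
}

// SplitAll drains every complete record from buf and returns the records
// plus whatever partial data is left over.
func SplitAll(s Splitter, buf []byte) (records [][]byte, rest []byte) {
	for {
		n, rec := s.FindRecord(buf)
		if rec == nil {
			return records, buf
		}
		records = append(records, rec)
		buf = buf[n:]
	}
}

func main() {
	recs, rest := SplitAll(TokenSplitter{Delimiter: '\n'}, []byte("a\nbb\nccc"))
	fmt.Printf("%d records, %q left over\n", len(recs), rest)
}
```

The point of the sketch is that the input only ever calls `FindRecord`, so any input could use any splitter without knowing how record boundaries are found.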

#918, Reimplement reporting infrastructure

Currently Heka provides some system-wide operational metrics, and it provides a way for each individual plugin to provide a custom set of operational metrics. All of this generated data is made visible to the user in the DashboardOutput's HTML UI. One problem, though, is that there are certain data points we want *every* plugin to provide, such as number of messages processed, number of processing failures, sampled average message processing time, etc. Right now every plugin has to implement this by hand and explicitly include the data in its custom report output. Some plugins do this, but many others don't, which is why the "messages processed" value in the dashboard is empty for many plugins, even when messages are flowing. Clearly this isn't ideal; Heka should handle as much of this as possible automatically. Getting to this point will require changing some of how the reporting works, so less of it is handled by the plugins themselves and more of it is handled by the plugin runners that Heka provides. This won't change the config format at all, and most plugins will continue to work unmodified. Any plugins that you have that are currently providing their own custom report output will need to be changed to adjust to the new reporting APIs we write. Also, while counting the messages processed will come for free, counting processing errors and sampling average processing time may still require a small amount of cooperation from the plugins themselves, so there may be slight changes required to get the most out of the new reporting structure.
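
As a rough illustration of "the runner does the counting", here's a minimal Go sketch. The `Plugin` and `Runner` types and the metric names are hypothetical, invented for this example and not the reporting API we'll actually write. Note that the plugin's only "cooperation" here is returning an error when processing fails.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Plugin is a hypothetical stand-in for any message-processing plugin.
type Plugin interface {
	Process(msg string) error
}

// Runner wraps a plugin and keeps the counters every plugin should report,
// so the plugin itself does not have to.
type Runner struct {
	plugin    Plugin
	processed int64
	failures  int64
}

func NewRunner(p Plugin) *Runner { return &Runner{plugin: p} }

// Deliver hands a message to the plugin and updates the counters.
func (r *Runner) Deliver(msg string) error {
	atomic.AddInt64(&r.processed, 1)
	err := r.plugin.Process(msg)
	if err != nil {
		atomic.AddInt64(&r.failures, 1)
	}
	return err
}

// Report returns the standard metrics without any help from the plugin.
// The key names are illustrative only.
func (r *Runner) Report() map[string]int64 {
	return map[string]int64{
		"processed": atomic.LoadInt64(&r.processed),
		"failures":  atomic.LoadInt64(&r.failures),
	}
}

// rejectEmpty is a toy plugin that fails on empty messages.
type rejectEmpty struct{}

func (rejectEmpty) Process(msg string) error {
	if msg == "" {
		return errors.New("empty message")
	}
	return nil
}

func main() {
	r := NewRunner(rejectEmpty{})
	for _, m := range []string{"a", "", "b"} {
		r.Deliver(m) // errors are already counted by the runner
	}
	fmt.Println(r.Report())
}
```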

#930, Simplify Output plugins to only deal w/ output transport

This is the biggest of the changes. Originally, outputs were responsible for serializing their data themselves. Then we introduced encoders to handle that. *Then* we realized that, even though encoders serialize a single message, the output should be the one to specify whether or not framing happens, so we now recommend that outputs call `OutputRunner.Encode()`, which first delegates to the encoder and then applies any desired framing. Now we've realized that, since the OutputRunner is doing the encoding work anyway, it might make sense for this to happen automatically, before the output plugin is even invoked. What if output plugins didn't receive message objects, but instead received bytes data that had already been serialized and (if necessary) framed? This would reduce the burden of responsibility for each output plugin, b/c it would no longer have to concern itself w/ the details of encoding; Heka would take care of that automatically based on the config.

This also provides some additional benefits. Currently, the TcpOutput uses a disk queue to make sure it doesn't lose data if the connection drops. But ideally *any* output plugin would be able to use a disk queue, and the cursor wouldn't advance in the queue until the data was confirmed as delivered. With this change in the design, implementing this would be much easier; any output could automatically support a `use_buffering` option. If true, data would be routed through the disk queue before it even got to the output, and the output would just have to report back re: whether delivery succeeded (so we can advance in the queue) or not (so we can retry the last one again). Clearly this change significantly impacts all output plugins, and there are still a few rough edges to work out, but ultimately we think the wins are worth it. I'm curious what others think.
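
Here's a small Go sketch of the buffering idea, just to show the shape of it. Everything in it is an assumption for illustration: the `Output` interface, the `Buffer` type (an in-memory slice standing in for the real disk queue), and the `flakyOutput` test double are hypothetical and not what #930 will necessarily look like.

```go
package main

import (
	"errors"
	"fmt"
)

// Output is a hypothetical simplified output plugin: it receives bytes that
// are already encoded and framed, and only reports whether delivery worked.
type Output interface {
	Send(data []byte) error
}

// Buffer is an in-memory stand-in for the disk queue. The cursor only
// advances once the output confirms delivery.
type Buffer struct {
	records [][]byte
	cursor  int
}

func (b *Buffer) Append(data []byte) { b.records = append(b.records, data) }

// Drain sends queued records in order and stops at the first failure, leaving
// the cursor on the failed record so the next Drain retries it.
func (b *Buffer) Drain(out Output) error {
	for b.cursor < len(b.records) {
		if err := out.Send(b.records[b.cursor]); err != nil {
			return err
		}
		b.cursor++
	}
	return nil
}

// flakyOutput fails a fixed number of sends, then succeeds.
type flakyOutput struct {
	failures int
	sent     []string
}

func (o *flakyOutput) Send(data []byte) error {
	if o.failures > 0 {
		o.failures--
		return errors.New("connection dropped")
	}
	o.sent = append(o.sent, string(data))
	return nil
}

func main() {
	buf := &Buffer{}
	buf.Append([]byte("one"))
	buf.Append([]byte("two"))
	out := &flakyOutput{failures: 1}

	err := buf.Drain(out)
	fmt.Println("first drain:", err)
	err = buf.Drain(out)
	fmt.Println("second drain:", err, out.sent)
}
```

Notice that the output never touches a message object or an encoder; all it does is say "delivered" or "not delivered", and the retry logic lives entirely outside the plugin.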

#1116, Improve Decoder config API

This is the last one, and it's much smaller in scope than the output changes. Right now, when an output specifies an encoder, Heka automatically notices this, creates the encoder plugin, and makes it available to the output plugin. All the output has to do is call `OutputRunner.Encode()`, and if we implement #930, then soon it won't have to do even that. For inputs, the story isn't as good. Inputs have to explicitly include `decoder` as a config option, which they have to parse and validate, and then they have to bootstrap the decoder by hand. This is stuff that Heka should be doing for you. The impact here is that any input that uses decoders (which is most of them) would need to change slightly, to remove boilerplate code.
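
A minimal Go sketch of the direction, under stated assumptions: the `Decoder` and `Input` interfaces, the `decoders` registry, and the `StartInput` function are hypothetical names for illustration, not Heka's real API, and for simplicity the sketch treats the `decoder` setting as required.

```go
package main

import (
	"fmt"
	"strings"
)

// Decoder and Input are hypothetical, simplified interfaces.
type Decoder interface {
	Decode(raw []byte) (string, error)
}

type Input interface {
	// Run receives a ready-to-use decoder; it never sees the config key.
	Run(dec Decoder) (string, error)
}

// upperDecoder is a toy decoder that upper-cases its input.
type upperDecoder struct{}

func (upperDecoder) Decode(raw []byte) (string, error) {
	return strings.ToUpper(string(raw)), nil
}

// decoders maps configured decoder names to factory functions.
var decoders = map[string]func() Decoder{
	"UpperDecoder": func() Decoder { return upperDecoder{} },
}

// StartInput does the boilerplate on the input's behalf: read the `decoder`
// setting, validate it, build the decoder, and hand it over.
func StartInput(in Input, config map[string]string) (string, error) {
	name, ok := config["decoder"]
	if !ok {
		return "", fmt.Errorf("no decoder specified")
	}
	factory, ok := decoders[name]
	if !ok {
		return "", fmt.Errorf("unknown decoder %q", name)
	}
	return in.Run(factory())
}

// staticInput is a toy input that emits one fixed record.
type staticInput struct{}

func (staticInput) Run(dec Decoder) (string, error) {
	return dec.Decode([]byte("hello"))
}

func main() {
	out, err := StartInput(staticInput{}, map[string]string{"decoder": "UpperDecoder"})
	fmt.Println(out, err)
}
```

The input in this sketch contains no config parsing or decoder lookup at all, which is the boilerplate we'd like to remove from real inputs.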

And there you have it. If you made it this far, I salute you. Hopefully you found it useful. If you have questions or comments on any of these ideas, please respond on the list and we'll be happy to discuss.

Thanks!

-r
_______________________________________________
Heka mailing list
Heka@mozilla.org
https://mail.mozilla.org/listinfo/heka
