Hi all,
As I alluded to in a different thread earlier today, after over 2 years
of development, Heka is finally closing in on what we're going to call a
1.0 release. I thought now would be a good time to explain what that
means, exactly, and to point out a few important items that we have on
the roadmap between now and then. Apologies in advance for the wall of
text; I'm erring on the side of completeness here.
First, what does 1.0 mean? Things won't radically change. Heka will
still see updates, bug fixes, and improvements, although those of us on
the core Heka team will probably start to spend more of our time *using*
Heka, and a bit less of our time developing it. The biggest change has
to do with our backwards compatibility guarantees.
So far, we've been using a modified semantic versioning scheme. For
patch versions (e.g. from 0.7 to 0.7.1 to 0.7.2, etc.) we only do bug
fixes. We don't introduce new features, much less breaking changes. But
for minor versions (0.7 to 0.8, for instance), we've reserved the right
to introduce backwards incompatibilities. These could be small things
like changing the name of certain config settings, or bigger issues like
changing APIs such that plugin code needs to be updated to continue working.
Once we hit 1.0, we're going to put the brakes on our backwards
incompatible changes. Patch versions will still only contain bug fixes.
Minor versions will contain new features, but will not break any
existing features. Breakage will only happen when we bump major versions
(e.g. 1.x.x to 2.0), and we will make a point of deprecating settings
and/or features, so there's at least one release of overlap between an
older and newer way of doing things, to give users time to adjust to any
changes that are introduced.
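
To make the before/after policy concrete, here is a minimal sketch in Go. The `Version` type and `MayBreak` function are made up for illustration and are not part of Heka; they only encode the rules described above.

```go
package main

import "fmt"

// Version is a simplified major.minor.patch triple (illustrative only).
type Version struct{ Major, Minor, Patch int }

// MayBreak reports whether an upgrade between two versions is allowed to
// contain backwards incompatible changes under the policy described above.
func MayBreak(from, to Version) bool {
	if to.Major != from.Major {
		return true // major bumps may always break
	}
	if from.Major == 0 && to.Minor != from.Minor {
		return true // pre-1.0: minor bumps may break too
	}
	return false // patch releases, and post-1.0 minor releases, never break
}

func main() {
	fmt.Println(MayBreak(Version{0, 7, 0}, Version{0, 7, 1})) // false
	fmt.Println(MayBreak(Version{0, 7, 2}, Version{0, 8, 0})) // true
	fmt.Println(MayBreak(Version{1, 0, 0}, Version{1, 1, 0})) // false
	fmt.Println(MayBreak(Version{1, 4, 2}, Version{2, 0, 0})) // true
}
```
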
Now that the preliminaries are out of the way, we can get to the *real*
reason I'm bringing all of this up now. Because we want to slow down the
rate-of-breaking-changes once we hit 1.0, that means we want to get all
of the breaking changes that we already have on our radar out of the way
*before* that happens. I want to let you all know what we have in mind,
so you can know what to expect, and also to provide feedback. Currently
there are 4 significant changes we want to make, each with an open
issue, conveniently tagged as "breaking change":
https://github.com/mozilla-services/heka/labels/breaking%20change
Here's an overview of what each one means, and what impact it will have:
#424, Abstract out parser registry (aka Introduce "Splitter" plugins)
When we first released Heka, there were 4 plugin types: inputs,
decoders, filters, and outputs. After a while we decided we needed an
inverse to decoders, and encoders became the 5th. For quite some time
now, we've known we want to introduce a 6th plugin type, called
"splitters". Splitters, like decoders, will be tightly coupled with
specific input plugins, and they will be responsible for looking at the
raw data in an input stream, finding the record boundaries of that
stream, and extracting a single record's bytes to be passed on to the
decoder for more thorough parsing. Splitters actually sort of already
exist... many of our input plugins support config options called
`parser_type`, `delimiter`, and `delimiter_location`, which perform this
function. But currently each input has to implement this separately,
there's a lot of code duplication, and introducing new ways to find
record boundaries is a lot harder than it should be. By abstracting them
out as their own plugin type, it will be much easier to make them
automatically available to every new input. It will also be possible to
implement new message framing schemes and make them immediately
available to everybody. The first splitters we introduce will exactly
match the current `parser_type` options. For most of you, this will just
mean updating your config to use splitters instead of parser types, but
for anyone who may have written their own input plugins there may be
some small changes you need to make to play well with the new behaviour.
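
To give a feel for the idea, here is a rough sketch of what a splitter could look like. The `Splitter` interface, `TokenSplitter`, and `SplitAll` are hypothetical names I made up for this mail; the real API has not been designed yet and will likely differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// Splitter is a hypothetical interface, not Heka's actual API. FindRecord
// looks at buffered input and returns how many bytes to consume plus the
// bytes of one complete record, or (0, nil) if no full record is buffered.
type Splitter interface {
	FindRecord(buf []byte) (consumed int, record []byte)
}

// TokenSplitter ends each record at a single delimiter byte, roughly what
// a newline `delimiter` setting does today.
type TokenSplitter struct{ Delimiter byte }

func (s TokenSplitter) FindRecord(buf []byte) (int, []byte) {
	i := bytes.IndexByte(buf, s.Delimiter)
	if i < 0 {
		return 0, nil // record boundary not seen yet; wait for more data
	}
	return i + 1, buf[:i+1] // keep the delimiter at the end of the record
}

// SplitAll shows how any input could drain a buffer through any Splitter.
// It returns the complete records and the unconsumed remainder.
func SplitAll(s Splitter, buf []byte) (records [][]byte, rest []byte) {
	for {
		n, rec := s.FindRecord(buf)
		if n == 0 {
			return records, buf
		}
		records = append(records, rec)
		buf = buf[n:]
	}
}

func main() {
	recs, rest := SplitAll(TokenSplitter{'\n'}, []byte("one\ntwo\nthr"))
	fmt.Printf("records=%q rest=%q\n", recs, rest)
}
```

The point of the sketch is that the input only ever calls `SplitAll` (or its equivalent), so a new framing scheme means one new `Splitter` rather than a change to every input.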
#918, Reimplement reporting infrastructure
Currently Heka provides some system wide operational metrics, and it
provides a way for each individual plugin to provide a custom set of
operational metrics. All of this generated data is made visible to the
user in the DashboardOutput's HTML UI. One problem, though, is that
there are certain data points we want *every* plugin to provide, such as
# of messages processed, # of processing failures, sampled average
message processing time, etc. Right now every plugin has to implement
this by hand and explicitly include the data in its custom report
output. Some plugins do this, but many others don't, which is why the
"messages processed" value in the dashboard is empty for many plugins,
even when messages are flowing. Clearly this isn't ideal; Heka should
handle as much of this as possible automatically. Getting to this point
will require changing some of how the reporting works, so less of it is
handled by the plugins themselves and more of it is handled by the
plugin runners that Heka provides.
This won't change the config format at all, and most plugins will
continue to work unmodified. Any plugins that you have that are
currently providing their own custom report output will need to be
changed to adjust to the new reporting APIs we write. Also, while
counting the messages processed will come for free, counting processing
errors and sampling average processing time may still require a small
amount of cooperation from the plugins themselves, so there may be
slight changes required to get the most out of the new reporting structure.
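
As a rough illustration of moving the bookkeeping into the runner, here is a sketch under simplifying assumptions. `Plugin`, `Runner`, and `Deliver` are hypothetical names, not the reporting API we will actually ship. The plugin "cooperates" only by returning an error, and every call is timed rather than sampled.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Plugin and Runner are hypothetical stand-ins, not Heka's real types.
type Plugin interface {
	Process(msg string) error
}

// Runner wraps a plugin and gathers the common metrics itself, so the
// plugin does not have to count anything by hand.
type Runner struct {
	plugin    Plugin
	Processed int64
	Failures  int64
	Elapsed   time.Duration // total time spent inside Process
}

// Deliver hands one message to the plugin and updates the counters.
func (r *Runner) Deliver(msg string) {
	start := time.Now()
	err := r.plugin.Process(msg)
	r.Elapsed += time.Since(start)
	r.Processed++
	if err != nil {
		r.Failures++
	}
}

// rejectEmpty is a toy plugin that fails on empty messages.
type rejectEmpty struct{}

func (rejectEmpty) Process(msg string) error {
	if msg == "" {
		return errors.New("empty message")
	}
	return nil
}

func main() {
	r := &Runner{plugin: rejectEmpty{}}
	for _, m := range []string{"a", "", "b"} {
		r.Deliver(m)
	}
	fmt.Println("processed:", r.Processed, "failures:", r.Failures)
}
```
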
#930, Simplify Output plugins to only deal w/ output transport
This is the biggest of the changes. Originally, outputs were
responsible for serializing their data themselves. Then we introduced
encoders to handle that. *Then* we realized that, even though encoders
serialize a single message, the output should be the one to specify
whether or not framing happens, so we now recommend that outputs call
`OutputRunner.Encode()`, which first delegates to the encoder and then
applies any desired framing.
Now we've realized that, since the OutputRunner is doing the encoding
work anyway, it might make sense for this to happen automatically,
before the output plugin is even invoked. What if output plugins didn't
receive message objects, but instead received byte data that had
already been serialized and framed (if necessary)? This reduces the
burden of responsibility for each output plugin, b/c it no longer has to
concern itself w/ the details of encoding; Heka will take care of that
automatically based on the config.
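
Sketched in Go, the proposed flow might look something like the following. Every name here (`Message`, `Encoder`, `Output`, `Deliver`, `frame`) is hypothetical, and the length-prefix framing is only a placeholder to keep the example short; it is not Heka's actual stream framing.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// All names below are hypothetical; this is not Heka's real plugin API.
type Message struct{ Payload string }

// Encoder serializes a single message.
type Encoder func(Message) []byte

// Output only deals with transport: it receives ready-to-send bytes.
type Output interface {
	Send(data []byte) error
}

// frame prefixes data with a 4-byte big-endian length (placeholder scheme).
func frame(data []byte) []byte {
	out := make([]byte, 4, 4+len(data))
	binary.BigEndian.PutUint32(out, uint32(len(data)))
	return append(out, data...)
}

// Deliver is what the runner would do before the output is invoked:
// encode first, then frame if the config asks for it.
func Deliver(msg Message, enc Encoder, useFraming bool, out Output) error {
	data := enc(msg)
	if useFraming {
		data = frame(data)
	}
	return out.Send(data)
}

// capture is a toy output that just remembers what it was asked to send.
type capture struct{ got [][]byte }

func (c *capture) Send(data []byte) error {
	c.got = append(c.got, data)
	return nil
}

func main() {
	enc := func(m Message) []byte { return []byte(m.Payload) }
	c := &capture{}
	if err := Deliver(Message{"hi"}, enc, true, c); err != nil {
		panic(err)
	}
	fmt.Printf("%v\n", c.got[0])
}
```

Note that `capture.Send` never sees a `Message`, which is the whole point: the output's only job is moving bytes.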
This provides some additional benefits. Currently, the TcpOutput uses
a disk queue to make sure it doesn't lose data if the connection drops.
But ideally *any* output plugin would be able to use a disk queue, and
the cursor wouldn't advance in the queue until the data was confirmed as
delivered. With this change in the design, implementing this would be
much easier: any output could automatically support a `use_buffering`
option. If true, data would be routed through the disk queue before it
even got to the output, and the output would just have to report back
re: whether delivery succeeded (so we can advance in the queue) or not
(so we can retry the last record).
Clearly this change significantly impacts all output plugins, and
there are still a few rough edges to work out, but ultimately we think
the wins are worth it. I'm curious what others think.
#1116, Improve Decoder config API
This is the last one, and it's much smaller in scope than the output
changes. Right now, when an output specifies an encoder, Heka
automatically notices this, creates the encoder plugin, and makes it
available to the output plugin. All the output has to do is call
`OutputRunner.Encode()`, and if we implement #930, then soon it won't have
to do even that.
For inputs, the story isn't as good. Inputs have to explicitly
include `decoder` as a config option, which they have to parse and
validate, and then they have to bootstrap the decoder by hand. This is
stuff that Heka should be doing for you. The impact here is that any
input that uses decoders (which is most of them) would need to change
slightly, to remove boilerplate code.
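
For illustration, the kind of thing Heka could do on the input's behalf looks roughly like this. `Decoder`, `InputRunner`, `NewInputRunner`, and the `decoders` map are all hypothetical names for this sketch, and a plain map stands in for the parsed config.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical names throughout; not Heka's real API.
type Decoder func(raw []byte) (string, error)

// decoders stands in for the set of decoder plugins defined in the config.
var decoders = map[string]Decoder{
	"UpperDecoder": func(raw []byte) (string, error) {
		return strings.ToUpper(string(raw)), nil
	},
}

// InputRunner is built by the framework. It resolves and validates the
// `decoder` setting once, so inputs carry no boilerplate for it.
type InputRunner struct{ decoder Decoder }

func NewInputRunner(config map[string]string) (*InputRunner, error) {
	name := config["decoder"]
	dec, ok := decoders[name]
	if !ok {
		return nil, fmt.Errorf("unknown decoder %q", name)
	}
	return &InputRunner{decoder: dec}, nil
}

// Decode is all an input would need to call.
func (r *InputRunner) Decode(raw []byte) (string, error) {
	return r.decoder(raw)
}

func main() {
	r, err := NewInputRunner(map[string]string{"decoder": "UpperDecoder"})
	if err != nil {
		panic(err)
	}
	out, err := r.Decode([]byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```
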
And there you have it. If you made it this far, I salute you. Hopefully
you found it useful. If you have questions or comments on any of these
ideas, please respond on the list and we'll be happy to discuss.
Thanks!
-r