Re: [heka] big change landing soon: Encoder plugins

Rob Miller Tue, 13 May 2014 16:16:31 -0700

And this is now merged to dev. Please let us know if you have any problems or questions with the Encoder functionality!

-r



On 05/12/2014 02:50 PM, Rob Miller wrote:

Hi!

Just wanted to give everyone a heads up that a significant change will
be landing on the dev branch of Heka soon, to be included in the
eventual 0.6 release.

This change will introduce a new "Encoder" plugin type. Encoders are the
inverse of the already existing Decoder plugin type. Decoders are used
by Input plugins to convert arbitrary input data into a Heka message
struct, while Encoders are used by Output plugins to convert Heka message
structs into arbitrary output data.

We realized this was a necessary abstraction when we saw that various
different Output plugins were implementing their own ways to manage the
serialization. For instance, currently:

- FileOutput supports a 'format' config option which can be one of
'text', 'json', or 'protobufstream'. The 'text' option includes only the
message payload. The 'json' format contains all of the message fields,
but is an inflexible format.

- LogOutput supports a 'payload_only' boolean config option. If true,
then the message payload will be written to stdout. If false, then a
custom, inflexible text rendering of the message data will be generated.

- ElasticSearchOutput supports a 'format' config option which can be one
of 'clean', 'logstash_v0', 'payload', or 'raw'. All of them generate
JSON in a specific format, except 'payload', which presumes that the
message payload already contains the JSON that you want to send to
ElasticSearch.

- TcpOutput has no flexibility; it only generates Protocol Buffer
encoded message streams.

The introduction of Encoder plugins means all of these one-off
serialization strategies can go away. Instead, you'll add an Encoder
config section to your TOML config, and then you'll refer to configured
Encoder sections from your Output config sections. So what would have
been this:

     [LogOutput]
     payload_only = true

Instead will be this:

     [PayloadEncoder]

     [LogOutput]
     encoder = "PayloadEncoder"

The initial code will include three encoders: ProtobufEncoder (generates
Heka's native protocol buffer streams), PayloadEncoder (extracts message
payload), and SandboxEncoder (lets you use Lua code to extract data from
a message and generate whatever output you want). There may be more
coming in the future, but really our hope is that the SandboxEncoder
will meet most of your needs.
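
As a rough sketch of how a SandboxEncoder might be wired up (the section
name, the 'type' and 'filename' settings, and the script path here are
assumptions modeled on how other sandbox plugins are configured, not
something this note confirms):

     # hypothetical section name
     [my_encoder]
     type = "SandboxEncoder"
     # hypothetical path to a Lua script that builds the output
     filename = "lua_encoders/my_encoder.lua"

     [LogOutput]
     encoder = "my_encoder"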

Also, initially the TcpOutput, LogOutput, and FileOutput have been
modified to use Encoders instead of their previous mechanisms. TcpOutput
defaults to use of ProtobufEncoder, which exactly matches the previous
behavior, so no config changes should be necessary. If you're using
LogOutput or FileOutput, however, when you upgrade you'll need to modify
your config to include appropriate Encoder plugins and to make sure
they're being used by your outputs.
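
For FileOutput the change would presumably look much like the LogOutput
example above. A sketch, assuming a config that used the 'text' format
(the path value is only a placeholder, and any other settings you
already have stay as they are). What would have been this:

     [FileOutput]
     # placeholder path
     path = "/var/log/heka/output.log"
     format = "text"

Would instead become something like this:

     [PayloadEncoder]

     [FileOutput]
     # placeholder path
     path = "/var/log/heka/output.log"
     encoder = "PayloadEncoder"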

Anyone interested in digging in to the code can take a look at the open
pull request at https://github.com/mozilla-services/heka/pull/838. It's
currently awaiting code review, which might result in further small
revisions, but we definitely expect to land this on dev over the next
few days. I'll send another note out when it lands.

If you made it this far, wow, I'm impressed! Thanks for your attention,
hope the Encoder plugins work for you, and please let us know if you
have any questions or issues.

Thanks!

-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
