Hi,

I have rewritten the HDFS event sink's HDFSFormatterFactory so that I
can use my own FlumeFormatter implementation to write events to
SequenceFiles. I did this because the standard HDFSWritableFormatter
discards all headers apart from the timestamp, whereas I want to write
all headers to HDFS.

The code is available for review here:
https://github.com/cb372/flume/tree/custom-hdfs-formatter

To use it, you just pass in the FlumeFormatter implementation's
classname in the config, similar to the way you specify a custom
EventSerializer.

e.g.
agent_foo.sinks.hdfs-sink.writeFormat=com.mycompany.flume.MyCustomFormatter

The class must have a public zero-argument constructor.
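
The zero-argument constructor requirement follows from how a class named in
the config is typically instantiated: by reflection. The sketch below is only
an illustration of that pattern, not the code in the branch. SimpleFormatter,
AllHeadersFormatter and FormatterLoader are hypothetical names, and
SimpleFormatter is a simplified stand-in, not Flume's actual FlumeFormatter
interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a formatter interface.
// This is NOT Flume's FlumeFormatter; it only illustrates the idea.
interface SimpleFormatter {
    String format(Map<String, String> headers, byte[] body);
}

// Example formatter that keeps every header instead of only the timestamp.
class AllHeadersFormatter implements SimpleFormatter {
    // Public zero-argument constructor, needed for reflective instantiation.
    public AllHeadersFormatter() {
    }

    @Override
    public String format(Map<String, String> headers, byte[] body) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        sb.append(new String(body, StandardCharsets.UTF_8));
        return sb.toString();
    }
}

// Hypothetical loader: turns a class name from the config into an instance.
class FormatterLoader {
    static SimpleFormatter load(String className) {
        try {
            Class<? extends SimpleFormatter> clazz =
                    Class.forName(className).asSubclass(SimpleFormatter.class);
            // Fails with NoSuchMethodException if there is no zero-argument constructor.
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                    "Cannot instantiate formatter: " + className, e);
        }
    }
}
```

A class without a zero-argument constructor, or one that does not implement
the expected interface, is rejected with an IllegalArgumentException in this
sketch.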

Please let me know what you think,

Chris.

PS: I would have filed a Jira (maybe there already is one?), but the
Jira server is down at the moment.
