On 11/16/2015 12:31 PM, web user wrote:
Thanks again for the quick reply:
Gob is not supported. Heka's native, most efficient serialization
mechanism is protocol buffers. The simplest way to achieve what you
want is to use TcpOutputs (with `use_framing` set to true and a
ProtobufEncoder, which are the TcpOutput default settings) on the
edge nodes, and a TcpInput (with a HekaFramingSplitter and a
ProtobufDecoder, which again are the defaults) on the aggregator. If
you'd like a more robust transport, you could consider switching to
AMQP, Kafka, or NSQ, but those of course require running an
additional service.
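A minimal sketch of that setup in Heka's TOML config, with the defaults named above written out explicitly. The section names, the hostname `aggregator.example.com`, port 5565, and the catch-all `message_matcher` are placeholders assumed for illustration, not values from this thread.

```toml
# Edge node config. Hostname and port are assumed placeholders.
[aggregator_output]
type = "TcpOutput"
address = "aggregator.example.com:5565"
message_matcher = "TRUE"         # assumption: forward every message
encoder = "ProtobufEncoder"      # the TcpOutput default, shown explicitly
use_framing = true               # the default when a ProtobufEncoder is used

# Aggregator config (a separate Heka instance). Listen address is assumed.
[edge_input]
type = "TcpInput"
address = ":5565"
splitter = "HekaFramingSplitter" # the TcpInput default, shown explicitly
decoder = "ProtobufDecoder"      # the TcpInput default, shown explicitly
```

Since these are the defaults, the `encoder`, `use_framing`, `splitter`, and `decoder` lines could be left out; they are listed only to make the pairing visible.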
Protocol buffers should be fine for now, and TcpInput would also work.
What happens if the network is down for a day and the log files have
rotated:
syslog.log.1.gz
syslog.log.2.gz
syslog.log.3.gz
syslog.log.4.gz
syslog.log
Is the Heka agent smart enough to figure out where it left off and push
the new messages out?
There are two parts to this. First there's the LogstreamerInput parsing
side. If the Heka that's loading the log files goes down, and the files
rotate underneath Heka while it's down, Heka should notice this, scan
through the files to find the actual place it left off, and pick up from
there.
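For the rotation scheme listed in the question, a LogstreamerInput section might look roughly like the sketch below. The section name, the `/var/log` directory, and the `Seq` capture group name are assumptions for illustration. As I understand the Logstreamer documentation, the `^` prefix in `priority` reverses the sort so that higher sequence numbers are treated as older files, with the unnumbered `syslog.log` as the newest.

```toml
[syslog_stream]
type = "LogstreamerInput"
log_directory = "/var/log"       # assumed location of the files
# Matches syslog.log, syslog.log.1.gz, syslog.log.2.gz, ...
file_match = 'syslog\.log\.?(?P<Seq>\d+)?(\.gz)?'
# "^" reverses the sort: a larger Seq means an older file.
priority = ["^Seq"]
```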
The second part is the TcpOutput part, which does actually run all
output messages through a disk buffer, maintaining a checkpoint into the
buffer to know which messages have been delivered. If the connection is
broken, messages will accumulate in the buffer until the connection
comes back, at which point it will pick up where it left off and start
draining the buffer.
It's important to realize that these two parts are not tightly coupled.
The LogstreamerInput keeps track of where it is in the log stream. The
TcpOutput keeps track of the messages that it has sent. The TcpOutput
knows nothing about the log stream... it neither knows nor cares much
about where the messages that it's sending came from. If the TCP
connection goes down, the LogstreamerInput will continue processing the
log files as they're written, but the TcpOutput will be buffering the
messages until the uplink comes back.
What happens if the TCP input cannot connect? Will it
time out and then keep retrying?
Yes, it will keep trying until the link comes back, or until the buffer
hits a configurable max buffer size, at which point it will either start
dropping messages, apply back pressure to the entire pipeline by
blocking the router, or cause Heka to shut down, depending on your
configuration.
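As a sketch, the three behaviors map onto the `full_action` setting in the output's buffering section. The section name, address, and size values here are assumed placeholders; pick limits that fit the disk space available on the edge node.

```toml
[aggregator_output]
type = "TcpOutput"
address = "aggregator.example.com:5565"  # assumed placeholder
message_matcher = "TRUE"                 # assumption: forward every message
use_buffering = true

[aggregator_output.buffering]
max_file_size = 134217728      # 128 MiB per buffer file (assumed value)
max_buffer_size = 1073741824   # 1 GiB total (assumed value); 0 means no limit
# What to do once max_buffer_size is reached:
#   "drop"     - start dropping messages
#   "block"    - apply back pressure to the pipeline
#   "shutdown" - cause Heka to shut down
full_action = "block"
```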
Is round robin between servers
supported? Or a backup Heka server if it cannot connect to the primary one?
No, neither of these is supported at this time. Currently you'd need to
either do this at the network level, or use an alternate transport.
Absolutely. If you set up the appropriate decoders on the edge
nodes, as a part of the LogstreamerInput config, then the Heka
messages passed from the edge nodes to the aggregator will
contain the parsed data encoded in the message fields. If you
don't do the decoding on the edge, then the messages will
contain the unparsed data in the message payload, and you'll
need to parse them on the aggregator. Note that this will
require a MultiDecoder, because you'll first need to decode from
protobuf, and *then* you'll need to parse the payload of the
decoded message.
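A rough sketch of the aggregator-side variant, assuming the payloads are rsyslog lines parsed by the `rsyslog.lua` decoder that ships with Heka. The section names, the listen address, and the `template` and `tz` values are illustrative assumptions; the template has to match whatever rsyslog format the edge nodes actually write.

```toml
[edge_input]
type = "TcpInput"
address = ":5565"                # assumed listen address
decoder = "shipped_syslog"

# Runs both sub-decoders in order: protobuf first, then the payload parser.
[shipped_syslog]
type = "MultiDecoder"
subs = ["ProtobufDecoder", "RsyslogDecoder"]
cascade_strategy = "all"
log_sub_errors = true

[RsyslogDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/rsyslog.lua"

[RsyslogDecoder.config]
# Assumed values; adjust to the real rsyslog template and time zone.
template = '%TIMESTAMP% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n'
tz = "UTC"
```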
Great. It would be great to do this at the agents.
Great, just making sure you know the overall sitch. Although I
should clarify that, while it's possible to push new
SandboxFilters to a correctly configured Heka instance without
needing a restart, deploying any other sandboxed plugin type, or
changing the code underneath a filter that came from the config
(rather than being dynamically injected) *will* require a
restart. You're correct that you won't need to redeploy Heka
itself, however.
I guess the customization I would need to add is making the agent
smarter: figuring out what each user/host should be watching and getting
the Lua scripts for those files out to the Heka agent. It's just nice
to know that this is possible. When we get that far along, I'll reach
out to this list again with more detailed questions.
Good luck,
-r