On 11/16/2015 12:31 PM, web user wrote:
Thanks again for the quick reply:

    Gob is not supported. Heka's native, most efficient serialization
    mechanism is protocol buffers. The simplest way to achieve what you
    want is to use TcpOutputs (with `use_framing` set to true and a
    ProtobufEncoder, which are the TcpOutput default settings) on the
    edge nodes, and a TcpInput (with a HekaFramingSplitter and a
    ProtobufDecoder, which again are the defaults) on the aggregator. If
    you'd like a more robust transport, you could consider switching to
    AMQP, Kafka, or NSQ, but those of course require running an
    additional service.


Protocol buffers should be fine for now and TcpInput would also work.
What happens if the network is down for a day and the log files have
rotated:

syslog.log.1.gz
syslog.log.2.gz
syslog.log.3.gz
syslog.log.4.gz
syslog.log

Is the heka agent smart enough to figure out where it left off and push
the new messages out?

There are two parts to this. First there's the LogstreamerInput parsing side. If the Heka that's loading the log files goes down, and the files rotate underneath Heka while it's down, Heka should notice this, scan through the files to find the actual place it left off, and pick up from there.
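
To make the "find the place it left off" idea concrete, here is a minimal Python sketch of one way a reader could relocate its position after rotation: it saves a byte offset plus a hash of the bytes just before that offset, then checks each candidate file for a match. The function names, the 64-byte window, and the checkpoint format are assumptions for illustration, not Heka's actual implementation, and the sketch ignores gzip-compressed files.

```python
import hashlib
import os
import tempfile

WINDOW = 64  # hypothetical: how many trailing bytes are hashed into the checkpoint


def make_checkpoint(path, offset):
    """Record the offset plus a hash of the bytes just before it."""
    with open(path, "rb") as f:
        start = max(0, offset - WINDOW)
        f.seek(start)
        tail = f.read(offset - start)
    return {"offset": offset, "hash": hashlib.sha1(tail).hexdigest()}


def find_resume_file(paths, checkpoint):
    """Return the first file whose bytes before the offset match the checkpoint."""
    for path in paths:
        if os.path.getsize(path) < checkpoint["offset"]:
            continue  # too short to contain the saved position
        candidate = make_checkpoint(path, checkpoint["offset"])
        if candidate["hash"] == checkpoint["hash"]:
            return path
    return None


# Usage: the reader stops after the first line, then the file rotates.
log_dir = tempfile.mkdtemp()
live = os.path.join(log_dir, "syslog.log")
with open(live, "wb") as f:
    f.write(b"line one\nline two\n")
saved = make_checkpoint(live, 9)  # 9 bytes == "line one\n"

rotated = os.path.join(log_dir, "syslog.log.1")
os.rename(live, rotated)
with open(live, "wb") as f:
    f.write(b"fresh content after rotation\n")

# The old content now lives in the rotated file, so that is where to resume.
print(find_resume_file([live, rotated], saved) == rotated)
```

The point of hashing rather than trusting the file name is that the name "syslog.log" refers to different content before and after rotation.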

The second part is the TcpOutput part, which does actually run all output messages through a disk buffer, maintaining a checkpoint into the buffer to know which messages have been delivered. If the connection is broken, messages will accumulate in the buffer until the connection comes back, at which point it will pick up where it left off and start draining the buffer.

It's important to realize that these two parts are not tightly coupled. The LogstreamerInput keeps track of where it is in the log stream. The TcpOutput keeps track of the messages that it has sent. The TcpOutput knows nothing about the log stream... it neither knows nor cares much about where the messages that it's sending came from. If the TCP connection goes down, the LogstreamerInput will continue processing the log files as they're written, but the TcpOutput will be buffering the messages until the uplink comes back.
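
The output-side behavior described above can be sketched in Python as an append-only log with a separate cursor that only advances after a confirmed send. This is a toy model under stated assumptions: the `DiskBuffer` class, its file names, and the JSON-lines format are all hypothetical, and Heka's real buffer differs in detail.

```python
import json
import os
import tempfile


class DiskBuffer:
    """Append-only message log plus a cursor file marking what was delivered."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "buffer.log")
        self.cursor_path = os.path.join(directory, "cursor")
        open(self.log_path, "ab").close()  # make sure the log exists

    def append(self, message):
        # Every outgoing message is written to disk first, one JSON document per line.
        with open(self.log_path, "ab") as f:
            f.write(json.dumps(message).encode("utf-8") + b"\n")

    def _read_cursor(self):
        try:
            with open(self.cursor_path) as f:
                return int(f.read())
        except FileNotFoundError:
            return 0  # nothing delivered yet

    def drain(self, send):
        """Send undelivered messages in order; stop at the first failure."""
        delivered = 0
        with open(self.log_path, "rb") as f:
            f.seek(self._read_cursor())
            while True:
                line = f.readline()
                if not line:
                    break
                if not send(json.loads(line)):
                    break  # link is down: keep the rest for the next attempt
                # Advance the checkpoint only after a confirmed send.
                with open(self.cursor_path, "w") as c:
                    c.write(str(f.tell()))
                delivered += 1
        return delivered


# Usage: messages accumulate while the link is down, then drain in order.
buf = DiskBuffer(tempfile.mkdtemp())
for n in range(3):
    buf.append({"n": n})
received = []
print(buf.drain(lambda m: False))                       # link down: prints 0
print(buf.drain(lambda m: received.append(m) is None))  # link up: prints 3
```

Note that nothing here knows which log file a message came from, which mirrors the decoupling between the input and the output.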

What happens if tcp input cannot connect? Will it
time out and then keep retrying?

Yes, it will keep trying until the link comes back, or until the buffer hits a configurable max buffer size, at which point it will either start dropping messages, apply back pressure to the entire pipeline by blocking the router, or cause Heka to shut down, depending on your configuration.
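
The three outcomes for a full buffer can be illustrated with a small Python sketch built on the standard library's bounded queue. The `enqueue` function and the strings "drop", "block", and "shutdown" are labels chosen here for the three behaviors just described, not necessarily the exact values Heka's configuration uses, and the limit is counted in messages rather than bytes for simplicity.

```python
import queue
import threading


def enqueue(q, message, full_action):
    """Store message in q; if q is full, apply the chosen policy.

    Returns True if the message was stored, False if it was dropped.
    """
    try:
        q.put_nowait(message)
        return True
    except queue.Full:
        if full_action == "drop":
            return False  # discard the message; upstream keeps moving
        if full_action == "block":
            q.put(message)  # wait until a consumer frees a slot (back pressure)
            return True
        if full_action == "shutdown":
            raise SystemExit("output buffer full")
        raise ValueError("unknown full_action: %r" % (full_action,))


# Usage: a buffer that holds one message.
q = queue.Queue(maxsize=1)
print(enqueue(q, "first", "drop"))   # stored: prints True
print(enqueue(q, "second", "drop"))  # buffer full, dropped: prints False

# With "block", the producer waits until something drains the queue.
threading.Timer(0.05, q.get).start()
print(enqueue(q, "third", "block"))  # prints True once "first" is consumed
```

Which policy fits depends on whether losing messages, stalling the pipeline, or stopping the process is the least bad outcome for your deployment.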

Is round robin between servers supported? Or a backup heka server
if it cannot connect to the primary one?

No, neither of these is supported at this time. Currently you'd need to either do this at the network level, or use an alternate transport.

        Absolutely. If you set up the appropriate decoders on the edge
        nodes, as a part of the LogstreamerInput config, then the Heka
        messages passed from the edge nodes to the aggregator will
        contain the parsed data encoded in the message fields. If you
        don't do the decoding on the edge, then the messages will
        contain the unparsed data in the message payload, and you'll
        need to parse them on the aggregator. Note that this will
        require a MultiDecoder, because you'll first need to decode from
        protobuf, and *then* you'll need to parse the payload of the
        decoded message.
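
The two-stage decoding described above can be sketched in Python as a chain of functions, where the first stage unpacks the wire format and the second parses the payload into fields. This is only a conceptual model: JSON stands in for protobuf, and the `chain` helper, the decoder names, and the log-line pattern are all hypothetical rather than Heka's MultiDecoder API.

```python
import json
import re

# Hypothetical line format: "<host> <program>: <text>"
LOG_LINE = re.compile(r"(?P<host>\S+) (?P<program>[^:\s]+): (?P<text>.*)")


def wire_decoder(raw):
    """Stage 1: turn bytes from the network into a message dict."""
    return json.loads(raw)  # JSON is a stand-in for protobuf here


def payload_decoder(message):
    """Stage 2: parse the still-unparsed payload into named fields."""
    match = LOG_LINE.match(message["payload"])
    if match:
        message["fields"] = match.groupdict()
    return message


def chain(*decoders):
    """Build a decoder that runs each stage in order on the previous result."""
    def run(value):
        for decode in decoders:
            value = decode(value)
        return value
    return run


# Usage: the aggregator receives a message whose payload was not parsed on the edge.
decode = chain(wire_decoder, payload_decoder)
message = decode(b'{"payload": "web01 sshd: session opened"}')
print(message["fields"]["program"])  # prints sshd
```

If the edge nodes had already run the second stage, the aggregator would only need the first.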


Great. It would be great to do this at the agents.

        Great, just making sure you know the overall sitch. Although I
        should clarify that, while it's possible to push new
        SandboxFilters to a correctly configured Heka instance without
        needing a restart, deploying any other sandboxed plugin type, or
        changing the code underneath a filter that came from the config
        (rather than being dynamically injected) *will* require a
        restart. You're correct that you won't need to redeploy Heka
        itself, however.



I guess the customization I would need to add is making the agent
smarter, so that it can figure out what the user/host should be watching
and get the lua scripts for those files out to the heka agent. It's just
nice to know that this is possible. When we get that far along, I'll
reach out to this list again with more detailed questions.


Good luck,

-r
