This would work, and may be a way to get started, but it is suboptimal for a few reasons:
* PayloadRegexDecoder is a convenient way to get started for folks who are unfamiliar with using grammars, but it is generally slower, less flexible, and less composable / reusable than LPEG. I think the time spent writing regular expressions to parse your logs would be better spent learning to use grammars.
* MultiDecoder only supports running all of the registered decoders in sequence (not at all suitable for this use case) or cascading through them all such that the first successful decoder wins. The latter choice can be made to work, but clearly it's pretty inefficient when there are more than 2 or 3 decoders to choose from.
* Due to the mem copies required when transferring data between Go and C, there is a small performance cost whenever you cross a sandbox boundary. This is small enough to still allow for reasonably good throughput in most cases, but if you have a MultiDecoder chaining multiple SandboxDecoders together you'll end up crossing that boundary many times in rapid succession, which certainly will burn cycles unnecessarily, and might slow things down more than is acceptable.

We've considered adding some sort of routing to the MultiDecoder, which would allow you to look at input data and decide which decoder should receive it based on arbitrary conditions, but that's not yet in place.

The best solution for this right now would be to do all of the work in a single SandboxDecoder. If you look at the various sandbox-based decoders that Heka provides, you'll see that most of the heavy lifting isn't done in the decoder code itself, but is delegated to Lua modules that we provide. Similarly, custom grammars can be added to an existing Heka installation as Lua modules. That way the main decoder code could use `read_message` calls to examine the input data, decide what type of message has been received, and invoke the appropriate parsing grammar for each one.
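To make the "examine, decide, then invoke one parser" idea concrete, here is a rough, language-neutral sketch in Python. It is not Heka code: a real implementation would be a Lua SandboxDecoder using LPEG grammars, and every name below (`classify`, `decode`, `PARSERS`, the two regular expressions) is a hypothetical stand-in chosen for illustration. The point is only that a cheap check picks a single parser, rather than cascading through every parser until one succeeds.

```python
import re

# Hypothetical stand-ins for per-format grammars. In Heka these would be
# LPEG grammars shipped as Lua modules; regexes just keep the sketch short.
SYSLOG_RE = re.compile(r"^<(?P<pri>\d{1,3})>(?P<rest>.*)$")
APACHE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_syslog(payload):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = SYSLOG_RE.match(payload)
    return m.groupdict() if m else None


def parse_apache(payload):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = APACHE_RE.match(payload)
    return m.groupdict() if m else None


PARSERS = {"syslog": parse_syslog, "apache": parse_apache}


def classify(payload):
    """Cheap inspection of the input that decides which parser should run.

    The rules here are assumptions made up for this sketch.
    """
    if payload.startswith("<"):
        return "syslog"
    if '] "' in payload:
        return "apache"
    return None


def decode(payload):
    """Route the payload to exactly one parser instead of trying them all."""
    kind = classify(payload)
    if kind is None:
        return None
    fields = PARSERS[kind](payload)
    if fields is None:
        return None
    fields["Type"] = kind
    return fields
```

With this shape, adding a new log format means adding one classification rule and one parser, and each message is parsed at most once regardless of how many formats are registered.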
Whether it's worth it to you to set this up probably depends on the amount of data you need to process. If the MultiDecoder solution works, great, but keep in mind that if you start to need more throughput, you can evolve your system to meet the need.

Hope this helps!

-r

On 04/06/2015 09:56 AM, Ali wrote:
Ah-ha! Should I use a combination of MultiDecoder and PayloadRegexDecoder (for custom formats)? And just assign the MultiDecoder to the TcpInput?

-Ali

On Mon, Apr 6, 2015 at 11:49 AM Ali <[email protected]> wrote:

Morning, all!

I'm trying out nxlog on remote hosts and having nxlog send logs to my Heka host's TcpInput. However, I'm starting to add multiple types of log data (syslog files, Apache logs, Tomcat logs) to the nxlog forwarder and I'm wondering how best to handle this.

Should I configure Heka to use a single TcpInput for all of these different message types? Should I configure a separate TcpInput for each distinct message type? Something else?

TIA,
Ali

_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

