On Mon, Apr 6, 2015 at 1:31 PM Rob Miller <[email protected]> wrote:
> This would work, and may be a way to get started, but it is suboptimal
> for a few reasons:
>
> * PayloadRegexDecoder is a convenient way to get started for folks who
> are unfamiliar with using grammars, but it is generally slower, less
> flexible, and less composable / reusable than LPEG. I think the time
> spent writing regular expressions to parse your logs would be better
> spent learning to use grammars.
>
> * MultiDecoder only supports running all of the registered decoders in
> sequence (not at all suitable for this use case) or cascading through
> them all such that the first successful decoder wins. The latter choice
> can be made to work, but clearly it's pretty inefficient when there are
> more than 2 or 3 decoders to choose from.
>
> * Due to the memory copies required when transferring data between Go
> and C, there is a small performance cost whenever you cross a sandbox
> boundary. This is small enough to still allow for reasonably good
> throughput in most cases, but if you have a MultiDecoder chaining
> multiple SandboxDecoders together you'll end up crossing that boundary
> many times in rapid succession, which certainly will burn cycles
> unnecessarily, and might slow things down more than is acceptable.
>
> We've considered adding some sort of routing to the MultiDecoder, which
> would allow you to look at input data and decide which decoder should
> receive it based on arbitrary conditions, but that's not yet in place.
>
> The best solution for this right now would be to do all of the work in
> a single SandboxDecoder. If you look at the various sandbox-based
> decoders that Heka provides, you'll see that most of the heavy lifting
> isn't done in the decoder code itself, but is delegated to Lua modules
> that we provide. Similarly, custom grammars can be added to an existing
> Heka installation as Lua modules.
> That way the main decoder code could use `read_message` calls to
> examine the input data, decide what type of message has been received,
> and invoke the appropriate parsing grammar for each one.
>
> Whether it's worth it to you to set this up probably depends on the
> amount of data you need to process. If the MultiDecoder solution works,
> great, but keep in mind that if you start to need more throughput you
> can evolve your system to meet the need.
>
> Hope this helps!

Yep! Helps a lot. Thanks, Rob (and Tom). I'll be working on Lua tomorrow.

Cheers,
Ali
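
[Editor's note: a rough sketch of the single-SandboxDecoder approach Rob
describes, assuming Heka's Lua sandbox API (`process_message`,
`read_message`, `inject_message`) and the `lpeg` module. The two grammars,
the input formats they match, and the message `Type` values are invented
purely for illustration:]

```lua
-- Hypothetical single SandboxDecoder: inspect the payload, then dispatch
-- to the appropriate LPEG grammar instead of cascading through a
-- MultiDecoder. Grammar names and formats here are made up for the example.
local l = require "lpeg"
l.locale(l)

-- Grammar 1: "key=value key2=value2 ..." lines, folded into a table.
local key        = l.C((l.alnum + "_")^1)
local value      = l.C((1 - l.space)^1)
local pair       = l.Cg(key * "=" * value)
local kv_grammar = l.Cf(l.Ct("") * pair * (l.space^1 * pair)^0, rawset)

-- Grammar 2: "LEVEL: free-form message" lines.
local level_grammar = l.Ct(
    l.Cg(l.upper^1, "level") * ": " * l.Cg(l.P(1)^1, "msg"))

local msg = { Type = nil, Fields = nil }

function process_message()
    local payload = read_message("Payload")
    local fields

    -- This is the "routing" step: look at the input once, pick a grammar,
    -- and stay inside a single sandbox the whole time.
    if payload:match("^%u+:") then
        fields = level_grammar:match(payload)
        msg.Type = "level.line"
    else
        fields = kv_grammar:match(payload)
        msg.Type = "kv.line"
    end

    if not fields then return -1 end  -- no grammar matched; fail the message
    msg.Fields = fields
    inject_message(msg)
    return 0
end
```

In a larger decoder the grammars themselves would live in separate Lua
modules (as Rob suggests), with only the dispatch logic in the decoder
itself.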
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka

