I entered METRON-1453 <https://issues.apache.org/jira/browse/METRON-1453> a little while ago while working on PR #579 <https://github.com/apache/metron/pull/579>.
"We have several parsers now, with many more imaginable, that are based on syslog, where the format is SYSLOG HEADER MESSAGE, with MESSAGE being in a different format in each case. It would be great if we had a way to generically handle syslog headers, such that ANY parser data could come over syslog. Either you could have a custom parser, or configure CSV or JSON such that they could be the payload, such that you can handle JSON over syslog by configuration only."

The idea would be that the parser bolt would use the configuration to trigger parsing the incoming message as syslog formatted, pass the message part to the parser, and put the syslog parts in the message(s) after parsing.

As part of this I did some work on parsing syslog, using both grok and a DSL that I wrote from the spec: https://github.com/ottobackwards/grok-v-antlr

The DSL is slower, but grok cannot handle multiple structured data entries, and the DSL can. I’m not good enough at grok to fix it so that it is functionally equivalent. Another option would be to write a third parser… It is also possible that the DSL could be improved for speed, of course.

Thoughts?
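
To make the proposed flow concrete, here is a rough sketch in Python. It is not Metron code and not what the parser bolt does today. It assumes an RFC 5424 style header, and the config keys (`syslog_envelope`, `payload_parser`, `columns`), the `syslog.` field prefix, and all function names are made up for illustration. The header is parsed first, the remaining MESSAGE part goes to whichever payload parser the configuration names, and the syslog fields (including multiple structured data entries) are then merged into each resulting message.

```python
import csv
import json
import re

# Simplified RFC 5424 layout (an assumption for this sketch):
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]
SYSLOG_LINE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<proc_id>\S+) (?P<msg_id>\S+) "
    r"(?P<sd>-|(?:\[(?:[^\]\\]|\\.)*\])+)"  # "-" or one or more [...] elements
    r"(?: (?P<msg>.*))?$",
    re.DOTALL,
)
SD_ELEMENT = re.compile(r'\[(?P<id>[^ =\]"]+)(?P<params>(?:[^\]\\]|\\.)*)\]')
SD_PARAM = re.compile(r' (?P<name>[^ =\]"]+)="(?P<value>(?:[^"\\\]]|\\.)*)"')
UNESCAPE = re.compile(r'\\(["\\\]])')


def parse_syslog(line):
    """Split a syslog line into (header fields, message payload)."""
    m = SYSLOG_LINE.match(line)
    if m is None:
        raise ValueError("not a syslog line")
    pri = int(m.group("pri"))  # PRI = facility * 8 + severity
    fields = {
        "syslog.facility": pri >> 3,
        "syslog.severity": pri & 7,
        "syslog.version": int(m.group("version")),
    }
    for name in ("timestamp", "hostname", "app_name", "proc_id", "msg_id"):
        value = m.group(name)
        fields["syslog." + name] = None if value == "-" else value
    # Collect every structured data element, not just the first one.
    structured = {}
    if m.group("sd") != "-":
        for element in SD_ELEMENT.finditer(m.group("sd")):
            structured[element.group("id")] = {
                p.group("name"): UNESCAPE.sub(r"\1", p.group("value"))
                for p in SD_PARAM.finditer(element.group("params"))
            }
    fields["syslog.structured_data"] = structured
    return fields, m.group("msg") or ""


def parse_json_payload(payload, config):
    return [json.loads(payload)]


def parse_csv_payload(payload, config):
    row = next(csv.reader([payload]))
    return [dict(zip(config["columns"], row))]


# Hypothetical registry standing in for the configured payload parser.
PAYLOAD_PARSERS = {"json": parse_json_payload, "csv": parse_csv_payload}


def parse(raw, config):
    """Optionally strip the syslog envelope, parse the payload, merge fields."""
    envelope, payload = {}, raw
    if config.get("syslog_envelope", False):
        envelope, payload = parse_syslog(raw)
    messages = PAYLOAD_PARSERS[config["payload_parser"]](payload, config)
    return [{**message, **envelope} for message in messages]
```

With something like this, "JSON over syslog" would only need `{"syslog_envelope": True, "payload_parser": "json"}` in the configuration, and the payload parsers never see the header at all.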