David Burgos. Isban (Banco Santander) created FLUME-2800:
------------------------------------------------------------
Summary: Multiline log events for Taildir Source
Key: FLUME-2800
URL: https://issues.apache.org/jira/browse/FLUME-2800
Project: Flume
Issue Type: Improvement
Components: Sinks+Sources
Affects Versions: v1.6.0, v1.7.0
Reporter: David Burgos. Isban (Banco Santander)
Priority: Minor
This is a proposed implementation for handling multiline log messages in the
new tailing source (FLUME-2498).
It is based on the idea in FLUME-2779, MultiLine Deserializer for Spooling
Directory Source.
Configuration:
* multiLineRegex: Regular expression to handle multiline log messages (grok
expressions permitted)
* grokDictionaryDir: Custom Grok dictionaries directory
* maxNumberLines: Maximum number of lines per event in multiline log
messages. Default: 100. Lines beyond this limit are never transferred to the
sink.
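
As a rough illustration of how multiLineRegex and maxNumberLines could
interact, here is a minimal sketch. It is not the code from the attached
patch. It assumes the regex matches the first line of each event and that
non-matching lines are continuations of the previous event. The class name
MultilineGrouper is hypothetical, and java.util.regex stands in for joni to
keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume or of the attached patch.
final class MultilineGrouper {
    // Assumption: the regex marks the FIRST line of a new event.
    private final Pattern startOfEvent;
    private final int maxNumberLines;

    MultilineGrouper(String multiLineRegex, int maxNumberLines) {
        this.startOfEvent = Pattern.compile(multiLineRegex);
        this.maxNumberLines = maxNumberLines;
    }

    // Groups raw lines into events; each event is its lines joined by '\n'.
    List<String> group(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        int count = 0;
        for (String line : lines) {
            boolean startsEvent = startOfEvent.matcher(line).find();
            if (startsEvent || current == null) {
                // Flush the previous event (if any) and open a new one.
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
                count = 1;
            } else if (count < maxNumberLines) {
                // Continuation line that still fits within the limit.
                current.append('\n').append(line);
                count++;
            }
            // Otherwise the event is full: the line is dropped and never
            // reaches the sink.
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }
}
```

With a regex such as ^\d{4}-\d{2}-\d{2} and maxNumberLines set to 3, a log
line followed by a three-line stack trace would yield one event holding the
log line plus the first two trace lines. The third trace line would be
discarded.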
Regular expressions use the joni regex engine, which can be twice as fast as
the Java regex engine and is more efficient, producing less object churn while
scanning, because it operates natively on byte arrays.
https://github.com/jruby/joni
Includes functionality for converting grok expressions into pure named regexes
(inspired by the Logstash interceptor).
By default, the included built-in grok dictionaries with predefined patterns
are loaded.
https://github.com/aicer/grok
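
To make the grok-to-named-regex step concrete, the sketch below shows one way
such an expansion could work. It is not the implementation in the patch or in
the aicer/grok library. The class GrokExpander and its methods are
hypothetical, the dictionary entries are defined inline for illustration, and
it targets java.util.regex named groups, so field names are limited to
letters and digits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume, joni, or aicer/grok.
final class GrokExpander {
    // Matches %{NAME} or %{NAME:field}. Field names are restricted to what
    // java.util.regex accepts as a named-group identifier.
    private static final Pattern REF =
        Pattern.compile("%\\{(\\w+)(?::([A-Za-z][A-Za-z0-9]*))?\\}");
    private static final int MAX_DEPTH = 20;

    private final Map<String, String> dictionary = new HashMap<>();

    void addPattern(String name, String regex) {
        dictionary.put(name, regex);
    }

    // Repeatedly replaces grok references until none remain.
    String expand(String expression) {
        String result = expression;
        for (int depth = 0; depth < MAX_DEPTH; depth++) {
            Matcher m = REF.matcher(result);
            if (!m.find()) {
                return result; // fully expanded
            }
            StringBuffer sb = new StringBuffer();
            do {
                String definition = dictionary.get(m.group(1));
                if (definition == null) {
                    throw new IllegalArgumentException(
                        "Unknown grok pattern: " + m.group(1));
                }
                String field = m.group(2);
                // Named group when a field is given, plain group otherwise.
                String replacement = (field != null)
                    ? "(?<" + field + ">" + definition + ")"
                    : "(?:" + definition + ")";
                m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
            } while (m.find());
            m.appendTail(sb);
            result = sb.toString();
        }
        // The depth limit guards against circular dictionary definitions.
        throw new IllegalStateException("Grok expansion exceeded depth limit");
    }
}
```

For example, with YEAR defined as \d{4}, MONTHNUM as \d{2}, and DATE as
%{YEAR:year}-%{MONTHNUM:month}, expanding ^%{DATE} produces a regex whose
"year" group captures 2015 from a line beginning with 2015-09.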
The attached patch includes configuration documentation and unit tests.
Also attached is a complete port/patch for Flume 1.6 and Java 1.6.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)