[
https://issues.apache.org/jira/browse/FLUME-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815317#comment-13815317
]
BitsOfInfo commented on FLUME-1988:
-----------------------------------
My bad, I fixed my morphline.conf (to use readMultiLine) and it works fine with
the RegexDelimiterDeSerializer attached to this ticket. The working configuration:
agent.sources = src1
agent.channels = memoryChannel
agent.sinks = loggerSink
# For each one of the sources, the type is defined
agent.sources.src1.type = spooldir
agent.sources.src1.channels = memoryChannel
agent.sources.src1.spoolDir = /Users/me/Documents/bb/flume/mylogs
agent.sources.src1.deserializer = REGEX
agent.sources.src1.deserializer.outputCharset = UTF-8
agent.sources.src1.deserializer.eventEndRegex = --[a-fA-F0-9]{8}-Z--
agent.sources.src1.deserializer.includeEventEndRegex = true
agent.sources.src1.interceptors = morphlineinterceptor
agent.sources.src1.interceptors.morphlineinterceptor.type = org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder
agent.sources.src1.interceptors.morphlineinterceptor.morphlineFile = /Users/me/Documents/bb/flume/apache-flume-1.5.0-SNAPSHOT-bin/conf/mymorphline.conf
agent.sources.src1.interceptors.morphlineinterceptor.morphlineId = morphline1

# mymorphline.conf
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        readMultiLine {
          regex: ".*"
          charset : UTF-8
        }
      }
      # log the record at DEBUG level to SLF4J
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
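
For anyone trying to picture what the eventEndRegex / includeEventEndRegex settings above are meant to do, here is a rough Python sketch of regex-delimited event splitting. It is not the attached RegexDelimiterDeSerializer; the function name split_events, the sample text, and the exact handling of the terminator and of trailing unterminated text are my assumptions, inferred only from the property names.

```python
import re

def split_events(text, event_end_regex, include_event_end=True):
    """Split text into events, each terminated by a match of event_end_regex.

    Hypothetical helper for illustration; the real deserializer may differ.
    """
    events = []
    start = 0
    for match in re.finditer(event_end_regex, text):
        # Keep or drop the terminator depending on the flag.
        end = match.end() if include_event_end else match.start()
        events.append(text[start:end])
        start = match.end()
    # Text after the last terminator is not a complete event yet.
    return events, text[start:]

sample = (
    "java.lang.IllegalStateException: boom\n"
    "    at Foo.bar(Foo.java:42)\n"
    "--0a1B2c3D-Z--\n"
    "second event\n"
    "--deadBEEF-Z--\n"
    "incomplete tail"
)

events, remainder = split_events(sample, r"--[a-fA-F0-9]{8}-Z--")
for event in events:
    print(repr(event))
```

The multi-line stack trace stays inside one event because only the terminator pattern ends an event, not the newline.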
> Add Support for Additional Deserializers for SpoolingDirectorySource
> --------------------------------------------------------------------
>
> Key: FLUME-1988
> URL: https://issues.apache.org/jira/browse/FLUME-1988
> Project: Flume
> Issue Type: New Feature
> Components: Docs, Sinks+Sources
> Affects Versions: v1.4.0
> Reporter: Israel Ekpo
> Assignee: Israel Ekpo
> Labels: serializers
> Attachments: EventDeserializerType.java,
> RegexDelimiterDeSerializer.java, ResettableTestStringInputStream.java,
> TestRegexDelimiterDeSerializer.java
>
>
> There are certain use cases for SpoolingDirectorySource where the events in
> the log file are not delimited with newline characters.
> Log files that contain stack traces, XML documents, or pretty-printed JSON
> strings contain multiple newline characters within each event.
> We can use alternative logic such as specific characters, strings or regular
> expressions to determine when the event is complete.
> Hence I am proposing the following new deserializers based on
> org.apache.flume.serialization.LineDeserializer
> # org.apache.flume.serialization.RegexDelimiterDeSerializer
> Allows the user to specify a regular expression that is a delimiter for
> events within the log file
> # org.apache.flume.serialization.CharSequenceDelimiterDeSerializer
> Allows the user to specify a comma separated character sequence that is a
> delimiter for events within the log file
> The user will specify the integer ASCII code of each character, and we will
> use the resulting character sequence as the delimiter.
> For example, support for \r\n could be specified as 13,10
> A list of codes is available at http://www.asciitable.com/
> We will also need to update the user guide with examples on how to configure
> and specify a custom deserializer.
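
The "13,10" idea for the proposed CharSequenceDelimiterDeSerializer can be sketched roughly as follows. This is only an illustration of the proposal as described in the issue, not code from any attachment; the function names delimiter_from_codes and split_on_delimiter are hypothetical, and rejecting codes above 127 is my assumption.

```python
def delimiter_from_codes(spec):
    """Turn a comma separated list of ASCII codes into a delimiter string."""
    codes = [int(part.strip()) for part in spec.split(",")]
    for code in codes:
        # Assumption: only 7-bit ASCII codes are accepted.
        if not 0 <= code <= 127:
            raise ValueError(f"not an ASCII code: {code}")
    return "".join(chr(code) for code in codes)

def split_on_delimiter(text, spec):
    """Split text into events on the delimiter described by spec."""
    return text.split(delimiter_from_codes(spec))

# A lone "\n" stays inside the event; only "\r\n" ends it.
parts = split_on_delimiter("a\nb\r\nc", "13,10")
print(parts)
```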