Combination of ExecSource and HDFS DataStream is removing end line characters
-----------------------------------------------------------------------------
Key: FLUME-860
URL: https://issues.apache.org/jira/browse/FLUME-860
Project: Flume
Issue Type: Bug
Affects Versions: NG alpha 2
Reporter: Jarek Jarcec Cecho
I've noticed that the combination of ExecSource and an HDFS Sink configured to
use DataStream removes end-of-line characters and therefore produces an output
file consisting of a single line.
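The symptom is consistent with a line-oriented reader that drops each line
terminator, paired with a writer that emits event bodies verbatim. That cause
is an assumption and is not confirmed in this report. The sketch below only
reproduces the effect in plain shell, with no Flume involved; the function
names are hypothetical.

```shell
#!/bin/sh
# Illustration only: mimics the reported symptom, not Flume's actual code path.

# Read stdin line by line (`read` drops the terminator) and write each body
# back without re-adding a newline.
strip_newlines() {
    while IFS= read -r line; do
        printf '%s' "$line"
    done
}

# Same loop, but the writer appends a newline after every body.
keep_newlines() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done
}

joined=$(printf 'Yoda-01\nYoda-02\nYoda-03\n' | strip_newlines)
kept_count=$(printf 'Yoda-01\nYoda-02\nYoda-03\n' | keep_newlines | wc -l | tr -d ' ')

echo "without terminators: $joined"
echo "lines with terminators: $kept_count"
```

The first pipeline collapses all entries into one run of text, which matches
the one-line file described above; the second keeps one entry per line.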
I used two CentOS boxes. The first acted as an agent, reading a local log file
using ExecSource. The second acted as a collector, waiting for incoming events
and storing them on HDFS. The two machines were connected using an Avro
sink + source combination. The configuration files for both machines, along
with their logs, are attached to this JIRA issue.
I started the two flume-ng instances with the following commands:
./bin/flume-ng node --conf conf/ --classpath flume-ng.jar --f \
    conf/configuration.properties -n hddev01 > hddev01.log 2>&1
./bin/flume-ng node --conf conf/ --classpath flume-ng.jar --f \
    conf/configuration.properties -n hddev02 > hddev02.log 2>&1
The input file was created with the following small bash script, which was run
after flume-ng had started successfully:
for i in `seq -w 01 10`; do echo $i; echo Yoda-$i >> /var/log/jarcec; sleep 1s;
done
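Because the script appends ten lines, one quick way to confirm the problem is
to compare the number of newline characters in the input with the number in
the collected output. The following is a minimal sketch, assuming the output
has been copied out of HDFS to a local file. The helper name and the temporary
files are hypothetical stand-ins, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: counts newline characters in a file.
count_newlines() {
    wc -l < "$1" | tr -d ' '
}

tmpdir=$(mktemp -d)

# Stand-in for the input file: the same ten entries the script appends.
for i in $(seq -w 01 10); do echo "Yoda-$i"; done > "$tmpdir/input"

# Stand-in for the affected output: same bodies, terminators missing.
tr -d '\n' < "$tmpdir/input" > "$tmpdir/output"

expected=$(count_newlines "$tmpdir/input")
actual=$(count_newlines "$tmpdir/output")

if [ "$expected" -ne "$actual" ]; then
    echo "mismatch: expected $expected newlines, found $actual"
fi
rm -rf "$tmpdir"
```

For a real check, replace the two stand-in files with /var/log/jarcec and a
local copy of the file written by the HDFS sink.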
Please note that I had to apply the patch from FLUME-858 in order to get the
DataStream file type working.