Combination of ExecSource and HDFS DataStream is removing end line characters
-----------------------------------------------------------------------------

                 Key: FLUME-860
                 URL: https://issues.apache.org/jira/browse/FLUME-860
             Project: Flume
          Issue Type: Bug
    Affects Versions: NG alpha 2
            Reporter: Jarek Jarcec Cecho


I've noticed that the combination of ExecSource and an HDFS sink configured to
use DataStream removes end-of-line characters and thus creates an output file
consisting of a single line.

I used two CentOS boxes. The first acted as an agent, reading a local log file
using ExecSource. The second acted as a collector, waiting for incoming events
and storing them on HDFS. The two machines were connected using an Avro
sink+source combination. The configuration files for both machines, along with
their logs, are attached to this JIRA issue.

I started both flume-ng instances using the following commands:
./bin/flume-ng node --conf conf/ --classpath flume-ng.jar --f conf/configuration.properties -n hddev01 > hddev01.log 2>&1
./bin/flume-ng node --conf conf/ --classpath flume-ng.jar --f conf/configuration.properties -n hddev02 > hddev02.log 2>&1

The input file was created using the following small bash script (executed
after flume-ng had started successfully):
for i in `seq -w 01 10`; do echo $i; echo Yoda-$i >> /var/log/jarcec; sleep 1s; done
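
To make the symptom concrete, here is a minimal local sketch of what the
output is expected to look like versus what was observed. It does not involve
Flume or HDFS at all; the helper names (generate_input, strip_newlines,
count_newlines) are hypothetical and exist only for this illustration, and
dropping newlines with tr is merely an assumption about how the symptom
presents, not a statement about where the bug lives.

```shell
#!/bin/sh
# Illustration only: mimics the reported symptom locally, without Flume or HDFS.

# Produce the same ten records the reproduction loop appends to the log file,
# one record per line.
generate_input() {
    for i in $(seq -w 01 10); do
        echo "Yoda-$i"
    done
}

# Simulate the observed behaviour: every end-of-line character is dropped,
# so all records end up concatenated on a single line.
strip_newlines() {
    tr -d '\n'
}

# Count the newline characters in the stream on stdin
# (the second tr removes padding that some wc implementations add).
count_newlines() {
    wc -l | tr -d ' '
}

echo "expected newlines: $(generate_input | count_newlines)"
echo "observed newlines: $(generate_input | strip_newlines | count_newlines)"
```

Running the same newline count against the file actually written to HDFS
should show whether the line terminators survived the trip through the
agent and the collector.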

Please note that I had to apply the patch from FLUME-858 in order to get the
DataStream file type working.


        
