Will McQueen created FLUME-1308:
-----------------------------------

             Summary: HDFS Sink throws DFSOutputStream exception when maxOpenFiles=1
                 Key: FLUME-1308
                 URL: https://issues.apache.org/jira/browse/FLUME-1308
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.2.0
         Environment: RHEL 6.2 64-bit
            Reporter: Will McQueen
             Fix For: v1.2.0


When I set the HDFS sink to have maxOpenFiles=1, two things happen:
1) Events propagate very slowly to HDFS.
2) Events are repeated (e.g., after a while the same 100 or so events appear 
repeatedly in HDFS, even though each event should have a unique payload per the 
test I'm running).
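
For reference, an agent configuration along these lines exercises the setting 
(a minimal sketch; the agent, source, channel, and sink names and the HDFS path 
are assumptions, not taken from this report):

agent1.sources = avroSrc
agent1.channels = fileCh
agent1.sinks = hdfsSnk

agent1.sources.avroSrc.type = avro
agent1.sources.avroSrc.bind = 0.0.0.0
agent1.sources.avroSrc.port = 41414
agent1.sources.avroSrc.channels = fileCh

agent1.channels.fileCh.type = file

agent1.sinks.hdfsSnk.type = hdfs
agent1.sinks.hdfsSnk.channel = fileCh
agent1.sinks.hdfsSnk.hdfs.path = hdfs://namenode/flume/events
# The setting under test: cap the sink at a single open bucket file
agent1.sinks.hdfsSnk.hdfs.maxOpenFiles = 1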

Steps:
1) Launch 2 avro clients targeting a single avro source whose associated 
channel is a file channel (also tried a memory channel; same issue). A sketch 
of such a client appears after the stack trace below.
2) View the logs, and you're likely to see:

2012-06-21 16:27:34,106 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: DFSOutputStream is closed
        at org.apache.hadoop.hdfs.DFSOutputStream.isClosed(DFSOutputStream.java:1193)
        at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1453)
        at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1437)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
        at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
        at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:276)
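
For step 1 above, each client can be a short program built on the Flume NG 
SDK. The following is a minimal sketch (host, port, and event count are 
assumptions) that sends events with unique payloads, which is what makes the 
repeats in HDFS detectable:

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class UniquePayloadClient {
    public static void main(String[] args) throws EventDeliveryException {
        // Connect to the avro source (host/port are assumptions)
        RpcClient client = RpcClientFactory.getDefaultInstance("localhost", 41414);
        try {
            for (int i = 0; i < 1000; i++) {
                // Each event carries a unique payload so repeats in HDFS stand out
                Event event = EventBuilder.withBody(("event-" + i).getBytes());
                client.append(event);
            }
        } finally {
            client.close();
        }
    }
}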
