[ https://issues.apache.org/jira/browse/FLUME-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475391#comment-13475391 ]
Mike Percy commented on FLUME-1350:
-----------------------------------
Flume is not designed to close the file until a certain event or condition
triggers the close. Please also remember that Flume's file path timestamps are
event-oriented.
Any one of the following conditions will close the file:
1. rollInterval != 0: will close the file after N seconds have passed since the
first event was written to it
2. rollCount != 0: will close the file after N events have been written to it
3. rollSize != 0: will close the file after N body bytes have been written to it
4. The number of open files increases beyond maxOpenFiles
5. Flume shuts down
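
The three per-file roll settings in the list above can be sketched as a single check. This is only an illustration of the conditions as described here, not Flume's actual implementation: the OpenFile class and the should_roll function are hypothetical names, and conditions 4 and 5 (maxOpenFiles and shutdown) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class OpenFile:
    """Hypothetical bookkeeping for one open file (not a Flume class)."""
    first_write_time: float  # seconds, when the first event was written
    event_count: int = 0     # events written so far
    body_bytes: int = 0      # event body bytes written so far


def should_roll(f, now, roll_interval, roll_count, roll_size):
    """Return the name of the first roll condition that is met, or None.

    A setting of 0 disables the corresponding condition.
    """
    if roll_interval != 0 and now - f.first_write_time >= roll_interval:
        return "rollInterval"
    if roll_count != 0 and f.event_count >= roll_count:
        return "rollCount"
    if roll_size != 0 and f.body_bytes >= roll_size:
        return "rollSize"
    return None


# Settings from the reported configuration; the file state is made up.
f = OpenFile(first_write_time=0.0, event_count=500, body_bytes=2_000_000)
after_one_hour = should_roll(f, now=3600, roll_interval=21600,
                             roll_count=0, roll_size=10485760)
after_six_hours = should_roll(f, now=21600, roll_interval=21600,
                              roll_count=0, roll_size=10485760)
print(after_one_hour)   # None
print(after_six_hours)  # rollInterval
```

With rollCount = 0 and less than 10485760 body bytes written, nothing in this sketch closes the file before the interval expires.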
Have you met any of these conditions in the files that are remaining open?
Based on rollInterval = 21600, the file will not be closed automatically until
6 hours have passed after the first event is written to it.
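
To make the timing concrete, here is a rough sketch of how an event-oriented path and a 21600-second rollInterval interact around midnight. The event times are made up for illustration, Python's strftime stands in for Flume's own escape-sequence handling, and UTC is assumed; bucket_for and close_time are hypothetical helpers, not Flume APIs. The rollSize setting is ignored here, so a file that reaches 10485760 bytes would close sooner.

```python
from datetime import datetime, timedelta, timezone

ROLL_INTERVAL = 21600  # seconds, from the reported configuration
PATH_TEMPLATE = "hdfs://nga/nga/apache/access/%y-%m-%d/"


def bucket_for(event_time):
    """Resolve the directory from the event's own timestamp."""
    return event_time.strftime(PATH_TEMPLATE)


def close_time(first_event_time):
    """Earliest time-based close: rollInterval after the first write."""
    return first_event_time + timedelta(seconds=ROLL_INTERVAL)


# Hypothetical events shortly before and after midnight.
first_event = datetime(2012, 10, 12, 23, 50, tzinfo=timezone.utc)
next_day_event = datetime(2012, 10, 13, 0, 5, tzinfo=timezone.utc)

print(bucket_for(first_event))     # hdfs://nga/nga/apache/access/12-10-12/
print(bucket_for(next_day_event))  # hdfs://nga/nga/apache/access/12-10-13/
print(close_time(first_event))     # 2012-10-13 05:50:00+00:00
```

Under these assumptions, the file in the 12-10-12 directory stays open for almost six hours after events have started landing in the 12-10-13 directory.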
> HDFS file handle not closed properly when date bucketing
> ---------------------------------------------------------
>
> Key: FLUME-1350
> URL: https://issues.apache.org/jira/browse/FLUME-1350
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.1.0, v1.2.0
> Reporter: Robert Mroczkowski
> Attachments: HDFSEventSink.java.patch
>
>
> With configuration:
> agent.sinks.hdfs-cafe-access.type = hdfs
> agent.sinks.hdfs-cafe-access.hdfs.path = hdfs://nga/nga/apache/access/%y-%m-%d/
> agent.sinks.hdfs-cafe-access.hdfs.fileType = DataStream
> agent.sinks.hdfs-cafe-access.hdfs.filePrefix = cafe_access
> agent.sinks.hdfs-cafe-access.hdfs.rollInterval = 21600
> agent.sinks.hdfs-cafe-access.hdfs.rollSize = 10485760
> agent.sinks.hdfs-cafe-access.hdfs.rollCount = 0
> agent.sinks.hdfs-cafe-access.hdfs.txnEventMax = 1000
> agent.sinks.hdfs-cafe-access.hdfs.batchSize = 1000
> #agent.sinks.hdfs-cafe-access.hdfs.codeC = snappy
> agent.sinks.hdfs-cafe-access.hdfs.hdfs.maxOpenFiles = 5000
> agent.sinks.hdfs-cafe-access.channel = memo-1
> When a new directory is created, the previous file handle remains open.
> The rollInterval setting is applied only to files in the current date bucket.