[ https://issues.apache.org/jira/browse/FLUME-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266933#comment-13266933 ]
Joey Echeverria commented on FLUME-1173:
----------------------------------------
I'm willing to let someone close this as 'won't fix' if we update the
documentation to tell users to set the maximum number of open files to a low
value when partitioning events by time. A better solution, in my book, would
be a timer task that runs once every roll-interval seconds and checks whether
any of the roll criteria has been met. That way you should see rolls after the
last event has arrived but before the writer has to be closed because the
open-file limit was reached.
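A minimal sketch of the timer-task idea, for discussion only. Every name here (RollChecker, OpenWriter, rollExpired) is hypothetical and not taken from the Flume code base, and only the time-based roll criterion is checked. A real change would also need to consider the size and count criteria and would close the writer and rename its .tmp file where the comment indicates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: periodically rolls writers that no new event will ever trigger. */
final class RollChecker {

    /** Stand-in for an open writer that still holds a .tmp file. */
    static final class OpenWriter {
        final String path;
        final long openedAtMillis;

        OpenWriter(String path, long openedAtMillis) {
            this.path = path;
            this.openedAtMillis = openedAtMillis;
        }
    }

    private final Map<String, OpenWriter> writers = new ConcurrentHashMap<>();
    private final long rollIntervalMillis;

    RollChecker(long rollIntervalSeconds) {
        if (rollIntervalSeconds <= 0) {
            throw new IllegalArgumentException("roll interval must be positive");
        }
        this.rollIntervalMillis = TimeUnit.SECONDS.toMillis(rollIntervalSeconds);
    }

    /** Registers a writer the first time an event is routed to the given path. */
    void open(String path, long nowMillis) {
        writers.putIfAbsent(path, new OpenWriter(path, nowMillis));
    }

    int openCount() {
        return writers.size();
    }

    /** Removes every writer whose roll interval has elapsed and returns the rolled paths. */
    List<String> rollExpired(long nowMillis) {
        List<String> rolled = new ArrayList<>();
        Iterator<OpenWriter> it = writers.values().iterator();
        while (it.hasNext()) {
            OpenWriter w = it.next();
            if (nowMillis - w.openedAtMillis >= rollIntervalMillis) {
                // A real sink would close the writer and rename the .tmp file here.
                it.remove();
                rolled.add(w.path);
            }
        }
        return rolled;
    }

    /** Runs the check once every roll interval on a background thread. */
    ScheduledExecutorService start() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(
                () -> rollExpired(System.currentTimeMillis()),
                rollIntervalMillis, rollIntervalMillis, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

The check is kept separate from the scheduling so it can be exercised with explicit timestamps. Because the task only runs once per interval, a writer can stay open for up to roughly two intervals before the check catches it, which is still bounded, unlike waiting for (2) or (3).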
> HDFSEventSink can leave orphaned .tmp files
> -------------------------------------------
>
> Key: FLUME-1173
> URL: https://issues.apache.org/jira/browse/FLUME-1173
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.1.0
> Reporter: Joey Echeverria
>
> Currently HDFSEventSink only renames a .tmp file under the following
> conditions:
> 1) An attempt to write an event to the file coupled with hitting one of the
> three roll criteria
> 2) Stopping the HDFSEventSink closes all writers and thus renames all
> currently open .tmp files
> 3) If the maximum number of open files is hit, older writers are closed, and
> thus their .tmp files get renamed
> The problem that I see is that if events are being routed to a path by
> timestamp, say day or hour, you should stop seeing any events written to that
> path once that time period has passed. If the last event comes at an
> inopportune time, say 5 minutes after the last roll when you're rolling once
> an hour, then you could be left with an orphaned .tmp file that won't get
> rolled until (2) or (3) occurs. Unless you set the max number of open files
> low, that could be quite a long time.