Joey Echeverria created FLUME-1173:
--------------------------------------
Summary: HDFSEventSink can leave orphaned .tmp files
Key: FLUME-1173
URL: https://issues.apache.org/jira/browse/FLUME-1173
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.1.0
Reporter: Joey Echeverria
Currently HDFSEventSink only renames a .tmp file under the following conditions:
1) An event is written to the file and that write hits one of the three roll
criteria
2) Stopping the HDFSEventSink closes all writers and thus renames all currently
open .tmp files
3) If the maximum number of open files is hit, older writers are closed, and
thus their .tmp files get renamed
The problem I see arises when events are routed to paths by timestamp, say by
day or hour: once that period has passed, you should stop seeing any events
written to its path. If the last event comes at an inopportune time, say 5
minutes after the last roll when you're rolling once an hour, then you could be
left with an orphaned .tmp file that won't get rolled until (2) or (3) happens.
Unless you set the max number of open files low, that could be quite a long
time.
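
To make the scenario concrete, here is a minimal sketch of the three rename
conditions as a toy Python model. It is not Flume code: ToySink, its fields,
the /logs/day1 and /logs/day2 paths, and the single time-based roll criterion
are all hypothetical names chosen for illustration, and the real sink may
order the roll check and the write differently.

```python
from dataclasses import dataclass, field


@dataclass
class ToySink:
    """Toy model of the rename rules described in this issue (not Flume code)."""
    roll_interval_s: int   # hypothetical stand-in for one roll criterion
    max_open_files: int
    open_files: dict = field(default_factory=dict)  # path -> time the .tmp was opened
    renamed: list = field(default_factory=list)     # paths whose .tmp was renamed

    def _close(self, path):
        # Closing a writer is the only thing that renames its .tmp file.
        del self.open_files[path]
        self.renamed.append(path)

    def write(self, path, now_s):
        # Condition 1: a roll criterion is only evaluated when an event arrives.
        opened = self.open_files.get(path)
        if opened is not None and now_s - opened >= self.roll_interval_s:
            self._close(path)
        if path not in self.open_files:
            # Condition 3: at the open-file limit, close the oldest writer.
            if len(self.open_files) >= self.max_open_files:
                self._close(next(iter(self.open_files)))
            self.open_files[path] = now_s

    def stop(self):
        # Condition 2: stopping the sink closes every open writer.
        for path in list(self.open_files):
            self._close(path)


def orphan_scenario():
    """Daily paths, hourly roll, last event 5 minutes after the last roll."""
    sink = ToySink(roll_interval_s=3600, max_open_files=5000)
    sink.write("/logs/day1", 0)       # opens the first .tmp for day1
    sink.write("/logs/day1", 82800)   # 23:00, rolls and opens a new .tmp
    sink.write("/logs/day1", 83100)   # 23:05, last event ever for day1
    sink.write("/logs/day2", 86400)   # traffic moves on to the next day
    sink.write("/logs/day2", 90000)   # day2 rolls normally
    return sink
```

In this model, /logs/day1 is still in open_files after orphan_scenario()
returns, even though /logs/day2 has already rolled. Nothing closes it until
stop() is called or 5000 other paths have been opened.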