Joey Echeverria created FLUME-1173:
--------------------------------------

             Summary: HDFSEventSink can leave orphaned .tmp files
                 Key: FLUME-1173
                 URL: https://issues.apache.org/jira/browse/FLUME-1173
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.1.0
            Reporter: Joey Echeverria


Currently, HDFSEventSink renames a .tmp file only under the following conditions:

1) An attempt to write an event to the file coincides with hitting one of the 
three roll criteria
2) The HDFSEventSink is stopped, which closes all writers and thus renames all 
currently open .tmp files
3) The maximum number of open files is hit, so older writers are closed and 
their .tmp files are renamed

The problem I see is this: if events are routed to a path by timestamp, say by 
day or hour, events stop being written to that path once that period has 
passed. If the last event arrives at an inopportune time, say 5 minutes after 
the last roll when you're rolling once an hour, you are left with an orphaned 
.tmp file that won't be renamed until (2) or (3) occurs. Unless you set the 
maximum number of open files low, that could be quite a long time.
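
To make the scenario concrete, here is a minimal sketch of conditions (1) to 
(3) as a toy model. It is not Flume's actual code: ToySink, roll_interval, 
max_open_files, and the "day1"/"day2" paths are hypothetical names chosen for 
illustration, only time-based rolling is modeled, and the limit of 5000 open 
files is just an arbitrarily large value.

```python
class ToySink:
    """Toy model of the three rename conditions. Not Flume's actual code."""

    def __init__(self, roll_interval, max_open_files):
        self.roll_interval = roll_interval    # seconds; only time-based rolling is modeled
        self.max_open_files = max_open_files  # hypothetical limit on open .tmp files
        self.open_tmp = {}                    # path -> time its current .tmp file was opened
        self.renamed = []                     # paths whose .tmp file has been renamed

    def _rename(self, path):
        del self.open_tmp[path]
        self.renamed.append(path)

    def write(self, path, now):
        # (1) A roll is only checked when an event is written to this path.
        opened = self.open_tmp.get(path)
        if opened is not None and now - opened >= self.roll_interval:
            self._rename(path)
        if path not in self.open_tmp:
            # (3) At the open-file limit, close the oldest writer first.
            if self.open_tmp and len(self.open_tmp) >= self.max_open_files:
                self._rename(min(self.open_tmp, key=self.open_tmp.get))
            self.open_tmp[path] = now

    def stop(self):
        # (2) Stopping the sink closes every writer.
        for path in list(self.open_tmp):
            self._rename(path)


sink = ToySink(roll_interval=3600, max_open_files=5000)
sink.write("day1", 0)        # first event for day1 opens a .tmp file
sink.write("day1", 3600)     # roll: old .tmp renamed, new .tmp opened
sink.write("day1", 3900)     # last event for day1, 5 minutes after the roll
for t in range(4000, 40000, 600):
    sink.write("day2", t)    # later events all go to the next day's path

orphan_still_open = "day1" in sink.open_tmp   # nothing ever rechecks day1
```

In this model the second .tmp file for "day1" stays open no matter how many 
events arrive for "day2", because the roll check in write() only runs for the 
path being written. It is renamed only when stop() is called or when enough 
distinct paths are opened to reach max_open_files.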

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
