Eric created FLUME-3083:
---------------------------

             Summary: Taildir source can miss events if last updated time in 
same second as file mtime
                 Key: FLUME-3083
                 URL: https://issues.apache.org/jira/browse/FLUME-3083
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: 1.7.0
            Reporter: Eric


I suspect there is a scenario where the taildir source can miss reading events 
from a log file due to how the source determines whether a file has been 
updated. In ReliableTaildirEventReader:

{code}
boolean updated = tf.getLastUpdated() < f.lastModified();
...
tf.setNeedTail(updated);
{code}
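
To show how this strict comparison behaves when the reported modification time 
has only one-second granularity, here is a minimal standalone sketch. It is not 
Flume code: the class and method names are made up for illustration, the 
timestamps are arbitrary, and truncation to whole seconds is an assumption 
based on what I observed.

```java
// Hypothetical standalone model of the "updated" check; not part of Flume.
class MtimeTruncationDemo {

    // Models a platform where File.lastModified() reports whole seconds
    // converted to milliseconds (assumption).
    static long truncateToSecond(long millis) {
        return (millis / 1000L) * 1000L;
    }

    // Same strict comparison as the snippet quoted above.
    static boolean needTail(long lastUpdatedMillis, long reportedMtimeMillis) {
        return lastUpdatedMillis < reportedMtimeMillis;
    }

    public static void main(String[] args) {
        long lastUpdated = 1_490_000_000_250L;    // last updated time, .250 into a second
        long sameSecondWrite = 1_490_000_000_900L; // file written .900 into the same second
        long nextSecondWrite = 1_490_000_001_100L; // file written in the following second

        // Truncated mtime falls below the last updated time: the write is missed.
        System.out.println(needTail(lastUpdated, truncateToSecond(sameSecondWrite))); // false

        // With millisecond-precision mtime the same write would be detected.
        System.out.println(needTail(lastUpdated, sameSecondWrite)); // true

        // A write in a later second is detected even after truncation.
        System.out.println(needTail(lastUpdated, truncateToSecond(nextSecondWrite))); // true
    }
}
```

In this model a write is only missed when it lands in the same second as the 
last updated time, which is the window the sequence in this report relies on.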

Consider this sequence of events from TaildirSource.process(). Assume they all 
happen within the same second and there is just a single log file.

# Call ReliableTaildirEventReader.updateTailFiles()
#* This call will set ReliableTaildirEventReader.updateTime to current time in 
milliseconds
#* Assume the underlying file has not been updated within the last idleTimeout 
milliseconds
# Due to idleness, the tail file's inode is added to TaildirSource.idleInodes 
in idleFileCheckerRunnable
# tf.needTail is false. Skip reading file.
# Underlying file is updated with events E1
# TaildirSource.closeTailFiles()
#* Call TaildirSource.tailFileProcess() before close to read any pending events
#* Events E1 are read and processed
#* Since events were read, call ReliableTaildirEventReader.commit() which 
updates the tail file's position and sets its last updated time to 
ReliableTaildirEventReader.updateTime from 1.a
#* Close file
# Events E2 are written to underlying file. File's modification time is in the 
same second as the tail file's last updated time.
# Since the time returned by File.lastModified() is the mtime in seconds 
converted to milliseconds, the file's last modified time is not greater than 
the tail file's last updated time. The updated check is therefore false and 
taildir won't reopen the file to read E2.
#* This behavior of File.lastModified() may be platform/JVM specific. I 
confirmed the behavior using OpenJDK 8 on Ubuntu precise.
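
For anyone who wants to check their own platform, a rough sketch along these 
lines should do. The class and method names are hypothetical, and the sleep 
interval and iteration count are arbitrary. If every printed sub-second part is 
0, File.lastModified() is most likely reporting whole seconds on that 
platform/JVM.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for inspecting mtime granularity; not part of Flume.
class MtimeGranularityCheck {

    // Sub-second part (0..999) of the modification time the JVM reports.
    static long subSecondMillis(File f) {
        return f.lastModified() % 1000L;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        File f = File.createTempFile("mtime-check", ".log");
        f.deleteOnExit();
        for (int i = 0; i < 5; i++) {
            // Append a line so the file's mtime is refreshed on each iteration.
            Files.write(f.toPath(), ("line " + i + "\n").getBytes(),
                    StandardOpenOption.APPEND);
            System.out.println(f.lastModified()
                    + " sub-second part: " + subSecondMillis(f));
            // Sleep an odd number of milliseconds so writes land at varying offsets.
            Thread.sleep(137);
        }
    }
}
```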

Can someone confirm this?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
