[
https://issues.apache.org/jira/browse/FLUME-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956446#comment-14956446
]
Jun Seok Hong commented on FLUME-2777:
--------------------------------------
On Linux, the creation time of a file cannot be obtained this way.
Files.getAttribute(Paths.get("xx"), "basic:creationTime") returns the
last-modified time instead.
[https://docs.oracle.com/javase/7/docs/api/java/nio/file/attribute/BasicFileAttributes.html]
If the target file is modified after taildir starts, duplicate events will
occur.
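
A minimal sketch for checking this behaviour on a given machine. The class and
method names (CreationTimeProbe, describe) are made up for illustration and are
not part of Flume. The result depends on the JDK and the file system, so no
particular output is claimed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class CreationTimeProbe {
    /**
     * Reads the basic attributes of a path and reports whether the
     * creation time is identical to the last-modified time.
     * (Hypothetical helper, for illustration only.)
     */
    public static String describe(Path path) {
        try {
            BasicFileAttributes attrs =
                Files.readAttributes(path, BasicFileAttributes.class);
            boolean same = attrs.creationTime().equals(attrs.lastModifiedTime());
            return "creationTime=" + attrs.creationTime()
                + " lastModifiedTime=" + attrs.lastModifiedTime()
                + " same=" + same;
        } catch (IOException e) {
            // Wrap so callers do not need to handle a checked exception.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the temp directory when no path is given.
        Path path = Paths.get(args.length > 0
            ? args[0]
            : System.getProperty("java.io.tmpdir"));
        System.out.println(describe(path));
    }
}
```

If "same=true" is printed for a file that was modified after it was created,
the creation time is not usable as a stable identity for that file.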
> Tail Dir Source leads to duplicate events on rolling the tailed file
> --------------------------------------------------------------------
>
> Key: FLUME-2777
> URL: https://issues.apache.org/jira/browse/FLUME-2777
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: notrack
> Reporter: Johny Rufus
> Assignee: Johny Rufus
> Attachments: FLUME-2777-1.patch, FLUME-2777.patch
>
>
> I have a simple setup, where I write 200 events to logfile1. [TailSrc is on
> the lookout for logfile* ]
> Then I rename logfile1 to logfile2.
> I create a new logfile1 and write 100 events to it.
> Typically I should see 300 events in my channel. But I see 500 events.
> I was able to trace the duplicates to updateFiles(boolean) in
> ReliableTaildirEventReader.java, specifically to the way renamed files are
> handled: the starting position is specified as 0. [This starting position
> should be obtained from tf.getPosition()]
> I am attaching a proposed fix, would be great if one of you guys
> [~iijima_satoshi] / [~hshreedharan]/ [~roshan_naik] can take a look at the
> fix and validate the issue.
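
The rename scenario in the quoted description can be sketched as a small
simulation. This is not the Flume code or the attached patch: the class
RenameAwareTailSketch, its poll method, and the inode numbers are hypothetical,
and positions are counted in events rather than bytes. The only idea taken from
the description is that a renamed file keeps its saved position instead of
restarting at 0.

```java
import java.util.HashMap;
import java.util.Map;

public class RenameAwareTailSketch {
    /** Hypothetical stand-in for a tailed file: current path plus events already read. */
    static final class TailedFile {
        String path;
        long position;

        TailedFile(String path) {
            this.path = path;
        }
    }

    // Files are tracked by inode, so a rename is not mistaken for a new file.
    private final Map<Long, TailedFile> filesByInode = new HashMap<>();

    /** Emits whatever is new in the file and returns the number of events emitted. */
    public long poll(long inode, String path, long eventsInFile) {
        TailedFile tf = filesByInode.get(inode);
        if (tf == null) {
            // Unknown inode: a genuinely new file, so start at position 0.
            tf = new TailedFile(path);
            filesByInode.put(inode, tf);
        } else if (!tf.path.equals(path)) {
            // Known inode under a new name: update the path, keep the position.
            tf.path = path;
        }
        long emitted = Math.max(0L, eventsInFile - tf.position);
        tf.position += emitted;
        return emitted;
    }

    public static void main(String[] args) {
        RenameAwareTailSketch tailer = new RenameAwareTailSketch();
        long total = 0;
        total += tailer.poll(1L, "logfile1", 200); // 200 events written
        total += tailer.poll(1L, "logfile2", 200); // same file, renamed
        total += tailer.poll(2L, "logfile1", 100); // new file, 100 events
        System.out.println(total); // prints 300
    }
}
```

Resetting the position to 0 in the rename branch would emit the 200 events of
the renamed file a second time, which gives the 500 events reported above.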
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)