[ 
https://issues.apache.org/jira/browse/FLUME-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488532#comment-13488532
 ] 

Mike Percy commented on FLUME-1665:
-----------------------------------

Yes, Flume may create duplicates, but the goal is to create none under 
normal conditions. Fewer duplicates are definitely better, but correctness and 
reliability are more important.

Examples of slowness: you may have a 50-megabyte data transfer transaction 
over a slow network link, or be operating a file channel on an overwhelmed 
disk with a large batch of large events, or hit a Hadoop GC pause when writing 
to HDFS. In such cases, a multi-second delay is not difficult to achieve.
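
To make the trade-off concrete, here is a minimal sketch of how a slow write can turn into a duplicate. It is a toy in-memory model written in Python, not Flume's API: `ToyChannel`, `drain`, and the `stop_before_commit` flag are all hypothetical names, and the assumption is simply that a batch already written to the destination is rolled back if shutdown arrives before the commit.

```python
from collections import deque


class ToyChannel:
    """In-memory stand-in for a transactional channel (hypothetical, not Flume's API)."""

    def __init__(self, events):
        self.queue = deque(events)
        self.inflight = []  # events taken but not yet committed

    def take_batch(self, size):
        # Move up to `size` events from the queue into the open transaction.
        while self.queue and len(self.inflight) < size:
            self.inflight.append(self.queue.popleft())
        return list(self.inflight)

    def commit(self):
        # The taken events are gone for good.
        self.inflight.clear()

    def rollback(self):
        # Return uncommitted events to the head of the queue, in order.
        self.queue.extendleft(reversed(self.inflight))
        self.inflight.clear()


def drain(channel, destination, batch_size, stop_before_commit=False):
    """Take one batch, write it, then commit (or roll back if interrupted)."""
    batch = channel.take_batch(batch_size)
    destination.extend(batch)  # the slow write has already happened here
    if stop_before_commit:
        channel.rollback()  # assumed: shutdown arrived before the commit
        return False
    channel.commit()
    return True


channel = ToyChannel(["e1", "e2", "e3"])
destination = []

# First attempt is interrupted after the write but before the commit.
drain(channel, destination, batch_size=2, stop_before_commit=True)

# After the "restart", the same events are taken and written again.
while channel.queue:
    drain(channel, destination, batch_size=2)

print(destination)  # ['e1', 'e2', 'e1', 'e2', 'e3']
```

Nothing is lost in this model, which is the correctness property; the cost is that `e1` and `e2` reach the destination twice.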

                
> Data from FileChannel will be duplicated when restarting configuration
> ----------------------------------------------------------------------
>
>                 Key: FLUME-1665
>                 URL: https://issues.apache.org/jira/browse/FLUME-1665
>             Project: Flume
>          Issue Type: Bug
>          Components: Channel
>    Affects Versions: v1.2.0, v1.3.0
>            Reporter: Denny Ye
>              Labels: FileChannel
>
> While the Flume process was running, I changed a configuration property and 
> Flume reloaded the configuration without restarting the process. Events that 
> had already been consumed before all components stopped were delivered again 
> in the next run. 
> I found the root cause: when FileChannel stops, it should save 
> 'inflightPuts' and 'inflightTakes' to disk so they are available in the next run.
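
The proposal in the description can be sketched roughly as follows. This is a toy Python model, not FileChannel's real code or on-disk format: `ToyFileChannel`, the JSON state file, and the recovery policy (uncommitted takes go back to the queue, uncommitted puts are dropped) are all assumptions made for illustration.

```python
import json
import os
import tempfile


class ToyFileChannel:
    """Hypothetical stand-in; not FileChannel's real on-disk format."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.queue = []           # committed events waiting to be taken
        self.inflight_puts = []   # put but not yet committed
        self.inflight_takes = []  # taken but not yet committed

    def stop(self):
        # Persist the in-flight sets alongside the queue before shutting down.
        state = {
            "queue": self.queue,
            "inflightPuts": self.inflight_puts,
            "inflightTakes": self.inflight_takes,
        }
        with open(self.state_path, "w") as f:
            json.dump(state, f)

    def start(self):
        with open(self.state_path) as f:
            state = json.load(f)
        # Assumed recovery policy: uncommitted takes return to the head of
        # the queue; uncommitted puts are discarded.
        self.queue = state["inflightTakes"] + state["queue"]
        self.inflight_puts = []
        self.inflight_takes = []


path = os.path.join(tempfile.mkdtemp(), "inflight.json")

before = ToyFileChannel(path)
before.queue = ["e3"]
before.inflight_takes = ["e1", "e2"]
before.inflight_puts = ["e4"]
before.stop()

after = ToyFileChannel(path)
after.start()
print(after.queue)  # ['e1', 'e2', 'e3']
```

With the in-flight sets saved, the restarted channel can tell which events were mid-transaction instead of guessing from the queue alone.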

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
