[
https://issues.apache.org/jira/browse/FLUME-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Confuse updated FLUME-3351:
---------------------------
Description:
If the server restarts abnormally, the taildir source may read the same data again.
This is easy to reproduce, for example by running the command: reboot.
My reproduction scenario is as follows:
Agent one is deployed on server one and is configured with a taildir source, file
channel, and avro sink. Agent two is deployed on server two and is configured
with an avro source, file channel, and hdfs sink. The two agents are
connected by avro, so agent two receives data from agent one. When I
reboot server one, the data on HDFS is duplicated after server one recovers from
the failure.
was:
If the server restarts abnormally, taildir source may read data repeatedly.
It's easy to replicate this phenomenon,such as using the command: reboot.
below is my recurrence scenario:
Agent one is deployed on server one, and it is configured taildir source, file
channel, avro sink. While agent two is deployed on server two, and agent two
is configured avro source, file channel, hdfs sink. Thistwo agents are
connected by avro. It means agent two receives data from agent one. Then i
reboot server one, data on HDFS must be repeated after server one recovery from
failure.
> Taildir source data duplication
> -------------------------------
>
> Key: FLUME-3351
> URL: https://issues.apache.org/jira/browse/FLUME-3351
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.9.0
> Reporter: Confuse
> Priority: Major
>
> If the server restarts abnormally, the taildir source may read the same data again.
> This is easy to reproduce, for example by running the command: reboot.
> My reproduction scenario is as follows:
> Agent one is deployed on server one and is configured with a taildir source,
> file channel, and avro sink. Agent two is deployed on server two and is
> configured with an avro source, file channel, and hdfs sink. The two
> agents are connected by avro, so agent two receives data from agent
> one. When I reboot server one, the data on HDFS is duplicated after server one
> recovers from the failure.
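
For anyone trying to reproduce this, the two-agent topology described in the
report might be configured roughly as follows. This is only a sketch: the
agent names (a1, a2), the host name server-two, port 4545, and every path are
placeholders chosen for illustration and are not taken from the report. The
reporter's actual settings are not given.

```properties
# ---- Agent one (server one): taildir source -> file channel -> avro sink ----
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Taildir source; file group pattern and position file path are assumed
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /var/lib/flume/a1/taildir_position.json
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /var/log/app/.*\.log
a1.sources.r1.channels = c1

# File channel; directories are assumed
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/lib/flume/a1/checkpoint
a1.channels.c1.dataDirs = /var/lib/flume/a1/data

# Avro sink pointing at agent two; host and port are assumed
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = server-two
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

# ---- Agent two (server two): avro source -> file channel -> hdfs sink ----
a2.sources = r1
a2.channels = c1
a2.sinks = k1

# Avro source listening on the port the avro sink above connects to
a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4545
a2.sources.r1.channels = c1

# File channel; directories are assumed
a2.channels.c1.type = file
a2.channels.c1.checkpointDir = /var/lib/flume/a2/checkpoint
a2.channels.c1.dataDirs = /var/lib/flume/a2/data

# HDFS sink; target path is assumed
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = /flume/events
a2.sinks.k1.channel = c1
```

With a setup along these lines, the scenario in the report corresponds to
running reboot on server one while a1 is tailing files, then checking the
HDFS output for repeated events once a1 is back up.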