Thanks for the reply.
I am using flume-0.9.4-cdh3u3.
I am using the hbase() sink.
No node is being restarted or reconfigured,
and there is no delay (as I am using a single-node setup).

I am still getting the same problem: data is continuously transmitted from
the source to the HBase table (I can see this on the console).

On Thu, Jun 7, 2012 at 7:40 PM, JS Jang <jsj...@gruter.com> wrote:

> Hello Rahul,
>
> There are a few possible causes of data duplication:
> 1. In end-to-end mode, delayed acks from the master can cause the agent to
> re-transmit data.
> 2. The tail source in your agent node reads from the start of the file unless
> you set the startFromEnd parameter to true; whenever the logical node is
> re-configured or restarted, it tails from the start of the file again. If you
> use multi-config, it may trigger a "refreshAll" of all the nodes the master
> manages.
> 3. There was a bug in the 0.9.4 GitHub version where the same logical node was
> sometimes started twice; it is fixed in the latest CDH3 version.
>
> To test: for 1, you can try agentBESink instead of agentSink (which, as far
> as I know, is the same as agentE2ESink); for 2, you can try setting the
> startFromEnd parameter to true.
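> For example (just a sketch from memory of the 0.9.x source/sink syntax, so
> please double-check it against the docs for your cdh3u3 build), the agent node
> config would become something like:
>
> Agent Source: tail("/tmp/test03", startFromEnd=true)
> Agent Sink: agentBESink("localhost", 35853)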
>
> Hope this is helpful.
>
> JS
>
>
> On 6/7/12 10:49 PM, Rahul Patodi wrote:
>
> I have configured the Flume hbase() sink,
> and I have managed to get it working:
> data is being copied from a file on the local hard disk to HBase.
>
> but
>
> *The same data is being copied to the HBase table again and again (I saw this
> by using VERSIONS)*
> (I am not making any changes to the source file)
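> (To verify, I check the cell versions in the HBase shell with something along
> these lines; the row key below is just an illustrative value from my data:
>
> get 'ft02', '<some-row>', {COLUMN => 'cf1:col', VERSIONS => 5}
>
> and the same value shows up under multiple timestamps.)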
>
> My configuration:
> Collector Source: collectorSource(35853)
> Collector Sink: {regexAll("(\\w+)\\t+(\\w+)\\t+(\\w+)", "row", "data1", "data2")
>   => hbase("ft02", "%{row}", "cf1", "col", "%{data1}", "cf2", "coll", "%{data2}")}
>
> Agent Source: tail("/tmp/test03")
> Agent Sink: agentSink("localhost", 35853)
>
>
> Any help is appreciated!
>
> --
> *Regards*,
> Rahul Patodi
>
>
>
> --
> ----------------------------
> JS Jang (장정식) / jsj...@gruter.com
> Gruter Inc., Principal, R&D Team
> www.gruter.com
> Cloud, Search and Social
> ----------------------------
>
>
