Hello Team,

I am new to Apache NiFi and have just started working with it. We currently 
have NiFi installed as a cluster with three nodes.


I am getting duplicate data while implementing the use case below.


Use case: I need to fetch data from the US Earthquake website 
(http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson) 
and load it incrementally.


For that I am using the processors below:

*         InvokeHTTP

*         SplitJson

*         EvaluateJsonPath

*         ReplaceText

*         MergeContent

*         PutFile/PutHDFS

I have attached my template as well.

The issue I am facing is duplicate data. Every time InvokeHTTP hits the API it 
gets all the details currently available in the feed, which can include records 
that were already fetched earlier, so the same data is loaded into the target 
again.
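
To illustrate the behaviour I am after, here is a minimal Python sketch (plain Python, not NiFi configuration). It assumes each record in the feed carries a unique "id" field, as the features in the USGS GeoJSON feed do. The function name filter_new_features and the sample ids are hypothetical, made up only for this example.

```python
def filter_new_features(features, seen_ids):
    """Return only the features whose 'id' has not been seen before.

    seen_ids is a set that is updated in place, so it must be kept
    between polls (in a cluster it would have to be shared by all nodes).
    """
    fresh = []
    for feature in features:
        feature_id = feature["id"]
        if feature_id in seen_ids:
            continue  # already loaded by an earlier poll, skip it
        seen_ids.add(feature_id)
        fresh.append(feature)
    return fresh


# Two consecutive polls of the feed that overlap on one record.
seen = set()
first_poll = [{"id": "a1"}, {"id": "b2"}]
second_poll = [{"id": "b2"}, {"id": "c3"}]

print([f["id"] for f in filter_new_features(first_poll, seen)])   # ['a1', 'b2']
print([f["id"] for f in filter_new_features(second_poll, seen)])  # ['c3']
```

In other words, only records whose id has not been seen in an earlier poll should reach the target.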

I need to load only unique data into the target. I found DetectDuplicate but do 
not know how to configure it. Can you tell me how to configure its services in a 
cluster? Or, if you have any other solution, please let me know. We want to use 
NiFi in our upcoming project but are facing issues while implementing small POCs.


Thanks,

Yogesh (+91-9689942310)
