Atul,

If you're finding distcp to be a good enough tool for the job, then I would definitely use it. For the things distcp is designed for, it is very stable and to the point, and if your use case fits squarely within that, you'll be in a good position.
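On the duplicates concern specifically, distcp's -update option copies only files that are missing from the destination or that differ from the source, so re-running a job does not re-copy what is already there. Below is a minimal sketch of building such a command from Python. The helper name, the cluster URIs, and the map count are placeholders I made up for illustration, and the exact comparison and path semantics of -update should be checked against the DistCp documentation for your Hadoop version.

```python
import shlex


def build_distcp_command(source, destination, max_maps=None):
    """Return the argv list for an incremental DistCp run.

    Hypothetical helper, not part of Hadoop or NiFi. It only assembles
    the command line; it does not execute anything.
    """
    # -update skips files that already exist, unchanged, at the destination.
    command = ["hadoop", "distcp", "-update"]
    if max_maps is not None:
        # -m caps the number of simultaneous copy (map) tasks.
        command += ["-m", str(max_maps)]
    command += [source, destination]
    return command


if __name__ == "__main__":
    # Placeholder URIs; substitute your own namenodes and paths.
    argv = build_distcp_command(
        "hdfs://source-nn:8020/data/events",
        "hdfs://dest-nn:8020/data/events",
        max_maps=20,
    )
    print(shlex.join(argv))
```

On a host with the Hadoop client installed, the resulting list could be passed to subprocess.run(argv, check=True). Keeping the arguments as a list avoids shell quoting problems with unusual paths.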
If you need more than direct file replication from one environment to another, that is when it might make sense to bring a flow management solution into the picture. You'd have to share a lot more detail about your NiFi configuration for anyone to be able to comment on the duplicates. Duplicates should be extremely unlikely in a fairly straightforward configuration.

In terms of raw performance, distcp doesn't make an intermediary copy of the data like NiFi would, so it *should* be a bit faster in a pure sense for this particular use case, but the points above still apply.

Thanks

On Wed, Mar 21, 2018 at 12:14 PM, atul gupta <guptaat...@gmail.com> wrote:

> Hi team ,
>
> We are using nifi copy framework to copy data across clusters but are
> facing issues in terms of duplicates ( can nifi stops or check if table is
> already present in the destination cluster ?) . Moreover if volume is more
> than 1 tb we are finding distcp a better option as nifi flow is taking more
> time or failing .Please advise .
>
> Thanks & Regards,
> Atul Gupta