Anup,

Cross-posting this to the users list since it is a great user question.

The answer is: absolutely.

There are a couple of details to iron out to get started.  I'll ask
the questions and explain why.  First, some background:
- Kafka ideally wants the small events themselves.
- HDFS wants those events bundled together, typically into files close
to your HDFS block size.
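The bundling side of that trade-off can be sketched roughly like this: accumulate small events in memory and roll a new HDFS-bound file whenever the buffer approaches the block size. This is an illustrative sketch only, not NiFi's implementation; the `Bundler` class, the 128 MB figure, and the newline delimiter are all assumptions for the example.

```python
# Hypothetical sketch: small events go to Kafka one at a time, while
# HDFS-bound output is rolled into bundles near the block size.
BLOCK_SIZE = 128 * 1024 * 1024  # assumed 128 MB HDFS block size

class Bundler:
    """Accumulates small events and emits roughly block-sized bundles."""

    def __init__(self, target_size=BLOCK_SIZE):
        self.target_size = target_size
        self.buffer = []
        self.buffered_bytes = 0

    def add(self, event: bytes):
        """Buffer one event; return a completed bundle once full, else None."""
        self.buffer.append(event)
        self.buffered_bytes += len(event)
        if self.buffered_bytes >= self.target_size:
            return self.flush()
        return None

    def flush(self):
        """Emit whatever is buffered as one bundle (e.g. one HDFS file)."""
        bundle = b"\n".join(self.buffer)  # assumed newline-delimited events
        self.buffer, self.buffered_bytes = [], 0
        return bundle
```

In NiFi terms this is the role a merge step plays between the per-event side of a flow and the HDFS-writing side.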

The questions:
- Where is this 1 TB dataset living today?  This will help determine
the best way to pull the dataset in.

- What is the current nature of the dataset?  Is it already bundled
into large files, or is it a series of tiny messages?  Does it need to
be split or merged?

- What is the format of the data?  Is it something that can easily be
split or merged, or will it require special processing to do so?

These are good to start with.

Thanks
Joe


On Tue, Jun 2, 2015 at 10:41 AM, anup s <[email protected]> wrote:
> Suppose, I have 1 TB of data that I need to backup/sync to a HDFS location
> and then be passed onto a Kafka, is there a way out to do that?
>
>
>
>
> --
> View this message in context: 
> http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Fetch-change-list-tp1351p1706.html
> Sent from the Apache NiFi (incubating) Developer List mailing list archive at 
> Nabble.com.