Anup, cross-posting this to users since it is a great user question.
The answer is: absolutely. There are a couple of details to iron out to get started. I'll ask the questions and explain why.

First, some background:
- Kafka ideally wants the small events themselves.
- HDFS wants those events bundled together, typically sized along whatever block size you have in HDFS.

The questions:
- Where is this 1 TB dataset living today? This will help determine the best way to pull the dataset in.
- What is the current nature of the dataset? Is it already in large bundles as files, or is it a series of tiny messages? Does it need to be split/merged/etc.?
- What is the format of the data? Is it something that can easily be split/merged, or will it require special processing to do so?

These are good questions to start with.

Thanks,
Joe

On Tue, Jun 2, 2015 at 10:41 AM, anup s <[email protected]> wrote:
> Suppose I have 1 TB of data that I need to backup/sync to an HDFS location
> and then be passed on to Kafka. Is there a way to do that?
>
> --
> View this message in context:
> http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/Fetch-change-list-tp1351p1706.html
> Sent from the Apache NiFi (incubating) Developer List mailing list archive at
> Nabble.com.
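To illustrate the "bundle small events for HDFS" point above: the idea is to pack many tiny records into files sized near the HDFS block size before landing them. A minimal sketch in Python follows; the `bundle` helper and the sizes are hypothetical (a demo target stands in for a real 128 MB block), and this is not NiFi's MergeContent, just the underlying batching idea.

```python
# Hypothetical sketch: pack many small byte records into bundles whose
# total size stays at or under a target (e.g. an HDFS block size).
# Names and sizes here are illustrative, not a NiFi or HDFS API.

def bundle(records, target_size):
    """Group byte records into bundles of at most target_size bytes.
    A single record larger than target_size gets its own bundle."""
    bundles, current, current_size = [], [], 0
    for rec in records:
        # Start a new bundle when adding this record would overflow.
        if current and current_size + len(rec) > target_size:
            bundles.append(b"".join(current))
            current, current_size = [], 0
        current.append(rec)
        current_size += len(rec)
    if current:
        bundles.append(b"".join(current))
    return bundles

# Demo with a tiny target standing in for the HDFS block size.
msgs = [b"a" * 40, b"b" * 40, b"c" * 40, b"d" * 130]
out = bundle(msgs, target_size=100)
print([len(b) for b in out])  # → [80, 40, 130]
```

In practice NiFi handles this kind of batching for you, but the shape of the problem is the same: small messages flow individually toward Kafka, while writes to HDFS are accumulated into block-sized files.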
