What do you mean by a streaming way? The logic to push to S3 will live in your consumer, so it depends entirely on how you want to read and store the messages. I think that's an easier way to do what you want than trying to back up Kafka and then read the messages from the backup — I'm not even sure that's possible.
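To make it concrete, here's a minimal sketch of what such a consumer could look like in Java, assuming the AWS SDK for Java (v1) and Kafka's Java consumer API. The topic name, bucket name, and batch size are made up for illustration; a real implementation would also want per-partition batching, rebalance handling, and retries. It buffers newline-delimited JSON and commits offsets only after a batch has been written to S3, which gives at-least-once delivery (duplicates are possible on restart):

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Properties;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class S3BackupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: your broker address
        props.put("group.id", "s3-backup");               // dedicated group, independent of other consumers
        props.put("enable.auto.commit", "false");         // commit manually, only after the S3 upload succeeds
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient(); // credentials/region from the default chain
        StringBuilder batch = new StringBuilder();
        long firstOffset = -1;

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events")); // hypothetical topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    if (firstOffset < 0) firstOffset = record.offset();
                    batch.append(record.value()).append('\n'); // newline-delimited JSON
                }
                // Flush once the buffer passes ~5 MB; tune to taste.
                if (batch.length() > 5 * 1024 * 1024) {
                    byte[] bytes = batch.toString().getBytes(StandardCharsets.UTF_8);
                    ObjectMetadata meta = new ObjectMetadata();
                    meta.setContentLength(bytes.length);
                    // Timestamp + first offset keeps keys unique; a real job would
                    // also key by partition.
                    String key = "kafka-backup/events/" + System.currentTimeMillis()
                            + "-" + firstOffset + ".json";
                    s3.putObject("my-backup-bucket", key, // hypothetical bucket
                            new ByteArrayInputStream(bytes), meta);
                    consumer.commitSync(); // commit only after the batch is durable in S3
                    batch.setLength(0);
                    firstOffset = -1;
                }
            }
        }
    }
}

Run one instance (or several in the same group, for parallelism across partitions) and it will just tail the topic and archive everything it sees, completely independent of your other consumers.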
On Tue, Dec 6, 2016 at 5:11 PM, Aseem Bansal <asmbans...@gmail.com> wrote:

> I get that we can read them and store them in batches, but is there some
> streaming way?
>
> On Tue, Dec 6, 2016 at 5:09 PM, Aseem Bansal <asmbans...@gmail.com> wrote:
>
> > Because we need to do exploratory data analysis and machine learning, we
> > need to back up the messages somewhere so that the data scientists can
> > query/load them.
> >
> > So we need something like a router that just opens a new consumer group
> > which keeps storing the messages to S3.
> >
> > On Tue, Dec 6, 2016 at 5:05 PM, Sharninder Khera <sharnin...@gmail.com>
> > wrote:
> >
> >> Why not just have a parallel consumer read all messages from whichever
> >> topics you're interested in and store them wherever you want to? You
> >> don't need to "back up" Kafka messages.
> >>
> >> _____________________________
> >> From: Aseem Bansal <asmbans...@gmail.com>
> >> Sent: Tuesday, December 6, 2016 4:55 PM
> >> Subject: Storing Kafka Message JSON to deep storage like S3
> >> To: <users@kafka.apache.org>
> >>
> >> Hi
> >>
> >> Has anyone stored Kafka JSON messages in deep storage like S3? We are
> >> looking to back up all of our raw Kafka JSON messages for exploration.
> >> S3, HDFS, and MongoDB come to mind initially.
> >>
> >> I know the messages can be kept in Kafka itself, but that does not seem
> >> like a good option: we won't be able to query them, and the machines
> >> running Kafka will have to be scaled up as the data grows. With
> >> something like S3, we won't have to manage that.

--
Sharninder