Hi Soumitra,
We're working on that. The idea here is to use Kafka to get the brokers'
information for the topic and then use the Kafka client to find the
corresponding offsets on the new cluster (
https://jeqo.github.io/post/2017-01-31-kafka-rewind-consumers-offset/). You
need Kafka >= 0.10.1.0 because it supports the timestamp-based index.
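Roughly, the lookup could look like the sketch below in Scala (untested; the
broker list, topic name, group id and resume timestamp are placeholders for
illustration, not values from our setup). It uses the standard KafkaConsumer
offsetsForTimes API from 0.10.1.0 to translate a timestamp into per-partition
offsets on the new cluster, which you could then feed to the Spark Streaming
direct stream as fromOffsets.

import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer

// Hypothetical values: brokers of the backup cluster, topic name, and the
// timestamp (ms) of the last record known to be processed on the old cluster.
val backupBrokers = "backup-broker-1:9092,backup-broker-2:9092"
val topic = "events"
val resumeTimestampMs = System.currentTimeMillis() - 60 * 60 * 1000L

val props = new Properties()
props.put("bootstrap.servers", backupBrokers)
props.put("group.id", "offset-lookup")
props.put("key.deserializer", classOf[StringDeserializer].getName)
props.put("value.deserializer", classOf[StringDeserializer].getName)

val consumer = new KafkaConsumer[String, String](props)
try {
  // Discover the topic's partitions on the new cluster.
  val partitions = consumer.partitionsFor(topic).asScala
    .map(info => new TopicPartition(info.topic, info.partition))

  // offsetsForTimes (Kafka >= 0.10.1.0) returns, for each partition, the
  // earliest offset whose record timestamp is >= the given timestamp.
  val search = partitions
    .map(tp => tp -> java.lang.Long.valueOf(resumeTimestampMs))
    .toMap.asJava
  val found = consumer.offsetsForTimes(search).asScala

  // These offsets could be passed as fromOffsets to the Spark Streaming
  // Kafka direct stream to resume reading on the new cluster.
  val fromOffsets: Map[TopicPartition, Long] = found.collect {
    case (tp, oat) if oat != null => tp -> oat.offset()
  }.toMap

  fromOffsets.foreach { case (tp, off) => println(s"$tp -> $off") }
} finally {
  consumer.close()
}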

2017-03-28 5:24 GMT+07:00 Soumitra Johri <soumitra.siddha...@gmail.com>:

> Hi, did you guys figure it out?
>
> Thanks
> Soumitra
>
> On Sun, Mar 5, 2017 at 9:51 PM nguyen duc Tuan <newvalu...@gmail.com>
> wrote:
>
>> Hi everyone,
>> We are deploying a Kafka cluster for ingesting streaming data. But
>> sometimes some of the nodes on the cluster have trouble (a node dies, the Kafka
>> daemon is killed...). Recovering data in Kafka can be very slow;
>> it takes several hours to recover from a disaster. I saw a slide deck here
>> suggesting the use of multiple data centers (https://www.slideshare.net/
>> HadoopSummit/building-largescale-stream-infrastructures-across-
>> multiple-data-centers-with-apache-kafka). But I wonder, how can we
>> detect the problem and switch between data centers in Spark Streaming? Since
>> Kafka 0.10.1 supports the timestamp index, how can we seek to the right offsets?
>> Are there any open-source libraries out there that support handling this
>> problem on the fly?
>> Thanks.
>>
>