Basically we are using ZooKeeper to coordinate between a producer and a consumer. 
When the consumer comes up, it needs a recap from the producer. The producer 
sends this recap to the consumer through Kafka in chunks. Ideally we wanted the 
consumer to be able to jump back to the start of the last recap in the queue if 
the producer is down and the last recap was recent. I think we have come up 
with some other ways around this that don't rely on "seek" functionality, but I 
was just wondering if anyone else had done something similar already. It seems 
that the new implementation you mentioned would provide this functionality.
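
The "jump back" decision described above can be sketched without any Kafka 
dependency. This is only an illustration under assumptions that are not in the 
thread: each chunk is assumed to carry a flag marking the first chunk of a 
recap and a producer timestamp, and the names RecapLocator, Chunk, seekTarget 
and maxAgeMs are hypothetical. The returned offset is the value a plain 
KafkaConsumer would pass to seek().

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper: decides where a restarting consumer should resume. */
final class RecapLocator {

    /** Assumed metadata for one recap chunk in the topic (not a Kafka type). */
    static final class Chunk {
        final long offset;          // Kafka offset of the record
        final boolean startsRecap;  // true only for the first chunk of a recap
        final long timestampMs;     // time the producer wrote the chunk

        Chunk(long offset, boolean startsRecap, long timestampMs) {
            this.offset = offset;
            this.startsRecap = startsRecap;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Returns the offset of the first chunk of the most recent recap, but only
     * if that recap started no more than maxAgeMs before nowMs. An empty result
     * means no recap exists or the last one is too old to reuse.
     * The list is assumed to be ordered by ascending offset.
     */
    static OptionalLong seekTarget(List<Chunk> chunks, long nowMs, long maxAgeMs) {
        // Walk backwards so the first start marker found is the latest recap.
        for (int i = chunks.size() - 1; i >= 0; i--) {
            Chunk c = chunks.get(i);
            if (c.startsRecap) {
                boolean recent = nowMs - c.timestampMs <= maxAgeMs;
                return recent ? OptionalLong.of(c.offset) : OptionalLong.empty();
            }
        }
        return OptionalLong.empty();
    }
}
```

An empty result means there is no recent recap to replay, so the consumer would 
wait for the producer to come back and send a fresh one.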

From: user@storm.apache.org 
Subject: Re: Seek in KafkaSpout

I'm curious about your use case for this. It seems odd to need to adjust it on 
the fly while a topology is running, or I've misunderstood you!

If you store your consumer state in ZooKeeper, you CAN adjust it between 
topology deploys by manually modifying the stored state. I've done this to 
deal with maintenance or service issues by rolling back to a specific point in 
time. I'm unsure whether you're able to do this when consumer state is stored 
within Kafka itself.

As a side note, I've been toying with a Kafka spout implementation that allows 
dynamically consuming arbitrary ranges from topics. It is to be open sourced 
soon.

Stephen

On Fri, Sep 29, 2017 at 8:06 AM, Mitchell Rathbun (BLOOMBERG/ 731 LEX) 
<mrathb...@bloomberg.net> wrote:

Looking through the documentation, it seems that KafkaSpout does not expose any 
way to set the offset the spout reads from after the initial poll. This 
functionality is supported in KafkaConsumer through the seek() method. Am I 
correct that this isn't supported? Has anyone found a way to mimic the behavior 
of seek() with KafkaSpout?

