What support does the new API provide for offset management?

On Tue, Nov 24, 2015 at 1:09 PM, Thomas Weise <[email protected]>
wrote:

> This discussion applies to Kafka 0.8.x only?
>
> With the new consumer API, offset management can be delegated, so we won't
> need this component any longer. Each partition can then record the offset
> for the committed window.
>
> Thomas
>
> On Tue, Nov 24, 2015 at 12:59 PM, Siyuan Hua <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I'd like your input on the design of the OffsetManager for the Kafka
> > input operator.
> >
> > First of all, some background on the OffsetManager and why we may need it.
> > The OffsetManager is a plugin in the Kafka input operator for customized
> > offset management. Its API is called whenever the consumer offset (the
> > message that has been emitted, along with the window id) changes.
> > The OffsetManager is different from offset checkpointing; we still use the
> > checkpointed offset to recover a node from failure.
> >
> > There are two reasons we need the OffsetManager:
> >   1) Users might want to store offsets in their own way (HDFS, ZooKeeper,
> > database, etc.)
> >   2) Users might want to continue consuming at application restart.
> >
> > In the current version, the OffsetManager works in a central mode: each
> > partition only reports its offset(s) to the StatsListener, and the listener
> > calls the OffsetManager to update the offsets.
> > The other possibility is to make the OffsetManager work in a distributed
> > mode, where each partition updates its offset(s) on its own.
> >
> > The distributed mode is more straightforward, but the developer needs to
> > know it's distributed: you have to manage writes from multiple nodes and
> > handle collisions on your own. On the other hand, it is closer to real time
> > and is not exposed to a failure of the stats reporter.
> >
> > Any input is welcome, thanks!
> >
> > Regards,
> > Siyuan
> >
> >
> >
> >
> >
> > On Mon, Nov 16, 2015 at 10:53 AM, Siyuan Hua <[email protected]>
> > wrote:
> >
> > > I will be working on rewriting the Kafka input operator.
> > >
> > > Here is the ticket
> > > https://malhar.atlassian.net/browse/MLHR-1904
> > >
> > > Here are some comments on the ticket
> > >
> > > The RC2 is out here
> > > https://people.apache.org/~junrao/kafka-0.9.0.0-candidate2/
> > >
> > > We will keep most features of the old input operator, but the internal
> > > mechanism will change, for example, using the new API to refresh the
> > > metadata.
> > > Bugs that will be fixed:
> > >
> > >    - Synchronized offset checkpoint
> > >    - Transient OffsetManager
> > >
> > > New features:
> > >
> > >    - Support customized partition schema
> > >    - Default OffsetManager using new
> > >
> > > Improvements:
> > >
> > >    - Add window id and application name to OffsetManager interface
> > >    - Support multi-topic
> > >    - Easy configuration
> > >
> > >
> > > Please leave thoughts here or on the ticket. Thanks
> > >
> > > Best,
> > > Siyuan
> > >
> >
>
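
To make the central versus distributed trade-off in the quoted message more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `OffsetManager` interface shape, the method names `updateOffsets` and `loadInitialOffsets`, and the in-memory store (standing in for HDFS, ZooKeeper, or a database) are hypothetical and are not the operator's actual API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Demo driver: shows the two calling patterns against the same store.
class OffsetManagerSketch {
    public static void main(String[] args) {
        OffsetManager manager = new InMemoryOffsetManager();

        // Central mode: a single caller (e.g. a stats listener) aggregates
        // the offsets reported by all partitions and writes them once.
        Map<Integer, Long> aggregated = new HashMap<>();
        aggregated.put(0, 100L);
        aggregated.put(1, 250L);
        manager.updateOffsets("myApp", 42L, aggregated);

        // Distributed mode: each partition writes its own offset, so the
        // store has to cope with several writers and out-of-order updates.
        manager.updateOffsets("myApp", 43L, Collections.singletonMap(0, 120L));
        manager.updateOffsets("myApp", 43L, Collections.singletonMap(1, 240L)); // stale report

        System.out.println(manager.loadInitialOffsets("myApp"));
    }
}

// Hypothetical plugin interface; the real interface may look different.
interface OffsetManager {
    // Called when the emitted offsets change for a given window.
    void updateOffsets(String applicationName, long windowId,
                       Map<Integer, Long> offsetsByPartition);

    // Called at application start to decide where to resume consuming.
    Map<Integer, Long> loadInitialOffsets(String applicationName);
}

// In-memory stand-in for a user-chosen store (HDFS, ZooKeeper, database).
class InMemoryOffsetManager implements OffsetManager {
    private final Map<String, Map<Integer, Long>> store = new ConcurrentHashMap<>();

    @Override
    public void updateOffsets(String applicationName, long windowId,
                              Map<Integer, Long> offsetsByPartition) {
        Map<Integer, Long> current =
            store.computeIfAbsent(applicationName, k -> new ConcurrentHashMap<>());
        // Keep the larger offset per partition so a late or stale write
        // cannot move a partition backwards. windowId is ignored in this sketch.
        offsetsByPartition.forEach((partition, offset) ->
            current.merge(partition, offset, Long::max));
    }

    @Override
    public Map<Integer, Long> loadInitialOffsets(String applicationName) {
        // Return a copy so callers cannot mutate the stored state.
        return new HashMap<>(
            store.getOrDefault(applicationName, Collections.<Integer, Long>emptyMap()));
    }
}
```

Keeping the per-partition maximum is just one simple way to handle colliding writes in the distributed case; a real store would need its own conflict handling.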
