To avoid some consumers not consuming anything, one small trick might be
that if a consumer finds itself without any partitions, it can force a
rebalance by deleting its own registration path in ZK and re-registering.
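
A rough sketch of that trick, assuming the github.com/samuel/go-zookeeper/zk
client; the registration path layout and error handling here are
illustrative, not from any real consumer:

package consumer

import "github.com/samuel/go-zookeeper/zk"

// forceRebalance drops and re-creates this consumer's ephemeral
// registration znode. The delete/create pair fires the children watch
// on .../ids, so every consumer in the group re-runs its
// partition-grabbing loop.
func forceRebalance(conn *zk.Conn, idPath string, data []byte) error {
    // Version -1 means "delete regardless of the node's current version".
    if err := conn.Delete(idPath, -1); err != nil && err != zk.ErrNoNode {
        return err
    }
    // Re-register as an ephemeral node; this fires the watch again.
    _, err := conn.Create(idPath, data, zk.FlagEphemeral, zk.WorldACL(zk.PermAll))
    return err
}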


On Mon, Jan 27, 2014 at 4:32 PM, David Birdsong <david.birds...@gmail.com> wrote:

> On Mon, Jan 27, 2014 at 4:19 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hello David,
> >
> > One thing about using ZK locks to "own" a partition is load balancing. If
> > you are unlucky, some consumer may get all the locks and some may get
> > none, and hence have no partitions to consume.
> >
>
> I've considered this and even encountered it in testing. At our current
> load levels it won't hurt us, but if there's a good solution, I'd rather
> codify smooth consumer balancing.
>
> Got any suggestions?
>
> My thinking thus far is to establish some sort of identity on each consumer
> and derive evenness/oddness or some modulo value from it that induces a
> small delay when encountering particular partition numbers. It's a hacky
> idea, but it's pretty simple and might be good enough for smoothing
> consumer distribution.
>
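> A rough sketch of what I mean; the hash choice and the delay constant are
> arbitrary placeholders, not a real API:
>
> package consumer
>
> import (
>     "hash/fnv"
>     "time"
> )
>
> // lockDelay staggers when a consumer first tries to lock a partition:
> // consumers whose identity-hash parity matches the partition's parity
> // go immediately, the rest wait briefly, so different consumers race
> // for different partitions first. The parity rule is arbitrary.
> func lockDelay(consumerID string, partition int32) time.Duration {
>     h := fnv.New32a()
>     h.Write([]byte(consumerID))
>     if int32(h.Sum32()%2) == partition%2 {
>         return 0
>     }
>     return 250 * time.Millisecond
> }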
>
> > Also you may need some synchronization between the consumer threads and
> > the offset thread. For example, when an event fires and the consumers
> > need to re-try grabbing the locks, each one needs to first stop its
> > current fetchers, commit offsets, and then start owning new partitions.
> >
>
> This is the current design and what I have implemented so far. The last
> thread to exit is the offset thread; it has a direct communication channel
> to the consumer threads, so it waits for those channels to be closed
> before its last flush and exit.
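>
> Roughly, the plumbing looks like this; the names and signatures are
> illustrative, not Sarama's API:
>
> package consumer
>
> import "sync"
>
> type commit struct {
>     partition int32
>     offset    int64
> }
>
> // runConsumers launches one goroutine per owned partition plus an
> // offset thread that exits last: commits is closed only after every
> // consumer goroutine has returned, and the offset thread does one
> // final flush after draining it.
> func runConsumers(partitions []int32, consume func(int32, chan<- commit),
>     flush func(map[int32]int64)) {
>     commits := make(chan commit)
>     var wg sync.WaitGroup
>     for _, p := range partitions {
>         wg.Add(1)
>         go func(p int32) {
>             defer wg.Done()
>             consume(p, commits) // reads the partition, reports offsets
>         }(p)
>     }
>     go func() {
>         wg.Wait()      // every consumer goroutine has exited...
>         close(commits) // ...so it is safe to release the offset thread
>     }()
>     pending := make(map[int32]int64)
>     for c := range commits {
>         pending[c.partition] = c.offset
>     }
>     flush(pending) // the offset thread's last flush before exit
> }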
>
>
> > Guozhang
> >
> >
> Thanks for the input!
>
>
> >
> > > On Mon, Jan 27, 2014 at 3:03 PM, David Birdsong
> > > <david.birds...@gmail.com> wrote:
> >
> > > Hey All, I've been cobbling together a high-level consumer for Go,
> > > building on top of Shopify's Sarama package, and wanted to run the
> > > basic design by the list and get some feedback or pointers on things
> > > I've missed or will eventually encounter on my own.
> > >
> > > I'm using ZooKeeper to coordinate topic-partition owners for consumer
> > > members in each consumer group. I followed the znode layout that's
> > > apparent from watching the console consumer:
> > >
> > > <consumer_root>/<consumer_group_name>/{offsets,owners,ids}.
> > >
> > > The consumer uses an outer loop to discover the partition list for a
> > > given topic, attempts to grab a ZooKeeper lock on each (topic,
> > > partition) tuple, and then launches a goroutine for each partition it
> > > successfully locks to read that partition's stream.
> > >
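> > > In rough Go, with the helpers passed in as stand-ins for the real
> > > ZK/Sarama plumbing (none of these names are real APIs):
> > >
> > > package consumer
> > >
> > > // consumeLoop is the outer loop sketched above: grab what locks we
> > > // can, fan out one goroutine per owned partition, then tear it all
> > > // down and retry whenever a watch fires.
> > > func consumeLoop(topic string,
> > >     partitionsFor func(string) []int32,
> > >     acquireLock func(string, int32) bool,
> > >     consumePartition func(string, int32, <-chan struct{}),
> > >     watchFired func() <-chan struct{}) {
> > >     for {
> > >         var owned []int32
> > >         for _, p := range partitionsFor(topic) {
> > >             if acquireLock(topic, p) { // znode under owners/<topic>
> > >                 owned = append(owned, p)
> > >             }
> > >         }
> > >         done := make(chan struct{})
> > >         for _, p := range owned {
> > >             go consumePartition(topic, p, done) // one reader each
> > >         }
> > >         <-watchFired() // block until a children watch event fires
> > >         close(done)    // stop fetchers; flush offsets, release locks
> > >     }
> > > }
> > >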
> > > The outer loop continues to watch for children events on either of:
> > >
> > > <consumer_root>/<consumer_group>/owners/<topic_name>
> > > <kafka_root>/brokers/topics/<topic_name>/partitions
> > >
> > > Any watch event that fires causes all offset data to be flushed, all
> > > consumer handles to be closed, and the goroutines watching
> > > topic-partitions to exit. The loop is then restarted.
> > >
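> > > With the samuel/go-zookeeper client, that watch step looks roughly
> > > like this; the path variables and logging are assumptions:
> > >
> > > package consumer
> > >
> > > import (
> > >     "log"
> > >
> > >     "github.com/samuel/go-zookeeper/zk"
> > > )
> > >
> > > // waitForChange blocks until a children event fires on either path.
> > > // ChildrenW registers a one-shot watch and returns its event channel.
> > > func waitForChange(conn *zk.Conn, ownersPath, partitionsPath string) error {
> > >     _, _, ownerEvents, err := conn.ChildrenW(ownersPath)
> > >     if err != nil {
> > >         return err
> > >     }
> > >     _, _, partEvents, err := conn.ChildrenW(partitionsPath)
> > >     if err != nil {
> > >         return err
> > >     }
> > >     select {
> > >     case ev := <-ownerEvents:
> > >         log.Printf("owners changed: %v", ev)
> > >     case ev := <-partEvents:
> > >         log.Printf("partitions changed: %v", ev)
> > >     }
> > >     return nil
> > > }
> > >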
> > > Another thread reads topic-partition-offset data and flushes the
> > > offsets to:
> > >
> > > <consumer_root>/<consumer_group>/offsets/<topic_name>/<partition_number>
> > >
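> > > A sketch of that flush with the same client, assuming /consumers as
> > > the consumer root; Set overwrites the znode, and the first flush for
> > > a partition has to create it:
> > >
> > > package consumer
> > >
> > > import (
> > >     "fmt"
> > >     "strconv"
> > >
> > >     "github.com/samuel/go-zookeeper/zk"
> > > )
> > >
> > > func flushOffset(conn *zk.Conn, group, topic string, partition int32,
> > >     offset int64) error {
> > >     path := fmt.Sprintf("/consumers/%s/offsets/%s/%d", group, topic,
> > >         partition)
> > >     data := []byte(strconv.FormatInt(offset, 10))
> > >     _, err := conn.Set(path, data, -1) // -1 overwrites any version
> > >     if err == zk.ErrNoNode { // first flush for this partition
> > >         _, err = conn.Create(path, data, 0, zk.WorldACL(zk.PermAll))
> > >     }
> > >     return err
> > > }
> > >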
> > > Have I oversimplified or missed any critical steps?
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang
