On Mon, Jan 27, 2014 at 4:19 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hello David,
>
> One thing about using ZK locks to "own" a partition is load balancing. If
> you are unlucky some consumer may get all the locks and some may get none,
> hence have no partitions to consume.
>

I've considered this and even encountered it in testing. At our current
load levels it won't hurt us, but if there's a good solution, I'd rather
codify smooth consumer balancing.

Got any suggestions?

My thinking thus far is to establish some sort of identity on the consumer
and derive an evenness, oddness, or some modulo value that induces a small
delay when encountering particular partition numbers. It's a hacky idea,
but it's pretty simple and might be good enough for smoothing consumer
assignment.


> Also you may need some synchronization between the consumer thread with the
> offset thread. For example, when an event is fired and the consumers need
> to re-try grabbing the locks, it needs to first stop current fetchers,
> commit offsets, and then start owning new partitions.
>

This is the current design and what I have implemented so far. The last
thread to exit is the offset thread; it has a direct communication channel
to each consumer thread, so it waits for those channels to be closed before
its last flush and exit.
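For what it's worth, the channel coordination looks roughly like this. A
simplified sketch, not the actual code: the types are illustrative, and
the flush callback stands in for the write to the zookeeper offsets znode:

```go
package main

import (
	"fmt"
	"sync"
)

// offset carries a (partition, offset) pair from a consumer goroutine
// to the offset-flushing goroutine. Illustrative types only.
type offset struct {
	partition int
	value     int64
}

// runOffsetFlusher drains offsets until every consumer channel is closed,
// then performs one last flush of anything still buffered and returns.
func runOffsetFlusher(consumers []<-chan offset, flush func(offset)) {
	merged := make(chan offset)
	var wg sync.WaitGroup
	for _, c := range consumers {
		wg.Add(1)
		go func(c <-chan offset) {
			defer wg.Done()
			for o := range c { // exits when the consumer closes its channel
				merged <- o
			}
		}(c)
	}
	go func() {
		wg.Wait() // every consumer channel has been closed
		close(merged)
	}()
	for o := range merged {
		flush(o) // the final flush happens before this function returns
	}
}

func main() {
	c := make(chan offset, 2)
	c <- offset{0, 42}
	c <- offset{0, 43}
	close(c) // consumer thread signals exit by closing its channel
	runOffsetFlusher([]<-chan offset{c}, func(o offset) {
		fmt.Printf("flushed partition %d at offset %d\n", o.partition, o.value)
	})
}
```

Closing the channel is the exit signal, so the offset thread can't return
before every consumer goroutine has finished handing off its offsets.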


> Guozhang
>
>
Thanks for the input!


>
> On Mon, Jan 27, 2014 at 3:03 PM, David Birdsong <david.birds...@gmail.com
> >wrote:
>
> > Hey All, I've been cobbling together a high-level consumer for golang
> > building on top of Shopify's Sarama package and wanted to run the basic
> > design by the list and get some feedback or pointers on things I've missed
> > or will eventually encounter on my own.
> >
> > I'm using zookeeper to coordinate topic-partition owners for consumer
> > members in each consumer group. I followed the znode layout that's apparent
> > from watching the console consumer.
> >
> > <consumer_root>/<consumer_group_name>/{offsets,owners,ids}.
> >
> > The consumer uses an outer loop to discover the partition list for a given
> > topic, attempts to grab a zookeeper lock on each (topic, partition) tuple,
> > and then for each (topic, partition) it successfully locks, launches a
> > thread (goroutine) for each partition to read the partition stream.
> >
> > The outer loop continues to watch for children events either of:
> >
> >
> > <consumer_root>/<consumer_group>/owners/<topic_name>
> > <kafka_root>/brokers/topics/<topic_name>/partitions
> >
> > ...any watch event that fires causes all offset data and consumer handles
> > to be flushed and closed, goroutines watching topic-partitions exit. The
> > loop is restarted.
> >
> > Another thread reads topic-partition-offset data and flushes the offset
> > to: <consumer_root>/<consumer_group>/offsets/<topic_name>/<partition_number>
> >
> > Have I oversimplified or missed any critical steps?
> >
>
>
>
> --
> -- Guozhang
>
