Did so. The proposal looks perfectly sensible on first reading.

I understand that the patches in
https://issues.apache.org/jira/browse/KAFKA-657 are already in trunk and
scheduled for 0.8.1? Are they going out with 0.8? If not, what's the ETA for 0.8.1?

Either way, I'm going to try my hand at backing this with MySQL and report the 
results here shortly.
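
A minimal sketch of what a pluggable offset store might look like. Everything here is an assumption for illustration: `OffsetStore` and `SqlOffsetStore` are hypothetical names, not part of Kafka's API, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class OffsetStore(ABC):
    """Hypothetical storage interface; not part of Kafka's API."""

    @abstractmethod
    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        """Persist the last consumed offset for one partition."""

    @abstractmethod
    def fetch(self, group: str, topic: str, partition: int) -> Optional[int]:
        """Return the last committed offset, or None if nothing was committed."""


class SqlOffsetStore(OffsetStore):
    """SQL-backed store; sqlite3 is a stand-in for MySQL in this sketch."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS consumer_offsets ("
            " consumer_group TEXT, topic TEXT, partition_id INTEGER,"
            " committed_offset INTEGER,"
            " PRIMARY KEY (consumer_group, topic, partition_id))"
        )

    def commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        # One row per (group, topic, partition); a later commit overwrites it.
        with self.conn:  # commits the transaction on success
            self.conn.execute(
                "INSERT OR REPLACE INTO consumer_offsets VALUES (?, ?, ?, ?)",
                (group, topic, partition, offset),
            )

    def fetch(self, group: str, topic: str, partition: int) -> Optional[int]:
        row = self.conn.execute(
            "SELECT committed_offset FROM consumer_offsets"
            " WHERE consumer_group = ? AND topic = ? AND partition_id = ?",
            (group, topic, partition),
        ).fetchone()
        return row[0] if row else None
```

With an interface like this, a consumer would only ever call `commit` and `fetch`, so ZooKeeper, MySQL, or a Kafka topic could sit behind it without the calling code changing.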

-- 
"If you can't conceal it well, expose it with all your might"
Alex Zuzin


On Monday, May 20, 2013 at 10:24 AM, Neha Narkhede wrote:

> No problem. You can take a look at some of the thoughts we had on improving
> the offset storage here -
> https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management
> Suggestions are welcome.
> 
> Thanks,
> Neha
> 
> 
> On Fri, May 17, 2013 at 2:40 PM, Alex Zuzin <carna...@gmail.com> wrote:
> 
> > Neha,
> > 
> > apologies, I just re-read what I sent and realized my "you" wasn't
> > specific enough - it meant the Kafka team ;).
> > 
> > --
> > "If you can't conceal it well, expose it with all your might"
> > Alex Zuzin
> > 
> > 
> > On Friday, May 17, 2013 at 2:25 PM, Alex Zuzin wrote:
> > 
> > > > Have you considered abstracting offset storage away so people could
> > > > implement their own?
> > > > Would you take a patch if I'd stabbed at it, and if yes, what's the
> > > > process (pardon the n00b)?
> > > 
> > > KCBO,
> > > --
> > > "If you can't conceal it well, expose it with all your might"
> > > Alex Zuzin
> > > 
> > > 
> > > On Friday, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:
> > > 
> > > > > There is no particular need for storing the offsets in zookeeper. In
> > > > > fact with Kafka 0.8, since partitions will be highly available, offsets
> > > > > could be stored in Kafka topics. However, we haven't ironed out the
> > > > > design for this yet.
> > > > 
> > > > Thanks,
> > > > Neha
> > > > 
> > > > 
> > > > > On Fri, May 17, 2013 at 2:19 PM, Scott Clasen <sc...@heroku.com> wrote:
> > > > 
> > > > > > afaik you dont 'have' to store the consumed offsets in zk right, this
> > > > > > is only automatic with some of the clients?
> > > > > >
> > > > > > why not store them in a data store that can write at the rate that you
> > > > > > require?
> > > > > 
> > > > > 
> > > > > > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert
> > > > > > <robert.with...@dish.com> wrote:
> > > > > >
> > > > > > > Update from our OPS team, regarding zookeeper 3.4.x. Given stability,
> > > > > > > adoption of offset batching would be the only remaining bit of work to
> > > > > > > do. Still, I totally understand the restraint for 0.8...
> > > > > > 
> > > > > > 
> > > > > > "As exercise in upgradability of zookeeper, I did a
> > "out-of-the"box"
> > > > > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > > > > Zookeeper and used it for the upgrade.
> > > > > > 
> > > > > > Kafka included version of Zookeeper 3.3.3.
> > > > > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > > > > > 
> > > > > > Running, working great. I did *not* have to wipe out the zookeeper
> > > > > > databases. All data stayed intact.
> > > > > > 
> > > > > > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > > > > > maintenance to a minimum."
> > > > > > 
> > > > > > 
> > > > > > thanks,
> > > > > > rob
> > > > > > 
> > > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Withers, Robert [mailto:robert.with...@dish.com]
> > > > > > Sent: Friday, May 17, 2013 7:38 AM
> > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > 
> > > > > > Fair enough, this is something to look forward to. I appreciate the
> > > > > > restraint you show to stay out of troubled waters. :)
> > > > > > 
> > > > > > thanks,
> > > > > > rob
> > > > > > 
> > > > > > ________________________________________
> > > > > > > From: Neha Narkhede [neha.narkh...@gmail.com]
> > > > > > Sent: Friday, May 17, 2013 7:35 AM
> > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > 
> > > > > > > Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> > > > > > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> > > > > > > club 2 big changes together. So most likely this will be a post 08 item for
> > > > > > > stability purposes.
> > > > > > 
> > > > > > Thanks,
> > > > > > Neha
> > > > > > > On May 17, 2013 6:31 AM, "Withers, Robert" <robert.with...@dish.com>
> > > > > > > wrote:
> > > > > > 
> > > > > > > > Awesome! Thanks for the clarification. I would like to offer my
> > > > > > > > strong vote that this get tackled before a beta, to get it firmly into
> > > > > > > > 0.8. Stabilize everything else to the existing use, but make offset
> > > > > > > > updates batched.
> > > > > > > 
> > > > > > > thanks,
> > > > > > > rob
> > > > > > > ________________________________________
> > > > > > > > From: Neha Narkhede [neha.narkh...@gmail.com]
> > > > > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > > > > > 
> > > > > > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > > > > > is stable and released it will be worth looking into when we can use
> > > > > > > > zookeeper 3.4.x.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Neha
> > > > > > > > On May 16, 2013 10:32 PM, "Rob Withers" <reefed...@gmail.com> wrote:
> > > > > > > 
> > > > > > > > Can a request be made to zookeeper for this feature?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > rob
> > > > > > > > 
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Neha Narkhede [mailto:neha.narkh...@gmail.com]
> > > > > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > > > > To: users@kafka.apache.org (mailto:users@kafka.apache.org)
> > > > > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > > > > > 
> > > > > > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a batch
> > > > > > > > > > write api. So if you commit after every message at a high rate, it will
> > > > > > > > > > be slow and inefficient. Besides it will cause zookeeper performance to
> > > > > > > > > > degrade.
> > > > > > > > > 
> > > > > > > > > Thanks,
> > > > > > > > > Neha
> > > > > > > > > > On May 16, 2013 6:54 PM, "Rob Withers" <reefed...@gmail.com> wrote:
> > > > > > > > > 
> > > > > > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > > > > > It looks to be ~60% slower, with 29 partitions. If a single
> > > > > > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > > > > > 
> > > > > > > > > > thanks,
> > > > > > > > > > rob
> 
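
The advice quoted above, not committing after every message, can be sketched on the client side as follows. This is only an illustration under stated assumptions: `BatchingCommitter` and `commit_fn` are hypothetical names, and `commit_fn` stands for whatever actually persists the offsets (for example a wrapper around the consumer's commitOffsets call). It does not batch the ZooKeeper writes themselves; it only commits less often.

```python
import time
from typing import Callable, Dict, Tuple

# (topic, partition) -> last consumed offset
Offsets = Dict[Tuple[str, int], int]


class BatchingCommitter:
    """Hypothetical helper: commit every N messages or every T seconds."""

    def __init__(
        self,
        commit_fn: Callable[[Offsets], None],
        max_pending: int = 100,
        max_interval_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._commit_fn = commit_fn
        self._max_pending = max_pending
        self._max_interval_s = max_interval_s
        self._clock = clock
        self._pending: Offsets = {}
        self._count = 0
        self._last_flush = clock()

    def record(self, topic: str, partition: int, offset: int) -> None:
        # Keep only the newest offset per partition; older ones are superseded.
        self._pending[(topic, partition)] = offset
        self._count += 1
        overdue = self._clock() - self._last_flush >= self._max_interval_s
        if self._count >= self._max_pending or overdue:
            self.flush()

    def flush(self) -> None:
        # Hand all pending offsets to commit_fn in a single call.
        if self._pending:
            self._commit_fn(dict(self._pending))
            self._pending.clear()
        self._count = 0
        self._last_flush = self._clock()
```

The trade-off is that a consumer that crashes between flushes resumes from the last committed offset, so up to `max_pending` messages may be processed again. Calling `flush()` on shutdown narrows that window.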

