Re: Comparison of kafka and Hedwig?

Benjamin Reed Thu, 10 Feb 2011 10:37:03 -0800

i'm not sure cross posting is such a good idea, it is probably better to discuss these comparisons relative to a project's viewpoint.

the problem space is similar, and i imagine as kafka develops the implementations may become even closer.


there are a couple of obvious things:

1) hedwig has strong durability guarantees. kafka can lose data due to failures.
2) hedwig was designed for lots of topics (100,000s) with low fan out (few subscribers/publishers). i think kafka is designed for a few topics with lots of subscribers and publishers.
3) hedwig tracks subscribers' progress for gc of publishes. kafka uses time based gc.
4) hedwig will replay messages to subscribers starting from the last message they explicitly consumed. kafka allows subscribers to replay messages that they have already consumed.
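
To make points 3 and 4 concrete, here is a minimal sketch of the two gc policies and of replay from a chosen position. It is a toy in-memory model, not code from either project: the `Message` type and the function names are hypothetical, and what happens when a topic has no subscribers is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Message:
    seq: int          # position of the message in the topic log
    timestamp: float  # publish time, in seconds


def gc_by_subscriber_progress(log, consumed):
    """Progress-based gc (the hedwig side of point 3).

    `consumed` maps subscriber name -> last seq it explicitly consumed.
    A message is kept while at least one subscriber has not consumed it.
    """
    if not consumed:
        # Assumption for this sketch only: with no subscribers, keep everything.
        return list(log)
    low_water = min(consumed.values())
    return [m for m in log if m.seq > low_water]


def gc_by_time(log, now, retention_seconds):
    """Time-based gc (the kafka side of point 3).

    A message is kept while it is younger than the retention window,
    regardless of who has consumed it.
    """
    return [m for m in log if now - m.timestamp <= retention_seconds]


def replay(log, after_seq):
    """Point 4: deliver every retained message with seq greater than after_seq."""
    return [m.seq for m in log if m.seq > after_seq]


log = [Message(seq=i, timestamp=i * 10.0) for i in range(1, 6)]
consumed = {"a": 2, "b": 4}

kept_by_progress = gc_by_subscriber_progress(log, consumed)
kept_by_time = gc_by_time(log, now=60.0, retention_seconds=25.0)
```

In this model a progress-tracking broker would always call `replay` with the subscriber's own last consumed seq, while a broker with time based gc can let the subscriber pass any earlier position, as long as the messages are still inside the retention window.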


there are probably others.

i really like the kafka design choices made for 3 and 4. hedwig will work on scaling to more subscribers/publishers per topic. i imagine, if needed, kafka will work on their durability guarantees and support for a large number of topics.


ben

On 02/10/2011 01:27 AM, Thomas Koch wrote:

Flavio Junqueira:

Thomas, Did you mean to say Hedwig instead of BookKeeper?

Oh sh..ugar yeah. Thanks. Start over again:

I've just had a look at the kafka slides[1] from January HUG. It seems to me,
that Hedwig[2] and kafka are quite similar in their problem space. Is that
so? What are notable differences?
(Kafka is written in scala and therefore must be a lot cooler :-)

[1] <http://developer.yahoo.com/blogs/hadoop/posts/2011/02/hadoop-user-group-january-2011-recap/>
[2] http://cwiki.apache.org/confluence/display/ZOOKEEPER/HedWig

Thomas Koch, http://www.koch.ro
