Re: Comparing Pulsar and Kafka: unified queuing and streaming

Marina Popova Tue, 05 Dec 2017 08:15:27 -0800

Hi,
I don't think it would be a great idea to start modifying the very
foundation of Kafka's design to accommodate more and more extra use cases.
Kafka became so widely adopted and popular because its creators made a
brilliant decision to make it a "dumb broker - smart consumer" type of
system, where there are minimal to no dependencies between Kafka brokers and
Consumers. This is what makes Kafka blazingly fast and truly scalable - able
to handle thousands of Consumers with no impact on performance.


One unfortunate consequence of becoming so popular is that more and more
people are trying to fit Kafka into their architectures not because it really
fits, but because everybody else is doing so :) And this leads to many
requests to add richer and richer functionality to Kafka - like
transactional messages, more complex acks, centralized consumer management, etc.

If you really need those features - there are other systems that are designed
for that.

I truly worry that if all those changes are added to Core Kafka - it will
become just another "do it all" enterprise-level monster that will be able to
do it all, but at the price of mediocre performance and a ten-fold increase
in complexity (and, thus, in management effort and the possibility of bugs).
Sure, there has to be innovation and new features added - but maybe those
that require major changes to Kafka's core principles should go into separate
frameworks, plug-ins (like Connectors) or something along those lines, rather
than packing it all into Core Kafka.

Just my 2 cents :)

Marina

> -------- Original Message --------
> Subject: Re: Comparing Pulsar and Kafka: unified queuing and streaming
> Local Time: December 4, 2017 2:56 PM
> UTC Time: December 4, 2017 7:56 PM
> From: ja...@confluent.io
> To: dev@kafka.apache.org
> Kafka Users <us...@kafka.apache.org>
>
> Hi Khurrum,
>
> Thanks for sharing the article. I think one interesting aspect of Pulsar
> that stands out to me is its notion of a subscription and how it impacts
> message retention. In Kafka, consumers are more loosely coupled and
> retention is enforced independently of consumption. There are some
> scenarios I can imagine where the tighter coupling might be beneficial. For
> example, in Kafka Streams, we often use intermediate topics to store the
> data in one stage of the topology's computation. These topics are
> exclusively owned by the application and once the messages have been
> successfully received by the next stage, we do not need to retain them
> further. But since consumption is independent of retention, we either have
> to choose a large retention time and deal with some temporary storage waste
> or we use a low retention time and possibly lose some messages during an
> outage.
>
> We have solved this problem to some extent in Kafka by introducing an API
> to delete the records in a partition up to a certain offset, but this
> effectively puts the burden of this use case on clients. It would be
> interesting to consider whether we could do something like Pulsar in the
> Kafka broker. For example, we have a consumer group coordinator which is
> able to track the progress of the group through its committed offsets. It
> might be possible to extend it to automatically delete records in a topic
> after offsets are committed if the topic is known to be exclusively owned
> by the consumer group. We already have the DeleteRecords API that we need,
> so maybe this is "just" a matter of some additional topic metadata. I'd be
> interested to hear whether this kind of use case is common among our users.
>
> -Jason
>
> On Sun, Dec 3, 2017 at 10:29 PM, Khurrum Nasim khurrumnas...@gmail.com
> wrote:
>
>> Dear Kafka Community,
>> I happened to read this blog post comparing the messaging models of
>> Apache Pulsar and Apache Kafka. It sounds interesting. Apache Pulsar claims
>> to unify streaming (Kafka) and queuing (RabbitMQ) in one unified API.
>> Pulsar also seems to support the Kafka API. Has anyone taken a look at
>> Pulsar? What does the community think about this? Pulsar is also an Apache
>> project. Is there any collaboration that can happen between these two
>> projects?
>> https://streaml.io/blog/pulsar-streaming-queuing/
>> BTW, I am a Kafka user, loving Kafka a lot. Just trying to see what other
>> people think about this.
>>
>> - KN
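
Jason's idea of having the group coordinator trim an exclusively owned topic
as offsets are committed can be illustrated with a toy in-memory model. This
is only a sketch under stated assumptions: `PartitionLog`,
`ToyGroupCoordinator`, and the `exclusive_owner` field are hypothetical names
invented here, not Kafka classes or topic configs, and no real broker or
client API is called.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PartitionLog:
    """Toy in-memory stand-in for one partition of an intermediate topic."""
    log_start_offset: int = 0                  # offset of the oldest retained record
    records: list = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        # Offset that the next appended record will receive.
        return self.log_start_offset + len(self.records)

    def append(self, value) -> int:
        self.records.append(value)
        return self.log_end_offset - 1

    def delete_records_before(self, offset: int) -> int:
        """Drop every record below `offset` (the effect DeleteRecords has)."""
        # Clamp so we never move backwards or past the end of the log.
        offset = max(self.log_start_offset, min(offset, self.log_end_offset))
        del self.records[: offset - self.log_start_offset]
        self.log_start_offset = offset
        return self.log_start_offset


@dataclass
class ToyGroupCoordinator:
    """Hypothetical coordinator that trims the log when the owning group commits."""
    log: PartitionLog
    exclusive_owner: Optional[str] = None      # assumed extra topic metadata
    committed: dict = field(default_factory=dict)

    def commit(self, group: str, offset: int) -> None:
        self.committed[group] = offset
        # Only the exclusive owner's progress is allowed to trigger deletion.
        if group == self.exclusive_owner:
            self.log.delete_records_before(offset)


log = PartitionLog()
for i in range(5):
    log.append(f"record-{i}")

coordinator = ToyGroupCoordinator(log, exclusive_owner="streams-app")
coordinator.commit("some-other-group", 5)      # not the owner: nothing is deleted
coordinator.commit("streams-app", 3)           # owner: offsets 0..2 are dropped
print(log.log_start_offset, log.records)       # 3 ['record-3', 'record-4']
```

In this model retention follows consumption only for the group named in the
metadata, so any other consumer keeps Kafka's usual behaviour of retention
being independent of what it has read.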
