Hi, I don't think it would be such a great idea to start modifying the very foundation of Kafka's design to accommodate more and more extra use cases. Kafka became so widely adopted and popular because its creators made a brilliant decision to make it a "dumb broker - smart consumer" type of system, where there are few or no dependencies between Kafka brokers and Consumers. This is what makes Kafka blazingly fast and truly scalable - able to handle thousands of Consumers with no impact on performance.
One unfortunate consequence of becoming so popular is that more and more people are trying to fit Kafka into their architectures not because it really fits, but because everybody else is doing so :) This leads to many requests to add richer and richer functionality to Kafka, like transactional messages, more complex acks, centralized consumer management, etc. If you really need those features, there are other systems that are designed for that. I truly worry that if all those changes are added to core Kafka, it will become just another "do it all" enterprise-level monster that can do everything, but at the price of mediocre performance and ten-fold increased complexity (and, thus, more management overhead and more room for bugs). Sure, there has to be innovation and new features, but maybe those that require major changes to Kafka's core principles should go into separate frameworks, plug-ins (like Connectors) or something along those lines, rather than being packed into core Kafka.

Just my 2 cents :)

Marina

> -------- Original Message --------
> Subject: Re: Comparing Pulsar and Kafka: unified queuing and streaming
> Local Time: December 4, 2017 2:56 PM
> UTC Time: December 4, 2017 7:56 PM
> From: ja...@confluent.io
> To: dev@kafka.apache.org
> Kafka Users <us...@kafka.apache.org>
>
> Hi Khurrum,
>
> Thanks for sharing the article. I think one interesting aspect of Pulsar
> that stands out to me is its notion of a subscription and how it impacts
> message retention. In Kafka, consumers are more loosely coupled and
> retention is enforced independently of consumption. There are some
> scenarios I can imagine where the tighter coupling might be beneficial. For
> example, in Kafka Streams, we often use intermediate topics to store the
> data in one stage of the topology's computation.
> These topics are
> exclusively owned by the application and once the messages have been
> successfully received by the next stage, we do not need to retain them
> further. But since consumption is independent of retention, we either have
> to choose a large retention time and deal with some temporary storage waste
> or we use a low retention time and possibly lose some messages during an
> outage.
>
> We have solved this problem to some extent in Kafka by introducing an API
> to delete the records in a partition up to a certain offset, but this
> effectively puts the burden of this use case on clients. It would be
> interesting to consider whether we could do something like Pulsar in the
> Kafka broker. For example, we have a consumer group coordinator which is
> able to track the progress of the group through its committed offsets. It
> might be possible to extend it to automatically delete records in a topic
> after offsets are committed if the topic is known to be exclusively owned
> by the consumer group. We already have the DeleteRecords API that we need, so
> maybe this is "just" a matter of some additional topic metadata. I'd be
> interested to hear whether this kind of use case is common among our users.
>
> -Jason
>
> On Sun, Dec 3, 2017 at 10:29 PM, Khurrum Nasim khurrumnas...@gmail.com
> wrote:
>
>> Dear Kafka Community,
>> I happened to read this blog post comparing the messaging model between
>> Apache Pulsar and Apache Kafka. It sounds interesting. Apache Pulsar claims
>> to unify streaming (Kafka) and queuing (RabbitMQ) in one unified API.
>> Pulsar also seems to support the Kafka API. Has anyone taken a look at Pulsar?
>> What does the community think about this? Pulsar is also an Apache project.
>> Is there any collaboration that can happen between these two projects?
>> https://streaml.io/blog/pulsar-streaming-queuing/
>> BTW, I am a Kafka user, loving Kafka a lot. Just trying to see what other
>> people think about this.
>>
>> - KN
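
For readers following the thread, here is a rough sketch of the idea Jason describes in the quoted message: deleting records automatically once the group that exclusively owns a topic has committed past them. This is a toy in-memory model in Python, not Kafka code and not the Kafka client API. The names `PartitionLog`, `ToyCoordinator`, and `owner_group` are made up for illustration, and `owner_group` only stands in for the "additional topic metadata" Jason mentions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PartitionLog:
    """Toy stand-in for one partition of an intermediate topic (hypothetical)."""
    records: list = field(default_factory=list)
    log_start_offset: int = 0  # offset of the oldest record still retained

    def log_end_offset(self) -> int:
        # Offset that the next appended record would receive.
        return self.log_start_offset + len(self.records)

    def append(self, value) -> int:
        # Append a record and return the offset it was stored at.
        offset = self.log_end_offset()
        self.records.append(value)
        return offset

    def delete_records_before(self, offset: int) -> None:
        # Drop every record below `offset`; clamp to the retained range.
        offset = min(max(offset, self.log_start_offset), self.log_end_offset())
        del self.records[: offset - self.log_start_offset]
        self.log_start_offset = offset


@dataclass
class ToyCoordinator:
    """Toy group coordinator that tracks committed offsets per group."""
    log: PartitionLog
    owner_group: Optional[str] = None  # assumed metadata: the exclusive owner
    committed: dict = field(default_factory=dict)

    def commit(self, group: str, offset: int) -> None:
        # A committed offset is the next offset the group will read,
        # so everything below it has already been consumed.
        self.committed[group] = offset
        if group == self.owner_group:
            self.log.delete_records_before(offset)


log = PartitionLog()
coordinator = ToyCoordinator(log, owner_group="streams-app")
for value in ["a", "b", "c", "d", "e"]:
    log.append(value)

coordinator.commit("some-other-group", 4)  # not the owner: nothing is deleted
coordinator.commit("streams-app", 3)       # owner: offsets 0..2 are dropped
```

In this model only commits from the owning group trigger deletion, so every other consumer still sees plain time- or size-based retention. That is the coupling between consumption and retention that the thread is debating.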