Hi Khurrum,

It is ready now.
https://github.com/Landoop/stream-reactor

Regards

Andrew


From: Khurrum Nasim
Sent: Thursday, 7 December, 08:36
Subject: Re: Comparing Pulsar and Kafka: unified queuing and streaming
To: dev@kafka.apache.org
Cc: us...@kafka.apache.org


Andrew,

Thank you! Is there any estimate of when I can try out Kafka Connect
with Pulsar? Can you also point me to where I can find the
Kafka-to-Pulsar source and sink?

- KN

On Wed, Dec 6, 2017 at 2:48 AM, Andrew Stevenson wrote:

> In terms of building out the Apache Pulsar ecosystem, Landoop is
> working on porting our Kafka Connect Connectors to Pulsar's framework.
> We already have a Kafka to Pulsar source and sink.
>
> On 05/12/2017, 19:59, "Jason Gustafson" wrote:
>
> > I believe a lot of users are using the Kafka high-level consumers,
> > which is effectively an **unordered** messaging/streaming pattern.
> > People using high-level consumers don't actually need any ordering
> > guarantees. In this sense, a *shared* subscription in Apache Pulsar
> > seems to be better than Kafka's current consumer group model, as the
> > consumption rate is not limited by the number of partitions and can
> > actually grow beyond it. We do see a lot of operational pain points
> > in production coming from consumer lag, which I think is very
> > commonly seen during partition rebalancing in a consumer group.
> > Selective acking seems to provide a finer granularity of
> > acknowledgment, which can actually be good for avoiding consumer lag
> > and avoiding reprocessing messages during a partition rebalance.
>
> Yeah, I'm not sure about this. I'd be interested to understand the
> design of this feature a little better. In practice, when ordering is
> unimportant, adding partitions seems not too big of a deal. Also, I'm
> aware of active efforts to make rebalancing less of a pain point for
> our users ;)
>
> > The last question, from a user's perspective: since both Kafka and
> > Pulsar are distributed pub/sub messaging systems and both of them
> > are at the ASF, is there any possibility for these two projects to
> > collaborate, e.g. Kafka adopts Pulsar's messaging model, and Pulsar
> > can use Kafka Streams and Kafka Connect? I believe a lot of people
> > on the mailing list might have the same or a similar question. From
> > an end-user perspective, if such collaboration can happen, that is
> > going to be great for users and also the ASF. I would like to hear
> > any thoughts from Kafka committers and PMC members.
>
> I see this a little differently. Although there is some overlap
> between the projects, they have quite different underlying
> philosophies (as Marina alluded to) and I hope this will take them on
> different trajectories over time. That would ultimately benefit users
> more than having two competing projects solving all the same use
> cases. We don't need to try to cram Pulsar features into Kafka if it's
> not a good fit and vice versa. At the same time, where capabilities do
> overlap, we can try to learn from their experience and they can learn
> from ours. The example of message retention seemed like one of these
> instances since there are legitimate use cases and Pulsar's approach
> has some benefits.
>
> -Jason
>
> > On Tue, Dec 5, 2017 at 9:57 AM, Khurrum Nasim wrote:
> >
> > Hi Marina,
> >
> > On Tue, Dec 5, 2017 at 6:58 AM, Marina Popova
> > <ppine7...@protonmail.com> wrote:
> >
> > > Hi,
> > > I don't think it would be such a great idea to start modifying the
> > > very foundation of Kafka's design to accommodate more and more
> > > extra use cases. Kafka became so widely adopted and popular
> > > because its creator made a brilliant decision to make it a "dumb
> > > broker - smart consumer" type of system, where there are no to
> > > minimal dependencies between Kafka brokers and Consumers. This is
> > > what makes Kafka blazingly fast and truly scalable - able to
> > > handle thousands of Consumers with no impact on performance.
> >
> > I am not sure I agree with this. I think from an end-user
> > perspective, what users expect is an ultra-simple
> > streaming/messaging system: applications send messages, messaging
> > systems store and dispatch them, consumers consume the messages and
> > tell the systems that they have already consumed the messages. IMO
> > whether management is centralized or decentralized doesn't really
> > matter here if Kafka is able to do things without impacting
> > performance.
> >
> > Sometimes people assume that smarter brokers (like traditional
> > messaging brokers) cannot offer high throughput and scalability,
> > because they do "too many things". But I took a look at the Pulsar
> > documentation and their presentation. A few of the metrics are very
> > impressive:
> >
> > https://image.slidesharecdn.com/apachepulsar-171113225233/95/bdam-multitenant-and-georeplication-messaging-with-apache-pulsar-by-matteo-merli-sijie-guo-from-streamlio-2-638.jpg?cb=1510613990
> >
> > - 1.8 million messages/second per topic partition
> > - 99pct producing latency less than 5ms with stronger durability
> > - support millions of topics
> > - it also supports at-least-once and effectively-once producing
> >
> > Those metrics sound appealing to me if Pulsar supports both
> > streaming and queuing. I am wondering if anyone in the community has
> > tried to do performance testing or a benchmark between Pulsar and
> > Kafka. I would love to see such results that can help people
> > understand both systems, pros and cons.
> >
> > - KN
> >
> > > One unfortunate consequence of becoming so popular - is that more
> > > and more people are trying to fit Kafka into their architectures
> > > not because it really fits, but because everybody else is doing
> > > so :) And this causes many requests to support more and more
> > > richer functionality to be added to Kafka - like transactional
> > > messages, more complex acks, centralized consumer management, etc.
> > >
> > > If you really need those features - there are other systems that
> > > are designed for that.
> > >
> > > I truly worry that if all those changes are added to Core Kafka -
> > > it will become just another "do it all" enterprise-level monster
> > > that will be able to do it all but at a price of mediocre
> > > performance and ten-fold increased complexity (and, thus,
> > > management and possibility of bugs). Sure, there has to be
> > > innovation and new features added - but maybe those that require
> > > major changes to Kafka's core principles should go into separate
> > > frameworks, plug-ins (like Connectors) or something in that line,
> > > rather than packing it all into the Core Kafka.
> > >
> > > Just my 2 cents :)
> > >
> > > Marina
> > >
> > > Sent with [ProtonMail](https://protonmail.com) Secure Email.
> > >
> > > > -------- Original Message --------
> > > > Subject: Re: Comparing Pulsar and Kafka: unified queuing and
> > > > streaming
> > > > Local Time: December 4, 2017 2:56 PM
> > > > UTC Time: December 4, 2017 7:56 PM
> > > > From: ja...@confluent.io
> > > > To: dev@kafka.apache.org
> > > > Kafka Users
> > > >
> > > > Hi Khurrum,
> > > >
> > > > Thanks for sharing the article. I think one interesting aspect
> > > > of Pulsar that stands out to me is its notion of a subscription
> > > > and how it impacts message retention. In Kafka, consumers are
> > > > more loosely coupled and retention is enforced independently of
> > > > consumption. There are some scenarios I can imagine where the
> > > > tighter coupling might be beneficial. For example, in Kafka
> > > > Streams, we often use intermediate topics to store the data in
> > > > one stage of the topology's computation. These topics are
> > > > exclusively owned by the application and once the messages have
> > > > been successfully received by the next stage, we do not need to
> > > > retain them further. But since consumption is independent of
> > > > retention, we either have to choose a large retention time and
> > > > deal with some temporary storage waste or we use a low retention
> > > > time and possibly lose some messages during an outage.
> > > >
> > > > We have solved this problem to some extent in Kafka by
> > > > introducing an API to delete the records in a partition up to a
> > > > certain offset, but this effectively puts the burden of this use
> > > > case on clients. It would be interesting to consider whether we
> > > > could do something like Pulsar in the Kafka broker. For example,
> > > > we have a consumer group coordinator which is able to track the
> > > > progress of the group through its committed offsets. It might be
> > > > possible to extend it to automatically delete records in a topic
> > > > after offsets are committed if the topic is known to be
> > > > exclusively owned by the consumer group. We already have the
> > > > DeleteRecords API that we need, so maybe this is "just" a matter
> > > > of some additional topic metadata. I'd be interested to hear
> > > > whether this kind of use case is common among our users.
> > > >
> > > > -Jason
> > > >
> > > > On Sun, Dec 3, 2017 at 10:29 PM, Khurrum Nasim
> > > > khurrumnas...@gmail.com wrote:
> > > >
> > > > > Dear Kafka Community,
> > > > >
> > > > > I happened to read this blog post comparing the messaging
> > > > > model between Apache Pulsar and Apache Kafka. It sounds
> > > > > interesting. Apache Pulsar claims to unify streaming (Kafka)
> > > > > and queuing (RabbitMQ) in one unified API. Pulsar also seems
> > > > > to support the Kafka API. Has anyone taken a look at Pulsar?
> > > > > What does the community think about this? Pulsar is also an
> > > > > Apache project. Is there any collaboration that can happen
> > > > > between these two projects?
> > > > >
> > > > > https://streaml.io/blog/pulsar-streaming-queuing/
> > > > >
> > > > > BTW, I am a Kafka user, loving Kafka a lot. Just trying to see
> > > > > what other people think about this.
> > > > >
> > > > > - KN
