Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread Francois Papon

Hi Justine,

You're right, Kafka is part of my business (training, consulting,
architecture design, SLAs...), and most of the time users and customers
have said that it was hard for them to find commercial support (in France,
in my case) by searching the Kafka website (Google didn't help them).


As an ASF member and PMC member of several ASF projects, I know that this
kind of page exists, which is why I made this proposal for the Kafka
project: I really think it can help users.


As you suggest, I can submit a PR to be added to the "powered by" page.

Thanks,

François

On 11/01/2024 21:00, Justine Olshan wrote:

Hey François,

My point was that the companies on that page use Kafka as part of their
business. If you use Kafka as part of your business, feel free to submit a
PR to be added.

I second Chris's point that other projects having such a page is not enough
to require Kafka to have one.

Justine

On Thu, Jan 11, 2024 at 11:57 AM Chris Egerton 
wrote:


Hi François,

Is it an official policy of the ASF that projects provide a listing of
commercial support options for themselves? I understand that other projects
have chosen to provide one, but this doesn't necessarily imply that all
projects should do the same, and I can't say I find this point very
convincing as a rebuttal to some of the good-faith concerns raised by the
PMC and members of the community so far. However, if there's an official
ASF stance on this topic, then I acknowledge that Apache Kafka should align
with it.

Best,

Chris


On Thu, Jan 11, 2024, 14:50 fpapon  wrote:


Hi Justine,

I'm not sure to see the difference between "happy users" and vendors
that advertise their products in some of the company list in the
"powered by" page.

Btw, my initial purpose of my proposal was to help user to find support
for production stuff rather than searching in google.

I don't think this is a bad thing because this is something that already
exist in many ASF projects like:

https://hop.apache.org/community/commercial/
https://struts.apache.org/commercial-support.html
https://directory.apache.org/commercial-support.html
https://tomee.apache.org/commercial-support.html
https://plc4x.apache.org/users/commercial-support.html
https://camel.apache.org/community/support/
https://openmeetings.apache.org/commercial-support.html
https://guacamole.apache.org/support/



https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support
https://activemq.apache.org/support
https://netbeans.apache.org/front/main/help/commercial-support/
https://royale.apache.org/royale-commercial-support/
https://karaf.apache.org/community.html

As I understand for now, the channel for users to find production
support is:

- The mailing list (u...@kafka.apache.org / d...@kafka.apache.org)

- The official #kafka  ASF Slack channel (may be we can add it on the
website because I didn't find it in the website =>
https://kafka.apache.org/contact)

- Search in google for commercial support only

I can update my PR to mention only the 3 points above for the "get
support" page if people think that having a support page make sense.

regards,

François

On 11/01/2024 19:34, Justine Olshan wrote:

I think there is a difference between the "Powered by" page and a page for
vendors to advertise their products and services.

The idea is that the companies on that page are "powered by" Kafka. They
serve as examples of happy users of Kafka.
I don't think it is meant only as a place just for those companies to
advertise.

I'm a little confused by


> In this case, I'm ok to say that the commercial support section in the
> "Get support" is no need as we can use this page.

If you plan to submit for this page, please include a description on how
your company uses Kafka.

I'm happy to hear other folks' opinions on this page as well.

Thanks,
Justine



On Thu, Jan 11, 2024 at 8:57 AM fpapon  wrote:


Hi,

About the vendors list and neutrality, what is the policy of the
"Powered by" page?

https://kafka.apache.org/powered-by

We can see company with logo, some are talking about their product
(Agoora), some are offering services (Instaclustr, Aiven), and we can
also see some that just put their logo and a link to their website
without any explanation (GoldmanSachs).

So as I understand and after reading the text in the footer of this
page, every company can add themselves by providing a PR right?

"Want to appear on this page?
Submit a pull request or send a quick description of your organization
and usage to the mailing list and we'll add you."

In this case, I'm ok to say that the commercial support section in the
"Get support" is no need as we can use this page.

regards,

François


On 10/01/2024 19:03, Kenneth Eversole wrote:

I agree with Divij here and to be more pointed. I worry that if we go down
the path of adding vendors to a list it comes off as supporting their
product, not to mention could be a huge security 

Re: Mirror Maker bidirectional offset sync

2024-01-11 Thread Ryanne Dolan
> b.a.replicate-me-0

That's actually impossible with MM2. It won't allow such topics. It would
be pointless to replicate data from A back to A, or from B back to B. The
data is already there.

MM2's replication logic is significantly more advanced than just adding a
prefix, though it does appear that way for simple use cases.
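
To make the point about "b.a.replicate-me-0" concrete, here is a minimal
sketch of prefix-based remote topic naming combined with a cycle check. It
assumes the default "<source alias>.<topic>" naming and treats everything
before the first "." as a cluster alias, so dotted topic names would be
misread. The function names are hypothetical and this is not MM2's actual
code, only an illustration of the idea.

```python
SEPARATOR = "."  # assumed separator between cluster alias and topic name


def remote_topic(source_alias, topic):
    """Name the topic gets on the target cluster under prefix-based naming."""
    return f"{source_alias}{SEPARATOR}{topic}"


def topic_source(topic):
    """Alias the topic was replicated from, or None for a locally created topic."""
    prefix, sep, _ = topic.partition(SEPARATOR)
    return prefix if sep else None


def upstream_topic(topic):
    """Topic name with one replication prefix stripped, or None if there is none."""
    _, sep, rest = topic.partition(SEPARATOR)
    return rest if sep else None


def is_cycle(topic, target_alias):
    """True if the topic already originated (directly or indirectly) from the target."""
    source = topic_source(topic)
    while source is not None:
        if source == target_alias:
            return True
        topic = upstream_topic(topic)
        source = topic_source(topic)
    return False


def replicate(topic, source_alias, target_alias):
    """Return the remote topic name, or None when replication would form a cycle."""
    if is_cycle(topic, target_alias):
        return None
    return remote_topic(source_alias, topic)


# A -> B creates "a.replicate-me" on B; B -> A then skips it, because the
# data already lives on A. "b.a.replicate-me" is therefore never created on A.
print(replicate("replicate-me", "a", "b"))
print(replicate("a.replicate-me", "b", "a"))
```

With a third cluster C, the same sketch would still allow "a.replicate-me"
on B to be replicated to C as "b.a.replicate-me", since C is not in that
topic's history.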

Also note that the choice of storing offsets upstream or downstream came
later, and was motivated by asymmetric topologies like cloud migrations,
not active/active DR. Failover and failback will work with the default.
Otherwise you actually need both clusters available in order to fail over
and fail back, since the offsets would be stored upstream (which is
presumably unavailable). That sorta defeats the point of having two
clusters.

Ryanne

On Thu, Jan 11, 2024, 1:05 PM Greg Harris 
wrote:

> Hey Jeroen,
>
> Thanks for sharing your prototype! It is very interesting!
>
> > I couldn't reproduce your hypothesis.
>
> I think my hypothesis was for another setup which didn't involve code
> changes, and instead relied on A->B->A round trip replication to
> produce the "backwards" offset syncs.
> I believe this would replicate data from "replicate-me-0" to
> "b.a.replicate-me-0", and then possibly take the offsets intended for
> "b.a.replicate-me-0" and apply them to "replicate-me-0" creating the
> infinite cycle.
> I would not expect your implementation to suffer from this failure
> mode, because it's using the offset in "replicate-me-0" as the
> downstream offset, not the offset of "b.a.replicate-me-0".
>
> With your prototype, do you experience "collisions" in the
> offset-syncs topic? Since you're sharing a single offset-syncs topic
> between both replication flows, I would expect offsets for topics with
> the same names on both clusters to conflict, and cause the translation
> to happen using the opposite topic's offsets.
> It would also be visible in the state of the OffsetSyncStore here:
> [1], you can compare the normal A->B behavior before and after
> starting the B -> A source connector to see if the concurrent flows
> causes more syncs to be cleared, or the wrong syncs to be present.
>
> I think it is normal for every MM2 connector to have the same
> offset-syncs.topic.location to avoid these sorts of conflicts, so that
> each syncs topic is only used by one of the MM2 replication flows.
> I think that turning on bidirectional offset syncs will probably
> require a second producer in the MirrorSourceTask to contact the
> opposite cluster, or a second admin client in the
> MirrorCheckpointTask.
>
> > Do you think it'd be worthwhile proceeding with this?
>
> This is certainly a capability that MM2 is missing right now, and
> seems like it would be a natural component of failing consumers back
> and forth. If you see value in it, and are interested in driving the
> feature, you can open a KIP [2] to discuss the interface and design
> with the rest of the community.
>
> Thanks!
> Greg
>
> [1]
> https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/OffsetSyncStore.java#L194
> [2]
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> On Thu, Jan 11, 2024 at 9:27 AM Jeroen Schutrup
>  wrote:
> >
> > I see, makes complete sense to me. I built a custom version [1] based off
> > of Kafka 3.5.1 with bidirectional offset replication enabled so I could do
> > some more testing. Offset translation back upstream works well; I think
> > because of the reason Ryanne pointed out, both topics contain identical
> > data. Tested this by truncating the upstream topic before starting
> > replication (so the downstream/upstream topics have different offsets).
> > Truncating the upstream topic while replication is running neither results
> > in any weirdness.
> >
> > > Before starting the replication, insert a few records into
> > > `a.replicate-me` to force replicate-me-0's offset n to replicate to
> > > a.replicate-me-0's offset n+k.
> > I couldn't reproduce your hypothesis. After doing the above and then
> > starting replication I didn't see any offset replication loops. Once I
> > started producing data into the upstream topic and subscribing a
> > console-consumer on the downstream topic, offsets were translated and
> > replicated correctly back upstream. My guess is the CheckpointConnector can
> > offset these surplus of messages as the actual log offsets of the
> > downstream topic are written to the offset-sync topic.
> >
> > As this kind of active/active replication would be very beneficial to us
> > for reasons stated in my previous message, we'd love to help out building
> > this kind of offset replication into the Mirror connectors. I understand
> > this is not something that should be enabled by default, but having it
> > behind configuration toggle could help out users desiring a similar kind of
> > active/active setup and who understand the restrictions. Do you think
> 

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread fpapon

Hi Chris,

I never said that the Apache Kafka community "has to" provide this kind
of page, and it's not an official policy of the ASF.


I just listed other projects to show that this is something that already
exists, so it is potentially something that could be good for the
community of an ASF project.


My proposal is just to help the project grow and to help users find
production support, because providing that support is not the purpose
of the ASF.


If the PMC and members of the community do not agree and think this is
a bad thing for the project, I'm ok with that and I will close my PR.


regards,

François

On 11/01/2024 20:56, Chris Egerton wrote:

Hi François,

Is it an official policy of the ASF that projects provide a listing of
commercial support options for themselves? I understand that other projects
have chosen to provide one, but this doesn't necessarily imply that all
projects should do the same, and I can't say I find this point very
convincing as a rebuttal to some of the good-faith concerns raised by the
PMC and members of the community so far. However, if there's an official
ASF stance on this topic, then I acknowledge that Apache Kafka should align
with it.

Best,

Chris


On Thu, Jan 11, 2024, 14:50 fpapon  wrote:


Hi Justine,

I'm not sure to see the difference between "happy users" and vendors
that advertise their products in some of the company list in the
"powered by" page.

Btw, my initial purpose of my proposal was to help user to find support
for production stuff rather than searching in google.

I don't think this is a bad thing because this is something that already
exist in many ASF projects like:

https://hop.apache.org/community/commercial/
https://struts.apache.org/commercial-support.html
https://directory.apache.org/commercial-support.html
https://tomee.apache.org/commercial-support.html
https://plc4x.apache.org/users/commercial-support.html
https://camel.apache.org/community/support/
https://openmeetings.apache.org/commercial-support.html
https://guacamole.apache.org/support/

https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support
https://activemq.apache.org/support
https://netbeans.apache.org/front/main/help/commercial-support/
https://royale.apache.org/royale-commercial-support/

https://karaf.apache.org/community.html

As I understand for now, the channel for users to find production
support is:

- The mailing list (u...@kafka.apache.org / d...@kafka.apache.org)

- The official #kafka  ASF Slack channel (may be we can add it on the
website because I didn't find it in the website =>
https://kafka.apache.org/contact)

- Search in google for commercial support only

I can update my PR to mention only the 3 points above for the "get
support" page if people think that having a support page make sense.

regards,

François

On 11/01/2024 19:34, Justine Olshan wrote:

I think there is a difference between the "Powered by" page and a page for
vendors to advertise their products and services.

The idea is that the companies on that page are "powered by" Kafka. They
serve as examples of happy users of Kafka.
I don't think it is meant only as a place just for those companies to
advertise.

I'm a little confused by


> In this case, I'm ok to say that the commercial support section in the
> "Get support" is no need as we can use this page.

If you plan to submit for this page, please include a description on how
your company uses Kafka.

I'm happy to hear other folks' opinions on this page as well.

Thanks,
Justine



On Thu, Jan 11, 2024 at 8:57 AM fpapon  wrote:


Hi,

About the vendors list and neutrality, what is the policy of the
"Powered by" page?

https://kafka.apache.org/powered-by

We can see company with logo, some are talking about their product
(Agoora), some are offering services (Instaclustr, Aiven), and we can
also see some that just put their logo and a link to their website
without any explanation (GoldmanSachs).

So as I understand and after reading the text in the footer of this
page, every company can add themselves by providing a PR right?

"Want to appear on this page?
Submit a pull request or send a quick description of your organization
and usage to the mailing list and we'll add you."

In this case, I'm ok to say that the commercial support section in the
"Get support" is no need as we can use this page.

regards,

François


On 10/01/2024 19:03, Kenneth Eversole wrote:

I agree with Divij here and to be more pointed. I worry that if we go down
the path of adding vendors to a list it comes off as supporting their
product, not to mention could be a huge security risk for novice users. I
would rather this be a callout to other purely open source tooling, such as
cruise control.

Divij brings up a good question
1.  What value does the addition of this page bring to the users of Apache
Kafka?

I think the community would be better served by a more synchronous
line of 

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread Chris Egerton
Hi François,

Is it an official policy of the ASF that projects provide a listing of
commercial support options for themselves? I understand that other projects
have chosen to provide one, but this doesn't necessarily imply that all
projects should do the same, and I can't say I find this point very
convincing as a rebuttal to some of the good-faith concerns raised by the
PMC and members of the community so far. However, if there's an official
ASF stance on this topic, then I acknowledge that Apache Kafka should align
with it.

Best,

Chris


On Thu, Jan 11, 2024, 14:50 fpapon  wrote:

> Hi Justine,
>
> I'm not sure to see the difference between "happy users" and vendors
> that advertise their products in some of the company list in the
> "powered by" page.
>
> Btw, my initial purpose of my proposal was to help user to find support
> for production stuff rather than searching in google.
>
> I don't think this is a bad thing because this is something that already
> exist in many ASF projects like:
>
> https://hop.apache.org/community/commercial/
> https://struts.apache.org/commercial-support.html
> https://directory.apache.org/commercial-support.html
> https://tomee.apache.org/commercial-support.html
> https://plc4x.apache.org/users/commercial-support.html
> https://camel.apache.org/community/support/
> https://openmeetings.apache.org/commercial-support.html
> https://guacamole.apache.org/support/
>
> https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support
> https://activemq.apache.org/support
> https://netbeans.apache.org/front/main/help/commercial-support/
> https://royale.apache.org/royale-commercial-support/
>
> https://karaf.apache.org/community.html
>
> As I understand for now, the channel for users to find production
> support is:
>
> - The mailing list (u...@kafka.apache.org / d...@kafka.apache.org)
>
> - The official #kafka  ASF Slack channel (may be we can add it on the
> website because I didn't find it in the website =>
> https://kafka.apache.org/contact)
>
> - Search in google for commercial support only
>
> I can update my PR to mention only the 3 points above for the "get
> support" page if people think that having a support page make sense.
>
> regards,
>
> François
>
> On 11/01/2024 19:34, Justine Olshan wrote:
> > I think there is a difference between the "Powered by" page and a page for
> > vendors to advertise their products and services.
> >
> > The idea is that the companies on that page are "powered by" Kafka. They
> > serve as examples of happy users of Kafka.
> > I don't think it is meant only as a place just for those companies to
> > advertise.
> >
> > I'm a little confused by
> >
> >> In this case, I'm ok to say that the commercial support section in the
> >> "Get support" is no need as we can use this page.
> >
> > If you plan to submit for this page, please include a description on how
> > your company uses Kafka.
> >
> > I'm happy to hear other folks' opinions on this page as well.
> >
> > Thanks,
> > Justine
> >
> >
> >
> > On Thu, Jan 11, 2024 at 8:57 AM fpapon  wrote:
> >
> >> Hi,
> >>
> >> About the vendors list and neutrality, what is the policy of the
> >> "Powered by" page?
> >>
> >> https://kafka.apache.org/powered-by
> >>
> >> We can see company with logo, some are talking about their product
> >> (Agoora), some are offering services (Instaclustr, Aiven), and we can
> >> also see some that just put their logo and a link to their website
> >> without any explanation (GoldmanSachs).
> >>
> >> So as I understand and after reading the text in the footer of this
> >> page, every company can add themselves by providing a PR right?
> >>
> >> "Want to appear on this page?
> >> Submit a pull request or send a quick description of your organization
> >> and usage to the mailing list and we'll add you."
> >>
> >> In this case, I'm ok to say that the commercial support section in the
> >> "Get support" is no need as we can use this page.
> >>
> >> regards,
> >>
> >> François
> >>
> >>
> >> On 10/01/2024 19:03, Kenneth Eversole wrote:
> >>> I agree with Divij here and to be more pointed. I worry that if we go down
> >>> the path of adding vendors to a list it comes off as supporting their
> >>> product, not to mention could be a huge security risk for novice users. I
> >>> would rather this be a callout to other purely open source tooling, such as
> >>> cruise control.
> >>>
> >>> Divij brings up a good question
> >>> 1.  What value does the addition of this page bring to the users of Apache
> >>> Kafka?
> >>>
> >>> I think the community would be better served by a more synchronous
> >>> line of communication such as Slack/Discord and we call that out here. It
> >>> would be more in line with other major open source projects.
> >>>
> >>> ---
> >>> Kenneth Eversole
> >>>
> >>> On Wed, Jan 10, 2024 at 10:30 AM Divij Vaidya
> >>> wrote:
> >>>
>  I don't see a need for this. What 

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread fpapon

Hi Justine,

I'm not sure I see the difference between "happy users" and vendors
that advertise their products among the companies listed on the
"powered by" page.


Btw, the initial purpose of my proposal was to help users find
production support without having to search on Google.


I don't think this is a bad thing, because this is something that already
exists in many ASF projects, like:


https://hop.apache.org/community/commercial/
https://struts.apache.org/commercial-support.html
https://directory.apache.org/commercial-support.html
https://tomee.apache.org/commercial-support.html
https://plc4x.apache.org/users/commercial-support.html
https://camel.apache.org/community/support/
https://openmeetings.apache.org/commercial-support.html
https://guacamole.apache.org/support/
https://cwiki.apache.org/confluence/display/HADOOP2/Distributions+and+Commercial+Support
https://activemq.apache.org/support
https://netbeans.apache.org/front/main/help/commercial-support/
https://royale.apache.org/royale-commercial-support/

https://karaf.apache.org/community.html

As I understand it, for now the channels for users to find production
support are:


- The mailing list (u...@kafka.apache.org / d...@kafka.apache.org)

- The official #kafka ASF Slack channel (maybe we can add it to the
website, because I didn't find it there =>
https://kafka.apache.org/contact)


- A Google search (for commercial support only)

I can update my PR to mention only the 3 points above on the "get
support" page if people think that having a support page makes sense.


regards,

François

On 11/01/2024 19:34, Justine Olshan wrote:

I think there is a difference between the "Powered by" page and a page for
vendors to advertise their products and services.

The idea is that the companies on that page are "powered by" Kafka. They
serve as examples of happy users of Kafka.
I don't think it is meant only as a place just for those companies to
advertise.

I'm a little confused by


> In this case, I'm ok to say that the commercial support section in the
> "Get support" is no need as we can use this page.

If you plan to submit for this page, please include a description on how
your company uses Kafka.

I'm happy to hear other folks' opinions on this page as well.

Thanks,
Justine



On Thu, Jan 11, 2024 at 8:57 AM fpapon  wrote:


Hi,

About the vendors list and neutrality, what is the policy of the
"Powered by" page?

https://kafka.apache.org/powered-by

We can see company with logo, some are talking about their product
(Agoora), some are offering services (Instaclustr, Aiven), and we can
also see some that just put their logo and a link to their website
without any explanation (GoldmanSachs).

So as I understand and after reading the text in the footer of this
page, every company can add themselves by providing a PR right?

"Want to appear on this page?
Submit a pull request or send a quick description of your organization
and usage to the mailing list and we'll add you."

In this case, I'm ok to say that the commercial support section in the
"Get support" is no need as we can use this page.

regards,

François


On 10/01/2024 19:03, Kenneth Eversole wrote:

I agree with Divij here and to be more pointed. I worry that if we go down
the path of adding vendors to a list it comes off as supporting their
product, not to mention could be a huge security risk for novice users. I
would rather this be a callout to other purely open source tooling, such as
cruise control.

Divij brings up a good question
1.  What value does the addition of this page bring to the users of Apache
Kafka?

I think the community would be better served by a more synchronous
line of communication such as Slack/Discord and we call that out here. It
would be more in line with other major open source projects.

---
Kenneth Eversole

On Wed, Jan 10, 2024 at 10:30 AM Divij Vaidya 
wrote:


I don't see a need for this. What additional information does this provide
over what can be found via a quick google search?

My primary concern is that we are getting in the business of listing
vendors in the project site which brings its own complications without
adding much additional value for users. In the spirit of being vendor
neutral, I would try to avoid this as much as possible.

So, my question to you is:
1. What value does the addition of this page bring to the users of Apache
Kafka?
2. When a new PR is submitted to add a vendor, what criteria do we have to
decide whether to add them or not? If we keep a blanket criteria of
accepting all PRs, then we may end up in a situation where the link
redirects to a phishing page or nefarious website. Hence, we might have to
at least perform some basic due diligence which adds overhead to the
resources of the community.

--
Divij Vaidya



On Wed, Jan 10, 2024 at 5:00 PM fpapon  wrote:


Hi,

After starting a first thread on this topic (

Re: Mirror Maker bidirectional offset sync

2024-01-11 Thread Greg Harris
Hey Jeroen,

Thanks for sharing your prototype! It is very interesting!

> I couldn't reproduce your hypothesis.

I think my hypothesis was for another setup which didn't involve code
changes, and instead relied on A->B->A round trip replication to
produce the "backwards" offset syncs.
I believe this would replicate data from "replicate-me-0" to
"b.a.replicate-me-0", and then possibly take the offsets intended for
"b.a.replicate-me-0" and apply them to "replicate-me-0" creating the
infinite cycle.
I would not expect your implementation to suffer from this failure
mode, because it's using the offset in "replicate-me-0" as the
downstream offset, not the offset of "b.a.replicate-me-0".

With your prototype, do you experience "collisions" in the
offset-syncs topic? Since you're sharing a single offset-syncs topic
between both replication flows, I would expect offsets for topics with
the same names on both clusters to conflict, and cause the translation
to happen using the opposite topic's offsets.
It would also be visible in the state of the OffsetSyncStore here: [1].
You can compare the normal A->B behavior before and after starting the
B->A source connector to see if the concurrent flows cause more syncs
to be cleared, or the wrong syncs to be present.

I think it is normal for every MM2 connector to have the same
offset-syncs.topic.location to avoid these sorts of conflicts, so that
each syncs topic is only used by one of the MM2 replication flows.
I think that turning on bidirectional offset syncs will probably
require a second producer in the MirrorSourceTask to contact the
opposite cluster, or a second admin client in the
MirrorCheckpointTask.
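
As a rough illustration of why a consistent offset-syncs.topic.location
keeps the two flows apart, here is a small sketch. It assumes the default
naming "mm2-offset-syncs.<other cluster alias>.internal", with the topic
living on the source cluster for location=source and on the target cluster
for location=target. The helper names are hypothetical and the code is not
taken from MM2.

```python
from collections import defaultdict


def offset_syncs_topic(source_alias, target_alias, location="source"):
    """Return (hosting cluster, topic name) of the syncs topic for one flow."""
    if location == "source":
        return source_alias, f"mm2-offset-syncs.{target_alias}.internal"
    if location == "target":
        return target_alias, f"mm2-offset-syncs.{source_alias}.internal"
    raise ValueError(f"unknown location: {location!r}")


def shared_syncs_topics(flows):
    """Map each syncs topic used by more than one flow to the flows using it."""
    users = defaultdict(list)
    for source_alias, target_alias, location in flows:
        key = offset_syncs_topic(source_alias, target_alias, location)
        users[key].append((source_alias, target_alias))
    return {key: used_by for key, used_by in users.items() if len(used_by) > 1}


# Same location on both flows: each flow gets its own syncs topic.
print(shared_syncs_topics([("a", "b", "source"), ("b", "a", "source")]))

# Mixed locations: both flows end up writing to the same topic on cluster A.
print(shared_syncs_topics([("a", "b", "source"), ("b", "a", "target")]))
```

Under these assumptions the mixed case is the one where syncs for
same-named topics from both directions could land in one topic.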

> Do you think it'd be worthwhile proceeding with this?

This is certainly a capability that MM2 is missing right now, and
seems like it would be a natural component of failing consumers back
and forth. If you see value in it, and are interested in driving the
feature, you can open a KIP [2] to discuss the interface and design
with the rest of the community.

Thanks!
Greg

[1] 
https://github.com/apache/kafka/blob/2c6fb6c54472e90ae17439e62540ef3cb0426fe3/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/OffsetSyncStore.java#L194
[2] 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

On Thu, Jan 11, 2024 at 9:27 AM Jeroen Schutrup
 wrote:
>
> I see, makes complete sense to me. I built a custom version [1] based off
> of Kafka 3.5.1 with bidirectional offset replication enabled so I could do
> some more testing. Offset translation back upstream works well; I think
> because of the reason Ryanne pointed out, both topics contain identical
> data. Tested this by truncating the upstream topic before starting
> replication (so the downstream/upstream topics have different offsets).
> Truncating the upstream topic while replication is running neither results
> in any weirdness.
>
> > Before starting the replication, insert a few records into
> `a.replicate-me` to force replicate-me-0's offset n to replicate to
> a.replicate-me-0's offset n+k.
> I couldn't reproduce your hypothesis. After doing the above and then
> starting replication I didn't see any offset replication loops. Once I
> started producing data into the upstream topic and subscribing a
> console-consumer on the downstream topic, offsets were translated and
> replicated correctly back upstream. My guess is the CheckpointConnector can
> offset these surplus of messages as the actual log offsets of the
> downstream topic are written to the offset-sync topic.
>
> As this kind of active/active replication would be very beneficial to us
> for reasons stated in my previous message, we'd love to help out building
> this kind of offset replication into the Mirror connectors. I understand
> this is not something that should be enabled by default, but having it
> behind configuration toggle could help out users desiring a similar kind of
> active/active setup and who understand the restrictions. Do you think it'd
> be worthwhile proceeding with this?
>
> [1]
> https://github.com/jeroen92/kafka/commit/1a27696ec6777c230f100cf9887368c431ebe0f8
>
> On Thu, Jan 11, 2024 at 1:06 AM Greg Harris 
> wrote:
>
> > Hi Jeroen,
> >
> > I'm glad you're experimenting with MM2, and I hope we can give you
> > some more context to explain what you're seeing.
> >
> > > I wrote a small program to produce these offset syncs for the prefixed
> > > topic, and this successfully triggers the Checkpoint connector to start
> > > replicating the consumer offsets back to the primary cluster.
> >
> > This is interesting, and I wouldn't have expected it to work.
> >
> > To rewind, each flow Source->Target has a MirrorSourceConnector, an
> > Offset Syncs Topic, and a MirrorCheckpointConnector. With both
> > directions enabled, there are two separate flows each with Source,
> > Syncs topic, and Checkpoint.
> > With offset-syncs.topic.location=source, the
> > mm2-offset-syncs.b.internal on the A cluster is 

[VOTE] 3.7.0 RC2

2024-01-11 Thread Stanislav Kozlovski
Hello Kafka users, developers, and client-developers,

This is the first candidate for release of Apache Kafka 3.7.0.

Note it's named "RC2" because I had a few "failed" RCs that I had
cut/uploaded but ultimately had to scrap prior to announcing due to new
blockers arriving before I could even announce them.

Further - I haven't yet been able to set up the system tests successfully.
And the integration/unit tests do have a few failures that I have to spend
time triaging. I would appreciate any help in case anyone notices any tests
failing that they're subject matter experts in. Expect me to follow up in
a day or two with more detailed analysis.

Major changes include:
- Early Access to KIP-848 - the next generation of the consumer rebalance
protocol
- KIP-858: Adding JBOD support to KRaft
- KIP-714: Observability into Client metrics via a standardized interface

Check more information in the WIP blog post:
https://github.com/apache/kafka-site/pull/578

Release notes for the 3.7.0 release:
https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html

*** Please download, test and vote by Thursday, January 18, 9am PT ***

Usually these deadlines tend to be 2-3 days, but due to this being the
first RC and the tests not having run yet, I am giving it a bit more time.

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/

* Docker release artifact to be voted upon:
apache/kafka:3.7.0-rc2

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/

* Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag:
https://github.com/apache/kafka/releases/tag/3.7.0-rc2

* Documentation:
https://kafka.apache.org/37/documentation.html

* Protocol:
https://kafka.apache.org/37/protocol.html

* Successful Jenkins builds for the 3.7 branch:
Unit/integration tests:
https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
There are failing tests here. I have to follow up with triaging some of the
failures and figuring out if they're actual problems or simply flakes.

System tests: https://jenkins.confluent.io/job/system-test-kafka/job/3.7/

No successful system test runs yet. I am working on getting the job to run.

* Successful Docker Image Github Actions Pipeline for 3.7 branch:
Attached are the scan_report and report_jvm output files from the Docker
Build run:
https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673

And the final docker image build job - Docker Build Test Pipeline:
https://github.com/apache/kafka/actions/runs/7486178277

The image is apache/kafka:3.7.0-rc2 -
https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore
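
For anyone checking the downloaded artifacts, here is a minimal Python sketch of comparing a file against its published SHA-512 checksum. It is an illustration only and not part of the release process described in this email: the function names are made up, and it assumes the checksum text is either a bare hex digest or the "filename: HEX HEX ..." layout that `gpg --print-md SHA512` prints. It does not replace verifying the PGP signature against the KEYS file.

```python
import hashlib
import re


def sha512_of(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so large release tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def normalize_checksum(text):
    """Drop an optional 'filename:' prefix, all whitespace, and letter case."""
    _, _, tail = text.rpartition(":")
    return re.sub(r"\s+", "", tail).lower()


def checksum_matches(path, checksum_text):
    """True if the file's digest equals the (normalized) published checksum."""
    return sha512_of(path) == normalize_checksum(checksum_text)
```

Usage would be along the lines of `checksum_matches("some-artifact.tgz", open("some-artifact.tgz.sha512").read())`, where both file names are placeholders for whatever was downloaded.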


Thanks,
Stanislav Kozlovski

kafka/test:test (alpine 3.18.5)
===
Total: 0 (HIGH: 0, CRITICAL: 0)



Re: Mirror Maker bidirectional offset sync

2024-01-11 Thread Jeroen Schutrup
I see, that makes complete sense to me. I built a custom version [1] based on
Kafka 3.5.1 with bidirectional offset replication enabled so I could do
some more testing. Offset translation back upstream works well; I think
this is for the reason Ryanne pointed out: both topics contain identical
data. I tested this by truncating the upstream topic before starting
replication (so the downstream/upstream topics have different offsets).
Truncating the upstream topic while replication is running does not result
in any weirdness either.

> Before starting the replication, insert a few records into
`a.replicate-me` to force replicate-me-0's offset n to replicate to
a.replicate-me-0's offset n+k.
I couldn't reproduce your hypothesis. After doing the above and then
starting replication I didn't see any offset replication loops. Once I
started producing data into the upstream topic and subscribing a
console-consumer on the downstream topic, offsets were translated and
replicated correctly back upstream. My guess is the CheckpointConnector can
account for this surplus of messages because the actual log offsets of the
downstream topic are written to the offset-sync topic.
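
To make that guess concrete, here is a minimal Python sketch of translating a downstream consumer offset back upstream when the downstream topic already held a few extra records. This is a toy model and not the MirrorMaker 2 code: the names `replicate` and `translate_upstream` are hypothetical, it emits one sync per record (an assumption made for simplicity), and it ignores redelivery, compaction, and transaction markers.

```python
from typing import List, Optional, Tuple

# One offset sync: (upstream_offset, downstream_offset) for the same record.
Sync = Tuple[int, int]


def replicate(upstream_count: int, downstream_start: int) -> List[Sync]:
    """Copy records 0..upstream_count-1 into a downstream log that already
    holds `downstream_start` records, recording the actual log offsets."""
    return [(u, downstream_start + u) for u in range(upstream_count)]


def translate_upstream(syncs: List[Sync], downstream_offset: int) -> Optional[int]:
    """Map a consumer offset on the downstream topic back to the upstream topic.

    Uses the latest sync at or before the offset and never steps more than
    one record past it. Returns None if no sync is old enough.
    """
    usable = [s for s in syncs if s[1] <= downstream_offset]
    if not usable:
        return None
    up, down = max(usable, key=lambda s: s[1])
    return up if down == downstream_offset else up + 1


# Three surplus records downstream, so upstream offset n sits at downstream n+3.
syncs = replicate(upstream_count=5, downstream_start=3)
print(translate_upstream(syncs, 5))  # prints 2
```

Because each sync stores both offsets as they appear in the two logs, the surplus k never has to be known explicitly; it is implied by the pair.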

As this kind of active/active replication would be very beneficial to us
for the reasons stated in my previous message, we'd love to help build
this kind of offset replication into the Mirror connectors. I understand
this is not something that should be enabled by default, but having it
behind a configuration toggle could help users who want a similar
active/active setup and who understand the restrictions. Do you think it
would be worthwhile to proceed with this?

[1]
https://github.com/jeroen92/kafka/commit/1a27696ec6777c230f100cf9887368c431ebe0f8

On Thu, Jan 11, 2024 at 1:06 AM Greg Harris 
wrote:

> Hi Jeroen,
>
> I'm glad you're experimenting with MM2, and I hope we can give you
> some more context to explain what you're seeing.
>
> > I wrote a small program to produce these offset syncs for the prefixed
> > topic, and this successfully triggers the Checkpoint connector to start
> > replicating the consumer offsets back to the primary cluster.
>
> This is interesting, and I wouldn't have expected it to work.
>
> To rewind, each flow Source->Target has a MirrorSourceConnector, an
> Offset Syncs Topic, and a MirrorCheckpointConnector. With both
> directions enabled, there are two separate flows each with Source,
> Syncs topic, and Checkpoint.
> With offset-syncs.topic.location=source, the
> mm2-offset-syncs.b.internal on the A cluster is used for the A -> B
> replication flow. It contains topic names from cluster A, and the
> corresponding offsets those records were written to on the B cluster.
> When translation is performed, the consumer groups from A are
> replicated to the B cluster, and the replication mapping (cluster
> prefix) is added.
> Using your syncs topic as an example,
> OffsetSync{topicPartition=replicate-me-0, upstreamOffset=28,
> downstreamOffset=28} will be used to write offsets for
> "a.replicate-me-0" for the equivalent group on the B cluster.
>
> When your artificial sync OffsetSync{topicPartition=a.replicate-me-0,
> upstreamOffset=29, downstreamOffset=29} is processed, it should be
> used to write offsets for "a.a.replicate-me-0" but it actually writes
> offsets to "replicate-me-0" due to this function that I've never
> encountered before: [1].
> I think you could get those sorts of syncs into the syncs-topic if you
> had A->B configured with offset-syncs.topic.location=source, and B->A
> with offset-syncs.topic.location=target, and configured the topic
> filter to do A -> B -> A round trip replication.
>
> This appears to work as expected if there are no failures or restarts,
> but as soon as a record is re-delivered in either flow, I think the
> offsets should end up constantly advancing in an infinite loop. Maybe
> you can try that: Before starting the replication, insert a few
> records into `a.replicate-me` to force replicate-me-0's offset n to
> replicate to a.replicate-me-0's offset n+k.
>
> Ryanne, do you recall the purpose of the renameTopicPartition
> function? To me it looks like it could only be harmful, as it renames
> checkpoints to target topics that MirrorMaker2 isn't writing. It also
> looks like it isn't active in a typical MM2 setup.
>
> Thanks!
> Greg
>
> [1]:
> https://github.com/apache/kafka/blob/13a83d58f897de2f55d8d3342ffb058b230a9183/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointTask.java#L257-L267
>
> On Tue, Jan 9, 2024 at 5:54 AM Jeroen Schutrup
>  wrote:
> >
> > Thank you both for your swift responses!
> >
> > Ryanne, the MirrorConnectorsIntegrationBaseTest only tests offset
> > replication in cases where the producer migrated to the secondary cluster
> > as well, starts feeding messages into the non-prefixed topic which are
> > subsequently consumed by the consumer on the secondary cluster. After the
> > fallback, it asserts the consumer offsets on the non-prefixed topic in

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread fpapon

Hi,

About the vendors list and neutrality, what is the policy of the 
"Powered by" page?


https://kafka.apache.org/powered-by

We can see companies with logos: some talk about their product 
(Agoora), some offer services (Instaclustr, Aiven), and some 
just put their logo and a link to their website 
without any explanation (GoldmanSachs).


So, as I understand it after reading the text in the footer of this 
page, any company can add itself by submitting a PR, right?


"Want to appear on this page?
Submit a pull request or send a quick description of your organization 
and usage to the mailing list and we'll add you."


In this case, I'm OK to say that the commercial support section in the 
"Get support" page is not needed, as we can use this page.


regards,

François


On 10/01/2024 19:03, Kenneth Eversole wrote:

I agree with Divij here and, to be more pointed, I worry that if we go down
the path of adding vendors to a list, it comes off as supporting their
product, not to mention that it could be a huge security risk for novice users. I
would rather this be a callout to other purely open source tooling, such as
Cruise Control.

Divij brings up a good question:
1.  What value does the addition of this page bring to the users of Apache
Kafka?

I think the community would be better served by a more synchronous
line of communication, such as Slack/Discord, that we call out here. It
would be more in line with other major open source projects.

---
Kenneth Eversole

On Wed, Jan 10, 2024 at 10:30 AM Divij Vaidya 
wrote:


I don't see a need for this. What additional information does this provide
over what can be found via a quick google search?

My primary concern is that we are getting into the business of listing
vendors on the project site, which brings its own complications without
adding much additional value for users. In the spirit of being vendor
neutral, I would try to avoid this as much as possible.

So, my questions to you are:
1. What value does the addition of this page bring to the users of Apache
Kafka?
2. When a new PR is submitted to add a vendor, what criteria do we have to
decide whether to add them or not? If we keep a blanket policy of
accepting all PRs, then we may end up in a situation where the link
redirects to a phishing page or nefarious website. Hence, we might have to
at least perform some basic due diligence, which adds overhead to the
resources of the community.

--
Divij Vaidya



On Wed, Jan 10, 2024 at 5:00 PM fpapon  wrote:


Hi,

After starting a first thread on this topic (
https://lists.apache.org/thread/kkox33rhtjcdr5zztq3lzj7c5s7k9wsr), I
would like to propose a PR:

https://github.com/apache/kafka-site/pull/577

The purpose of this proposal is to help users find support for SLAs,
training, consulting, and anything else that is not provided by the
community since, as we can already see in many ASF projects, no commercial
support is provided by the foundation. I think it could help with the
adoption and the growth of the project because users need commercial
support for production issues.

If the community agrees with this idea and wants to move forward: I have
added just one company in the PR, but anybody can add more by submitting a
new PR to complete the list. If you want me to add others, you can reply to
this thread, because it would be better to have several companies at the
first publication of the page.

Just provide the company name and a short description of the services
offered around Apache Kafka. The information must be factual and
informational in nature and not be a marketing statement.

regards,

François




--
--
François



Re: Mirror Maker bidirectional offset sync

2024-01-11 Thread Greg Harris
Hey Ryanne,

Thanks for the context, but I still don't see the situation where this
function is helpful.

Also "A's topic1 and B's a.topic1 should be the same data (minus
replication lag)." isn't true in the presence of failures/hard
restarts, compaction, and transaction markers.
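
To illustrate the point, here is a small Python sketch of how offsets drift apart. It is a toy model and not MirrorMaker 2 code: `mirror` is a hypothetical helper, a redelivery is modeled as simply writing the same record twice, and gaps in the upstream offsets stand in for compacted records or transaction markers.

```python
from typing import Collection, Dict, Iterable


def mirror(upstream_offsets: Iterable[int],
           redelivered: Collection[int] = ()) -> Dict[int, int]:
    """Copy records into an empty downstream log.

    Returns {upstream_offset: downstream_offset of the last copy written}.
    Offsets listed in `redelivered` are written twice, as after a hard restart.
    """
    mapping: Dict[int, int] = {}
    next_downstream = 0
    for upstream in upstream_offsets:
        copies = 2 if upstream in redelivered else 1
        for _ in range(copies):
            mapping[upstream] = next_downstream
            next_downstream += 1
    return mapping


# No failures, no gaps: offsets line up one to one.
print(mirror(range(4)))               # {0: 0, 1: 1, 2: 2, 3: 3}
# Record 1 is redelivered: everything after it is shifted downstream.
print(mirror(range(4), {1}))          # {0: 0, 1: 2, 2: 3, 3: 4}
# Upstream offsets 1 and 2 are gone (compacted or markers): downstream is denser.
print(mirror([0, 3, 4]))              # {0: 0, 3: 1, 4: 2}
```

In the second and third cases the same data is present on both sides, yet a consumer offset on one topic cannot be reused as-is on the other.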

Thanks,
Greg

On Wed, Jan 10, 2024 at 8:00 PM Ryanne Dolan  wrote:
>
> > do you recall the purpose of [...] renameTopicPartition [?]
>
> A's topic1 and B's a.topic1 should be the same data (minus replication
> lag). You can't consume a record in a.topic1 that hasn't been replicated
> yet -- a remote topic by definition does not have any records that MM2
> didn't put there. So an offset for a consumer consuming from B's a.topic1
> can be translated back to an offset in A's topic1, where the data came from.
>
> Ryanne
>
> On Wed, Jan 10, 2024, 6:07 PM Greg Harris 
> wrote:
>
> > Hi Jeroen,
> >
> > I'm glad you're experimenting with MM2, and I hope we can give you
> > some more context to explain what you're seeing.
> >
> > > I wrote a small program to produce these offset syncs for the prefixed
> > > topic, and this successfully triggers the Checkpoint connector to start
> > > replicating the consumer offsets back to the primary cluster.
> >
> > This is interesting, and I wouldn't have expected it to work.
> >
> > To rewind, each flow Source->Target has a MirrorSourceConnector, an
> > Offset Syncs Topic, and a MirrorCheckpointConnector. With both
> > directions enabled, there are two separate flows each with Source,
> > Syncs topic, and Checkpoint.
> > With offset-syncs.topic.location=source, the
> > mm2-offset-syncs.b.internal on the A cluster is used for the A -> B
> > replication flow. It contains topic names from cluster A, and the
> > corresponding offsets those records were written to on the B cluster.
> > When translation is performed, the consumer groups from A are
> > replicated to the B cluster, and the replication mapping (cluster
> > prefix) is added.
> > Using your syncs topic as an example,
> > OffsetSync{topicPartition=replicate-me-0, upstreamOffset=28,
> > downstreamOffset=28} will be used to write offsets for
> > "a.replicate-me-0" for the equivalent group on the B cluster.
> >
> > When your artificial sync OffsetSync{topicPartition=a.replicate-me-0,
> > upstreamOffset=29, downstreamOffset=29} is processed, it should be
> > used to write offsets for "a.a.replicate-me-0" but it actually writes
> > offsets to "replicate-me-0" due to this function that I've never
> > encountered before: [1].
> > I think you could get those sorts of syncs into the syncs-topic if you
> > had A->B configured with offset-syncs.topic.location=source, and B->A
> > with offset-syncs.topic.location=target, and configured the topic
> > filter to do A -> B -> A round trip replication.
> >
> > This appears to work as expected if there are no failures or restarts,
> > but as soon as a record is re-delivered in either flow, I think the
> > offsets should end up constantly advancing in an infinite loop. Maybe
> > you can try that: Before starting the replication, insert a few
> > records into `a.replicate-me` to force replicate-me-0's offset n to
> > replicate to a.replicate-me-0's offset n+k.
> >
> > Ryanne, do you recall the purpose of the renameTopicPartition
> > function? To me it looks like it could only be harmful, as it renames
> > checkpoints to target topics that MirrorMaker2 isn't writing. It also
> > looks like it isn't active in a typical MM2 setup.
> >
> > Thanks!
> > Greg
> >
> > [1]:
> > https://github.com/apache/kafka/blob/13a83d58f897de2f55d8d3342ffb058b230a9183/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointTask.java#L257-L267
> >
> > On Tue, Jan 9, 2024 at 5:54 AM Jeroen Schutrup
> >  wrote:
> > >
> > > Thank you both for your swift responses!
> > >
> > > Ryanne, the MirrorConnectorsIntegrationBaseTest only tests offset
> > > replication in cases where the producer migrated to the secondary cluster
> > > as well, starts feeding messages into the non-prefixed topic which are
> > > subsequently consumed by the consumer on the secondary cluster. After the
> > > fallback, it asserts the consumer offsets on the non-prefixed topic in
> > the
> > > secondary cluster are translated and replicated to the consumer offsets
> > of
> > > the prefixed topic in the primary cluster.
> > > In my example, the producer keeps producing in the primary cluster
> > whereas
> > > only the consumer fails over to the secondary cluster and, after some
> > time
> > > fails back to the primary cluster. This consumer will then consume
> > messages
> > > from the prefixed topic in the secondary cluster, and I'd like to have
> > > those offsets replicated back to the non-prefixed topic in the primary
> > > cluster. If you like I can provide an illustration if that helps to
> > clarify
> > > this use case.
> > >
> > > To add some context on why I'd like to have this is to retain loose
> > > coupling between producers and consumers 

Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread Kenneth Eversole
I agree with Divij here and, to be more pointed, I worry that if we go down
the path of adding vendors to a list, it comes off as supporting their
product, not to mention that it could be a huge security risk for novice users. I
would rather this be a callout to other purely open source tooling, such as
Cruise Control.

Divij brings up a good question:
1.  What value does the addition of this page bring to the users of Apache
Kafka?

I think the community would be better served by a more synchronous
line of communication, such as Slack/Discord, that we call out here. It
would be more in line with other major open source projects.

---
Kenneth Eversole

On Wed, Jan 10, 2024 at 10:30 AM Divij Vaidya 
wrote:

> I don't see a need for this. What additional information does this provide
> over what can be found via a quick google search?
>
> My primary concern is that we are getting into the business of listing
> vendors on the project site, which brings its own complications without
> adding much additional value for users. In the spirit of being vendor
> neutral, I would try to avoid this as much as possible.
>
> So, my questions to you are:
> 1. What value does the addition of this page bring to the users of Apache
> Kafka?
> 2. When a new PR is submitted to add a vendor, what criteria do we have to
> decide whether to add them or not? If we keep a blanket policy of
> accepting all PRs, then we may end up in a situation where the link
> redirects to a phishing page or nefarious website. Hence, we might have to
> at least perform some basic due diligence, which adds overhead to the
> resources of the community.
>
> --
> Divij Vaidya
>
>
>
> On Wed, Jan 10, 2024 at 5:00 PM fpapon  wrote:
>
> > Hi,
> >
> > After starting a first thread on this topic (
> > https://lists.apache.org/thread/kkox33rhtjcdr5zztq3lzj7c5s7k9wsr), I
> > would like to propose a PR:
> >
> > https://github.com/apache/kafka-site/pull/577
> >
> > The purpose of this proposal is to help users find support for SLAs,
> > training, consulting, and anything else that is not provided by the
> > community since, as we can already see in many ASF projects, no commercial
> > support is provided by the foundation. I think it could help with the
> > adoption and the growth of the project because users need commercial
> > support for production issues.
> >
> > If the community agrees with this idea and wants to move forward: I have
> > added just one company in the PR, but anybody can add more by submitting a
> > new PR to complete the list. If you want me to add others, you can reply to
> > this thread, because it would be better to have several companies at the
> > first publication of the page.
> >
> > Just provide the company name and a short description of the services
> > offered around Apache Kafka. The information must be factual and
> > informational in nature and not be a marketing statement.
> >
> > regards,
> >
> > François
> >
> >
> >
>


KAfka and syslog

2024-01-11 Thread Francisco Serrano
Hello, is it possible to integrate a network device so that it sends its 
syslog messages to Kafka?

During cluster peak, KAFKA NetworkProcessorAvgIdlePercent is lower than 0.2

2024-01-11 Thread dong yu
These are my problems:
1. The request queue is always at 500.
2. NetworkProcessorAvgIdlePercent is lower than 0.2.


This is my broker configuration:
num.io.threads=32
num.network.threads=64

How can I identify the cause, and how can I optimize my Kafka cluster?
Thanks.


Re: [PROPOSAL] Add commercial support page on website

2024-01-11 Thread fpapon

Hi,

The purpose is not to mention or list vendors; it's not a page to list 
products based on Apache Kafka. The purpose is to list companies that 
offer production support, training, or consulting for Apache Kafka only.


It's a common use case: users are looking for commercial support, 
and this is something the ASF doesn't provide, so it's fair for 
companies to propose offers that cover it. It also helps 
the adoption of the project, as we can see in a lot of ASF projects, 
and I think it's better for a user to have a list on the website than 
to search on Google.


There is a mention in the text:

"The information must be factual and informational in nature and not be 
a marketing statement.
  Statements that promote your products and services over other 
offerings on the page will not be tolerated and
  will be removed. Such marketing statements can be added to your own 
pages on your own site, but not here."


As for the "phishing page" concern, it is up to the committer to check that 
a PR adding a website is not phishing, and I don't think there will be 
many PRs of this kind.


regards,

François

On 10/01/2024 17:29, Divij Vaidya wrote:

I don't see a need for this. What additional information does this provide
over what can be found via a quick google search?

My primary concern is that we are getting into the business of listing
vendors on the project site, which brings its own complications without
adding much additional value for users. In the spirit of being vendor
neutral, I would try to avoid this as much as possible.

So, my questions to you are:
1. What value does the addition of this page bring to the users of Apache
Kafka?
2. When a new PR is submitted to add a vendor, what criteria do we have to
decide whether to add them or not? If we keep a blanket policy of
accepting all PRs, then we may end up in a situation where the link
redirects to a phishing page or nefarious website. Hence, we might have to
at least perform some basic due diligence, which adds overhead to the
resources of the community.

--
Divij Vaidya



On Wed, Jan 10, 2024 at 5:00 PM fpapon  wrote:


Hi,

After starting a first thread on this topic (
https://lists.apache.org/thread/kkox33rhtjcdr5zztq3lzj7c5s7k9wsr), I
would like to propose a PR:

https://github.com/apache/kafka-site/pull/577

The purpose of this proposal is to help users find support for SLAs,
training, consulting, and anything else that is not provided by the
community since, as we can already see in many ASF projects, no commercial
support is provided by the foundation. I think it could help with the
adoption and the growth of the project because users need commercial
support for production issues.

If the community agrees with this idea and wants to move forward: I have
added just one company in the PR, but anybody can add more by submitting a
new PR to complete the list. If you want me to add others, you can reply to
this thread, because it would be better to have several companies at the
first publication of the page.

Just provide the company name and a short description of the services
offered around Apache Kafka. The information must be factual and
informational in nature and not be a marketing statement.

regards,

François




--
--
François