Re: Mirror Maker bidirectional offset sync

2024-01-10 Thread Ryanne Dolan
> do you recall the purpose of [...] renameTopicPartition [?]

A's topic1 and B's a.topic1 should be the same data (minus replication
lag). You can't consume a record in a.topic1 that hasn't been replicated
yet -- a remote topic by definition does not have any records that MM2
didn't put there. So an offset for a consumer consuming from B's a.topic1
can be translated back to an offset in A's topic1, where the data came from.

Ryanne

On Wed, Jan 10, 2024, 6:07 PM Greg Harris 
wrote:

> Hi Jeroen,
>
> I'm glad you're experimenting with MM2, and I hope we can give you
> some more context to explain what you're seeing.
>
> > I wrote a small program to produce these offset syncs for the prefixed
> > topic, and this successfully triggers the Checkpoint connector to start
> > replicating the consumer offsets back to the primary cluster.
>
> This is interesting, and I wouldn't have expected it to work.
>
> To rewind, each flow Source->Target has a MirrorSourceConnector, an
> Offset Syncs Topic, and a MirrorCheckpointConnector. With both
> directions enabled, there are two separate flows each with Source,
> Syncs topic, and Checkpoint.
> With offset-syncs.topic.location=source, the
> mm2-offset-syncs.b.internal on the A cluster is used for the A -> B
> replication flow. It contains topic names from cluster A, and the
> corresponding offsets those records were written to on the B cluster.
> When translation is performed, the consumer groups from A are
> replicated to the B cluster, and the replication mapping (cluster
> prefix) is added.
> Using your syncs topic as an example,
> OffsetSync{topicPartition=replicate-me-0, upstreamOffset=28,
> downstreamOffset=28} will be used to write offsets for
> "a.replicate-me-0" for the equivalent group on the B cluster.
>
> When your artificial sync OffsetSync{topicPartition=a.replicate-me-0,
> upstreamOffset=29, downstreamOffset=29} is processed, it should be
> used to write offsets for "a.a.replicate-me-0" but it actually writes
> offsets to "replicate-me-0" due to this function that I've never
> encountered before: [1].
> I think you could get those sorts of syncs into the syncs-topic if you
> had A->B configured with offset-syncs.topic.location=source, and B->A
> with offset-syncs.topic.location=target, and configured the topic
> filter to do A -> B -> A round trip replication.
>
> This appears to work as expected if there are no failures or restarts,
> but as soon as a record is re-delivered in either flow, I think the
> offsets should end up constantly advancing in an infinite loop. Maybe
> you can try that: Before starting the replication, insert a few
> records into `a.replicate-me` to force replicate-me-0's offset n to
> replicate to a.replicate-me-0's offset n+k.
>
> Ryanne, do you recall the purpose of the renameTopicPartition
> function? To me it looks like it could only be harmful, as it renames
> checkpoints to target topics that MirrorMaker2 isn't writing. It also
> looks like it isn't active in a typical MM2 setup.
>
> Thanks!
> Greg
>
> [1]:
> https://github.com/apache/kafka/blob/13a83d58f897de2f55d8d3342ffb058b230a9183/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointTask.java#L257-L267
>
> On Tue, Jan 9, 2024 at 5:54 AM Jeroen Schutrup
>  wrote:
> >
> > Thank you both for your swift responses!
> >
> > Ryanne, the MirrorConnectorsIntegrationBaseTest only tests offset
> > replication in cases where the producer has migrated to the secondary
> > cluster as well and starts feeding messages into the non-prefixed topic,
> > which are subsequently consumed by the consumer on the secondary cluster.
> > After the fallback, it asserts that the consumer offsets on the
> > non-prefixed topic in the secondary cluster are translated and replicated
> > to the consumer offsets of the prefixed topic in the primary cluster.
> > In my example, the producer keeps producing in the primary cluster,
> > whereas only the consumer fails over to the secondary cluster and, after
> > some time, fails back to the primary cluster. This consumer will then
> > consume messages from the prefixed topic in the secondary cluster, and
> > I'd like to have those offsets replicated back to the non-prefixed topic
> > in the primary cluster. I can provide an illustration if that helps to
> > clarify this use case.
> >
> > To add some context: the reason I'd like to have this is to retain loose
> > coupling between producers and consumers, so we're able to test failovers
> > for individual applications without needing all producers/consumers to
> > fail over and fail back at once.
> >
> > Digging through the Connect debug logs, I found that the offset-syncs of
> > the prefixed topic not being pushed to mm2-offset-syncs.b.internal is
> > likely the reason the checkpoint connector doesn't replicate consumer
> > offsets:
> > DEBUG translateDownstream(replication,a.replicate-me-0,25): Skipped
> > (offset sync not found) (org.apache.kafka.connect.mirror.OffsetSyncStore)
> >
> 

Re: Mirror Maker bidirectional offset sync

2024-01-10 Thread Greg Harris
Hi Jeroen,

I'm glad you're experimenting with MM2, and I hope we can give you
some more context to explain what you're seeing.

> I wrote a small program to produce these offset syncs for the prefixed
> topic, and this successfully triggers the Checkpoint connector to start
> replicating the consumer offsets back to the primary cluster.

This is interesting, and I wouldn't have expected it to work.

To rewind, each flow Source->Target has a MirrorSourceConnector, an
Offset Syncs Topic, and a MirrorCheckpointConnector. With both
directions enabled, there are two separate flows each with Source,
Syncs topic, and Checkpoint.
With offset-syncs.topic.location=source, the
mm2-offset-syncs.b.internal on the A cluster is used for the A -> B
replication flow. It contains topic names from cluster A, and the
corresponding offsets those records were written to on the B cluster.
When translation is performed, the consumer groups from A are
replicated to the B cluster, and the replication mapping (cluster
prefix) is added.
Using your syncs topic as an example,
OffsetSync{topicPartition=replicate-me-0, upstreamOffset=28,
downstreamOffset=28} will be used to write offsets for
"a.replicate-me-0" for the equivalent group on the B cluster.
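
The translation step can be sketched as follows. This is a simplified toy model with hypothetical helper names, not the actual implementation: the real OffsetSyncStore keeps a compacted set of syncs and is more conservative about how far it extrapolates past a sync.

```python
# Toy sketch of MM2 downstream offset translation (hypothetical names).
def translate_downstream(syncs, upstream_offset):
    """syncs: list of (upstream, downstream) offset pairs from the syncs topic."""
    # Only syncs at or before the consumer's upstream position are usable.
    usable = [s for s in syncs if s[0] <= upstream_offset]
    if not usable:
        return None  # "Skipped (offset sync not found)" in the debug log
    up, down = max(usable, key=lambda s: s[0])
    # Naive linear translation from the closest preceding sync.
    return down + (upstream_offset - up)

syncs = [(28, 28), (29, 29)]
print(translate_downstream(syncs, 30))  # -> 30
print(translate_downstream(syncs, 10))  # -> None, no sync at or before 10
```

When no sync precedes the consumer's position, translation is skipped entirely, which is exactly the behavior Jeroen observed in the debug log below.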

When your artificial sync OffsetSync{topicPartition=a.replicate-me-0,
upstreamOffset=29, downstreamOffset=29} is processed, it should be
used to write offsets for "a.a.replicate-me-0" but it actually writes
offsets to "replicate-me-0" due to this function that I've never
encountered before: [1].
I think you could get those sorts of syncs into the syncs-topic if you
had A->B configured with offset-syncs.topic.location=source, and B->A
with offset-syncs.topic.location=target, and configured the topic
filter to do A -> B -> A round trip replication.
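
A setup of the shape described here might look roughly like the following in mm2.properties. This is a sketch for illustration only: the cluster aliases, bootstrap servers, and topic filters are assumptions, and the exact topic-filter regex needed to force the round trip may differ.

```
clusters = a, b
a.bootstrap.servers = a-kafka:9092
b.bootstrap.servers = b-kafka:9092

# A -> B flow, syncs topic kept on the source (A) cluster
a->b.enabled = true
a->b.topics = replicate-me
a->b.offset-syncs.topic.location = source

# B -> A flow, syncs topic kept on the target (A) cluster
b->a.enabled = true
b->a.topics = a\.replicate-me
b->a.offset-syncs.topic.location = target
```

With both syncs topics living on cluster A, round-trip replication of the prefixed topic could plausibly produce syncs keyed by the prefixed topic name, as in the artificial example above.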

This appears to work as expected if there are no failures or restarts,
but as soon as a record is re-delivered in either flow, I think the
offsets should end up constantly advancing in an infinite loop. Maybe
you can try that: Before starting the replication, insert a few
records into `a.replicate-me` to force replicate-me-0's offset n to
replicate to a.replicate-me-0's offset n+k.

Ryanne, do you recall the purpose of the renameTopicPartition
function? To me it looks like it could only be harmful, as it renames
checkpoints to target topics that MirrorMaker2 isn't writing. It also
looks like it isn't active in a typical MM2 setup.

Thanks!
Greg

[1]: 
https://github.com/apache/kafka/blob/13a83d58f897de2f55d8d3342ffb058b230a9183/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointTask.java#L257-L267

On Tue, Jan 9, 2024 at 5:54 AM Jeroen Schutrup
 wrote:
>
> Thank you both for your swift responses!
>
> Ryanne, the MirrorConnectorsIntegrationBaseTest only tests offset
> replication in cases where the producer has migrated to the secondary
> cluster as well and starts feeding messages into the non-prefixed topic,
> which are subsequently consumed by the consumer on the secondary cluster.
> After the fallback, it asserts that the consumer offsets on the non-prefixed
> topic in the secondary cluster are translated and replicated to the consumer
> offsets of the prefixed topic in the primary cluster.
> In my example, the producer keeps producing in the primary cluster, whereas
> only the consumer fails over to the secondary cluster and, after some time,
> fails back to the primary cluster. This consumer will then consume messages
> from the prefixed topic in the secondary cluster, and I'd like to have
> those offsets replicated back to the non-prefixed topic in the primary
> cluster. I can provide an illustration if that helps to clarify this use case.
>
> To add some context: the reason I'd like to have this is to retain loose
> coupling between producers and consumers, so we're able to test failovers
> for individual applications without needing all producers/consumers to
> fail over and fail back at once.
>
> Digging through the Connect debug logs, I found that the offset-syncs of the
> prefixed topic not being pushed to mm2-offset-syncs.b.internal is likely
> the reason the checkpoint connector doesn't replicate consumer offsets:
> DEBUG translateDownstream(replication,a.replicate-me-0,25): Skipped (offset
> sync not found) (org.apache.kafka.connect.mirror.OffsetSyncStore)
>
> I wrote a small program to produce these offset syncs for the prefixed
> topic, and this successfully triggers the Checkpoint connector to start
> replicating the consumer offsets back to the primary cluster.
> OffsetSync{topicPartition=replicate-me-0, upstreamOffset=28,
> downstreamOffset=28}
> OffsetSync{topicPartition=replicate-me-0, upstreamOffset=29,
> downstreamOffset=29}
> OffsetSync{topicPartition=a.replicate-me-0, upstreamOffset=29,
> downstreamOffset=29} <-- the artificially generated offset-sync
>
> At this point it goes a bit beyond my understanding of the MM2 internals of
> whether this is a wise thing to do and if it would have any negative 

mTLS authentication for inter broker communication

2024-01-10 Thread Krishna Sai Veera Reddy
Hello All,

I am trying to set up Kafka (v3.6.1) to use mTLS
authentication/authorization for *inter broker* communication, but currently
server.properties doesn't allow me to set a keystore that contains a client
private key/certificate. I did try supplying a keystore to Kafka that
contains both TLS server and client private key entries, but that doesn't
seem to work.

Essentially what I am looking for is:
```properties
listeners=BROKER://:21500,CONTROLLER://:21501
advertised.listeners=BROKER://localhost:21500
listener.security.protocol.map=BROKER:SSL,CONTROLLER:SSL

security.protocol=SSL
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.2,TLSv1.3

ssl.keystore.type=JKS
ssl.keystore.password=changeit
ssl.keystore.location=/tmp/kafka/pki/server.jks

# Is something like the following possible?
ssl.client.keystore.type=JKS
ssl.client.keystore.password=changeit
ssl.client.keystore.location=/tmp/kafka/pki/client.jks
```

I am able to configure both the producer and consumer to use mTLS
authentication to talk to the Kafka cluster, but for inter broker
communication I am not able to supply the broker's TLS client certificates in
the properties. Any help on this is appreciated! Thank you in advance.

Regards,
Krishna V


Re: [PROPOSAL] Add commercial support page on website

2024-01-10 Thread Divij Vaidya
I don't see a need for this. What additional information does this provide
over what can be found via a quick google search?

My primary concern is that we would be getting into the business of listing
vendors on the project site, which brings its own complications without
adding much additional value for users. In the spirit of being vendor
neutral, I would try to avoid this as much as possible.

So, my questions to you are:
1. What value does the addition of this page bring to the users of Apache
Kafka?
2. When a new PR is submitted to add a vendor, what criteria do we have to
decide whether to add them or not? If we keep a blanket policy of
accepting all PRs, then we may end up in a situation where the link
redirects to a phishing page or nefarious website. Hence, we might have to
at least perform some basic due diligence, which adds overhead to the
resources of the community.

--
Divij Vaidya



On Wed, Jan 10, 2024 at 5:00 PM fpapon  wrote:

> Hi,
>
> After starting a first thread on this topic (
> https://lists.apache.org/thread/kkox33rhtjcdr5zztq3lzj7c5s7k9wsr), I
> would like to propose a PR:
>
> https://github.com/apache/kafka-site/pull/577
>
> The purpose of this proposal is to help users find support for SLAs,
> training, consulting... whatever is not provided by the community since,
> as we can already see in many ASF projects, no commercial support is
> provided by the foundation. I think it could help with the adoption and the
> growth of the project, because users
> need commercial support for production issues.
>
> If the community agrees with this idea and wants to move forward: I have
> just added one company in the PR, but everybody can add more by providing a
> new PR to complete the list. If you want me to add others, you can reply to
> this thread, because it would be better to have several companies at the
> first publication of the page.
>
> Just provide the company name and a short description of the service
> offering around Apache Kafka. The information must be factual and
> informational in nature and not be a marketing statement.
>
> regards,
>
> François
>
>
>


[PROPOSAL] Add commercial support page on website

2024-01-10 Thread fpapon

Hi,

After starting a first thread on this topic 
(https://lists.apache.org/thread/kkox33rhtjcdr5zztq3lzj7c5s7k9wsr), I would 
like to propose a PR:

https://github.com/apache/kafka-site/pull/577

The purpose of this proposal is to help users find support for SLAs,
training, consulting... whatever is not provided by the community since, as
we can already see in many ASF projects, no commercial support is provided by
the foundation. I think it could help with the adoption and the growth of the
project, because users
need commercial support for production issues.

If the community agrees with this idea and wants to move forward: I have just
added one company in the PR, but everybody can add more by providing a new PR
to complete the list. If you want me to add others, you can reply to this
thread, because it would be better to have several companies at the first
publication of the page.

Just provide the company name and a short description of the service offering
around Apache Kafka. The information must be factual and informational in
nature and not be a marketing statement.

regards,

François




Comparison between jbod vs single large disk for kafka broker

2024-01-10 Thread Jigar Shah
Hello,
Currently, I have a capacity issue with my Kafka cluster, wherein the
existing EBS volume is proving insufficient for the new load of incoming
data.
I need to find a solution to persist more data in my cluster, and I see two
options:
- one is to vertically scale the volume size
- the other is to add a new volume as an additional log directory for
Kafka (JBOD)

Are there any drawbacks to using the JBOD approach over vertical scaling?
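
For context, the JBOD option amounts to listing multiple directories in the broker's log.dirs setting; the paths below are placeholders:

```
# server.properties: JBOD is configured by giving the broker several log
# directories; new partitions are spread across them.
log.dirs=/var/kafka/data1,/var/kafka/data2
```

Note that adding a directory only helps newly created partitions by default; existing partitions stay where they are unless reassigned.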


Thank you,
*Jigar*


Re: Commercial support

2024-01-10 Thread Francois Papon
Hi,

I worked on the website to add a commercial support page as discussed; here
is a starting PR:

https://github.com/apache/kafka-site/pull/577

For now I have just added one company, but everybody can add more by providing
a PR. If you want me to add others, you can reply to this thread because, as
mentioned in the comments of the PR, it's odd to have just one company at the
start.

Just provide the company name and a short description of the service offering
around Apache Kafka. The information must be factual and informational in
nature and not be a marketing statement.

The purpose is to help users find support for production (SLAs), training,
consulting... whatever is not provided by the community since, like any ASF
project, no commercial support is provided by the foundation.

regards,

François

On 2022/09/28 10:04:19 fpapon wrote:
> Yes, exactly.
> 
> I can prepare a PR to add this page.
> 
> Regards,
> 
> Francois
> 
> On 28/09/2022 12:02, Bruno Cadonna wrote:
> > Hi,
> >
> > Ah, I see you were not looking for actual commercial support but 
> > rather for the page itself.
> >
> > Best,
> > Bruno
> >
> > On 28.09.22 11:31, Jean-Baptiste Onofré wrote:
> >> Hi,
> >>
> >> +1, yes it makes sense to me.
> >>
> >> Regards
> >> JB
> >>
> >> On Wed, Sep 28, 2022 at 11:26 AM fpapon  wrote:
> >>>
> >>> Hi Bruno,
> >>>
> >>> Thanks for your reply, I'm looking for a commercial support about
> >>> services consulting like we can have in others Apache project like:
> >>>
> >>> https://camel.apache.org/manual/commercial-camel-offerings.html
> >>>
> >>> https://activemq.apache.org/support
> >>>
> >>> I think it could be nice to add this on the Kafka website.
> >>>
> >>> Regards,
> >>>
> >>> Francois
> >>>
> >>> On 28/09/2022 11:05, Bruno Cadonna wrote:
>  Hi Francois,
> 
>  I am not aware of such a page on the Apache Kafka website.
> 
>  There are a variety of companies that sell Kafka as a self-hosted
>  platform or as a Cloud-hosted service.
> 
>  Those companies include Confluent (disclaimer: I work for them), Red
>  Hat, AWS, Aiven, Instaclustr, Cloudera, and more.
> 
> 
>  Best,
>  Bruno
> 
>  On 28.09.22 10:38, fpapon wrote:
> > Hi,
> >
> > I'm looking for a commercial support company page on the official
> > website (https://kafka.apache.org) but I cannot find one.
> >
> > Does such a page exist?
> >
> > Regards,
> >
> > Francois
> >
> >>> -- 
> >>> -- 
> >>> François
> >>>
> -- 
> --
> François
> 
>