[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-29 Thread Greg Harris (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832334#comment-17832334
 ] 

Greg Harris commented on KAFKA-16344:
--------------------------------------

[~janardhanag] I don't understand what you mean. Can you be more specific about 
what you tried, and what you observed?

> Internal topic mm2-offset-syncs.internal created with single
> partition is putting more load on the broker
> ------------------------------------------------------------
>
>                 Key: KAFKA-16344
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16344
>             Project: Kafka
>          Issue Type: Bug
>          Components: connect
>    Affects Versions: 3.5.1
>            Reporter: Janardhana Gopalachar
>            Priority: Major
>
> We are using Kafka version 3.5.1 and see that the internal topic created by
> MirrorMaker, mm2-offset-syncs.internal, is created with a single partition.
> Because of this, the CPU load on the broker that is the leader for this
> partition is higher than on the other brokers. Can multiple partitions be
> created for the topic so that the CPU load gets distributed?
>
> Topic: mm2-offset-syncs.cluster-a.internal    TopicId: XRvTDbogT8ytNhqX2YTyrA
> PartitionCount: 1    ReplicationFactor: 3    Configs:
> min.insync.replicas=2,cleanup.policy=compact,message.format.version=3.0-IV1
>     Topic: mm2-offset-syncs.cluster-a.internal    Partition: 0    Leader: 2
> Replicas: 2,1,0    Isr: 2,1,0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-28 Thread Janardhana Gopalachar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831700#comment-17831700
 ] 

Janardhana Gopalachar commented on KAFKA-16344:
------------------------------------------------

Hi [~gharris1727] 

We tried to reproduce this in our local setup and observed the following
scenario: if messages are written to the source Kafka cluster and the target
Kafka cluster is created after a delay, then the number of messages written to
mm2-offset-syncs.internal is doubled. Is this the expected behavior?

 



[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-25 Thread Greg Harris (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830685#comment-17830685
 ] 

Greg Harris commented on KAFKA-16344:
--------------------------------------

Hi [~janardhanag], yes, you should consider increasing offset.lag.max. 0 is
best for precise offset translation, but can significantly increase the amount
of traffic on the offset-syncs topic.

For topics without offset gaps (e.g. not compacted, not transactional),
offset.lag.max currently behaves as a ratio between the records/sec in a
mirrored topic and that topic's additional load on the offset-syncs topic.
There's a semaphore 
https://github.com/apache/kafka/blob/f8ce7feebcede6c6da93c649413a64695bc8d4c1/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceTask.java#L95
 that prevents this load from slowing down the mirror source connector, but it 
won't protect the brokers hosting the offset-syncs topic from being overloaded.
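
For illustration, here is a stripped-down paraphrase of that guard (the class,
method name, and permit count below are my own simplification, not the actual
MirrorSourceTask API):

```java
import java.util.concurrent.Semaphore;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Simplified sketch of the guard linked above: a bounded number of
// offset-sync produce requests may be in flight at once. When no permit
// is available the sync is skipped rather than awaited, so a slow
// offset-syncs topic never stalls record mirroring itself.
class OffsetSyncGuard {
    private final Semaphore outstandingOffsetSyncs = new Semaphore(10); // permit count illustrative

    void maybeSendOffsetSync(KafkaProducer<byte[], byte[]> producer,
                             ProducerRecord<byte[], byte[]> sync) {
        if (!outstandingOffsetSyncs.tryAcquire()) {
            return; // too many syncs in flight; drop this one instead of blocking
        }
        producer.send(sync, (metadata, error) -> outstandingOffsetSyncs.release());
    }
}
```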

For example, if you were to mirror 100 records with offset.lag.max=0, MM2 could
write 100 records to the offset-syncs topic, generating huge load. For the same
100 records with offset.lag.max=100, the number of offset syncs could be 1,
100x fewer. With offset.lag.max=1, the number of offset syncs could still be
100, i.e. no reduction at all. To answer your questions:

> 1. What is the proportion of mm2-offset-syncs.internal topic writes to
> overall MM2 traffic?

This is configurable with offset.lag.max for topics with contiguous offsets.
For transactional topics it isn't configurable, as each transaction may cause
1 or 2 offset syncs regardless of offset.lag.max (e.g. 50 transactions/second
could produce up to 100 syncs/second), so reducing the throughput may require
increasing the transaction intervals in your source applications.

> 2. Is there a way to tune the internal topic writes with increased MM2
> traffic?

As the throughput of your MM2 instance increases, you should increase
offset.lag.max proportionally to keep the load on the offset-syncs topic fixed.

> 3. I want to understand the expected mm2-offset-syncs.internal write TPS for a 
> given amount of MM2 traffic. Are there any tunable parameters to reduce these 
> writes, and what are the consequences of tuning if any?

There isn't a closed-form formula, because a sync may be initiated by multiple
conditions 
[https://github.com/apache/kafka/blob/f8ce7feebcede6c6da93c649413a64695bc8d4c1/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceTask.java#L360-L363]
 and prevented by the semaphore when the producer latency becomes significant.

The consequence of decreasing throughput on this topic is less precise offset
translation in the MirrorCheckpointConnector, and increased redelivery when
failing over from source to target consumer offsets. If you aren't using this
connector, then the offset-syncs topic isn't read from at all. You currently
can't turn the topic off, but there is a KIP open for that: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-1031%3A+Control+offset+translation+in+MirrorSourceConnector]
 . If, for example, you were satisfied with 30 seconds of redelivery for a
topic which had 1000 records/second, then you could set offset.lag.max to
30*1000 = 30000.
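
To make that arithmetic concrete, here is a back-of-the-envelope sketch (my own
illustration, not MM2 code; real sync counts also depend on the other sync
conditions and the semaphore mentioned above):

```java
// Rough sizing helpers for the approximations described in this thread.
class OffsetLagMaxSizing {

    // Approximate offset syncs per second for a contiguous-offset topic:
    // roughly one sync each time the lag budget is exhausted.
    static double estimatedSyncsPerSec(double recordsPerSec, long offsetLagMax) {
        return recordsPerSec / Math.max(offsetLagMax, 1);
    }

    // offset.lag.max implied by a redelivery tolerance.
    static long offsetLagMaxFor(double toleratedRedeliverySecs, double recordsPerSec) {
        return (long) (toleratedRedeliverySecs * recordsPerSec);
    }

    public static void main(String[] args) {
        System.out.println(estimatedSyncsPerSec(1000, 100)); // ~10 syncs/sec at the default of 100
        System.out.println(offsetLagMaxFor(30, 1000));       // 30000, as in the example above
    }
}
```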

> 4. In a larger Kafka cluster, if a single broker is overloaded with 
> mm2-offset-syncs.internal traffic, can it lead to a broker crash? Are there
> any guidelines available for such scenarios? Currently, we have 6 brokers
> with 24K MM2 traffic, and internal writes are at 10K, resulting in a 20% CPU 
> increase on one broker.

I'm not familiar with the failure conditions for overloaded brokers. If you are 
seeing failures then I would definitely recommend trying to tune MM2, or try to 
use quotas [https://kafka.apache.org/documentation/#design_quotas] to get this 
topic-partition under control.
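
For example, a client quota on the producer that writes offset syncs could be
set through the Admin API. This is only a sketch: the client.id "mm2-producer"
and the broker address are placeholders, so check the actual client.id used by
your MirrorSourceConnector's offset-syncs producer.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;

// Sketch: cap the byte rate of the producer writing offset syncs with a
// client quota. "mm2-producer" is a placeholder client.id.
public class ThrottleOffsetSyncs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder address

        try (Admin admin = Admin.create(props)) {
            ClientQuotaEntity entity = new ClientQuotaEntity(
                    Map.of(ClientQuotaEntity.CLIENT_ID, "mm2-producer"));
            ClientQuotaAlteration alteration = new ClientQuotaAlteration(
                    entity,
                    List.of(new ClientQuotaAlteration.Op("producer_byte_rate", 1_000_000.0)));
            admin.alterClientQuotas(List.of(alteration)).all().get();
        }
    }
}
```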

> 5. Are there any limitations on scaling Kafka brokers? I mean, how far can
> brokers be expanded in a single Kafka cluster, given the
> mm2-offset-syncs.internal topic?

Because the offset syncs topic only supports a single partition, adding brokers 
will not solve this problem, other than having a broker dedicated to only 
serving this one topic. At some point, you would need two separate offset-syncs 
topics and separate MirrorSourceConnectors to shard the load.

> 6. How can the system be dimensioned to handle MM2 internal topic writes 
> effectively? Are there any recommended figures available? For instance, for a 
> given amount of traffic (X), what percentage increase in CPU (Y) should each 
> broker have to handle MM2 internal topic writes? Note that in other pods, 
> this resource may not be utilized.

The Kafka project doesn't publish "expected" dimensions, other than the default 
values of configurations. We don't publish performance analysis on any 
particular hardware setups, because the diversity of hardware running Kafka is 
just too vast to capture properly.


[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-07 Thread Janardhana Gopalachar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824385#comment-17824385
 ] 

Janardhana Gopalachar commented on KAFKA-16344:
------------------------------------------------

Hi [~gharris1727],

Below are a few questions we have:

1. What is the proportion of mm2-offset-syncs.internal topic writes to overall 
MM2 traffic?

2. Is there a way to tune the internal topic writes with increased MM2 traffic?

3. I want to understand the expected mm2-offset-syncs.internal write TPS for a 
given amount of MM2 traffic. Are there any tunable parameters to reduce these 
writes, and what are the consequences of tuning if any?

4. In a larger Kafka cluster, if a single broker is overloaded with 
mm2-offset-syncs.internal traffic, can it lead to a broker crash? Are there any 
guidelines available for such scenarios? Currently, we have 6 brokers with 24K 
MM2 traffic, and internal writes are at 10K, resulting in a 20% CPU increase on 
one broker.

5. Are there any limitations on scaling Kafka brokers? I mean, how far can
brokers be expanded in a single Kafka cluster, given the
mm2-offset-syncs.internal topic?

6. How can the system be dimensioned to handle MM2 internal topic writes 
effectively? Are there any recommended figures available? For instance, for a 
given amount of traffic (X), what percentage increase in CPU (Y) should each 
broker have to handle MM2 internal topic writes? Note that in other pods, this 
resource may not be utilized.

Regards

Jana



[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-06 Thread Janardhana Gopalachar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823930#comment-17823930
 ] 

Janardhana Gopalachar commented on KAFKA-16344:
------------------------------------------------

Hi [~gharris1727],

While processing 24k events/second, the MM2 internal topic gets 10k events/sec.
That is the event load the internal topic is receiving.

Regards

Jana



[jira] [Commented] (KAFKA-16344) Internal topic mm2-offset-syncs.internal created with single partition is putting more load on the broker

2024-03-05 Thread Greg Harris (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823817#comment-17823817
 ] 

Greg Harris commented on KAFKA-16344:
--------------------------------------

Hi [~janardhanag], thanks for the ticket.

At the current time, the offset syncs topic cannot have more than 1 partition. 
If more than one partition is present, the MirrorSourceTask will only write to 
partition 0, and the MirrorCheckpointTask will only read from partition 0. 
Changes to both of these would be necessary to support partitioning that topic, 
and would require a KIP.
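
Concretely, the behavior looks roughly like this (a paraphrased sketch, not
the actual MM2 code; class and method names are illustrative):

```java
import java.util.List;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

// Why extra partitions would be ignored today: the sync writer pins
// partition 0, and the checkpoint-side reader assigns only partition 0.
class OffsetSyncsPartitionZero {
    static final String TOPIC = "mm2-offset-syncs.cluster-a.internal";

    static void writeSync(KafkaProducer<byte[], byte[]> producer, byte[] key, byte[] value) {
        // Explicit partition 0, regardless of the topic's partition count.
        producer.send(new ProducerRecord<>(TOPIC, 0, key, value));
    }

    static void readSyncs(KafkaConsumer<byte[], byte[]> consumer) {
        // The reader likewise only ever assigns partition 0.
        consumer.assign(List.of(new TopicPartition(TOPIC, 0)));
    }
}
```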

For a workaround, are you able to increase your `offset.lag.max` from the 
default 100? This will make the offset translation less accurate, but can 
decrease the throughput on that topic.
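
If you run MM2's connectors on a Connect cluster, the override would look
roughly like the sketch below (cluster aliases, bootstrap servers, and the
10000 value are all illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a MirrorSourceConnector config with a larger offset.lag.max.
// All values below are illustrative; offset.lag.max defaults to 100.
class MirrorSourceConfigSketch {
    static Map<String, String> connectorConfig() {
        Map<String, String> props = new HashMap<>();
        props.put("connector.class", "org.apache.kafka.connect.mirror.MirrorSourceConnector");
        props.put("source.cluster.alias", "cluster-a");
        props.put("target.cluster.alias", "cluster-b");
        props.put("source.cluster.bootstrap.servers", "source:9092"); // placeholder
        props.put("target.cluster.bootstrap.servers", "target:9092"); // placeholder
        props.put("topics", ".*");
        props.put("offset.lag.max", "10000"); // larger value => fewer offset syncs
        return props;
    }
}
```

In dedicated MirrorMaker 2 mode the same knob can, I believe, be set in
mm2.properties, either globally as offset.lag.max or scoped to a flow (e.g.
cluster-a->cluster-b.offset.lag.max).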

Do you have some more statistics that confirm that this topic is too active? It 
should have just a fraction of the load of the other topics, so I'd be 
interested to know more about the scale you're working at.

Thanks,
Greg
