[ 
https://issues.apache.org/jira/browse/KAFKA-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815156#comment-16815156
 ] 

Anders Aagaard commented on KAFKA-8201:
---------------------------------------

Ah, that makes sense. Thanks for the extra info; I couldn't find that anywhere!

> Kafka streams repartitioning topic settings crashing multiple nodes
> -------------------------------------------------------------------
>
>                 Key: KAFKA-8201
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8201
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.0.0
>            Reporter: Anders Aagaard
>            Priority: Major
>
> We had an incident in a setup running Kafka Streams 2.0.0 against Kafka 
> 2.0.0 (protocol version 2.0-IV1). The cause was a combination of Kafka 
> Streams defaults and a bug in Kafka.
> Info about the setup: a Streams application reads a log-compacted input 
> topic and performs a groupBy operation that requires repartitioning.
> Kafka Streams automatically creates a repartition topic with 24 partitions 
> and the following options:
> segment.bytes=52428800, retention.ms=9223372036854775807, 
> segment.index.bytes=52428800, cleanup.policy=delete, segment.ms=600000.
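>
> These options can be overridden per application. A minimal sketch, assuming 
> Kafka Streams 2.0's StreamsConfig.topicPrefix() mechanism, which applies a 
> topic-level config to the internal topics Streams creates (the application 
> id and bootstrap address below are illustrative):
>
>     import java.util.Properties;
>     import org.apache.kafka.common.config.TopicConfig;
>     import org.apache.kafka.streams.StreamsConfig;
>
>     public class RepartitionTopicOverride {
>         public static Properties streamsProps() {
>             Properties props = new Properties();
>             props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");         // illustrative
>             props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // illustrative
>             // topicPrefix() marks a topic-level config as an override that
>             // Kafka Streams applies to the internal topics it creates.
>             props.put(StreamsConfig.topicPrefix(TopicConfig.SEGMENT_MS_CONFIG),
>                       "3600000"); // e.g. 1 hour instead of the 10-minute default
>             return props;
>         }
>     }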
>  
> These settings should mean a new segment is rolled out when the active one 
> reaches 50 MB or is older than 10 minutes. However, because of log 
> compaction the timestamps coming into the topic can vary widely (sometimes 
> by multiple days), so the server sees messages older than segment.ms and 
> automatically triggers a new segment roll. This causes a segment explosion, 
> where new segments are continuously rolled out.
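>
> As a rough illustration of that roll behavior, here is a simplified sketch 
> (not the broker's actual code; all names here are made up) of a 
> timestamp-based segment roll check:
>
>     // Illustrative sketch of a size- or timestamp-triggered segment roll.
>     class RollCheckSketch {
>         static boolean shouldRoll(long segmentSizeBytes, int recordSizeBytes,
>                                   long recordTimestampMs, long segmentBaseTimestampMs,
>                                   long segmentBytes, long segmentMs) {
>             boolean sizeExceeded = segmentSizeBytes + recordSizeBytes > segmentBytes;
>             // When record timestamps jump around by days, as when replaying a
>             // compacted topic, this delta can exceed a 10-minute segment.ms on
>             // nearly every append, so the broker rolls segment after segment.
>             boolean timeExceeded =
>                 Math.abs(recordTimestampMs - segmentBaseTimestampMs) > segmentMs;
>             return sizeExceeded || timeExceeded;
>         }
>     }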
> There seems to be a bug report for this on the server side: 
> https://issues.apache.org/jira/browse/KAFKA-4336.
> This effectively took down several nodes and a broker in our cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
