[ https://issues.apache.org/jira/browse/KAFKA-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navinder Brar updated KAFKA-6643:
---------------------------------
    Summary: Warm up new replicas from scratch when changelog topic has LIMITED retention time  (was: Warm up new replicas from scratch when changelog topic has retention time)

> Warm up new replicas from scratch when changelog topic has LIMITED retention time
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-6643
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6643
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Navinder Brar
>            Priority: Major
>
> Currently, Kafka Streams uses changelog topics (internal Kafka topics that
> hold all the data for a store) to build the state of replicas. So even with
> the number of standby replicas set to 1, persistent state stores get
> additional availability, because the changelog topics are themselves
> replicated according to the broker replication policy. But that also means
> we are using at least 4 times the space (1 master store, 1 replica store,
> 1 changelog, 1 changelog replica).
> Now, if we have a year's data in persistent stores (RocksDB), we don't want
> the changelog topics to hold a year's data as well, since that puts an
> unnecessary burden on the brokers in terms of space. If we have to scale our
> Kafka Streams application (holding 200-300 TB of data), we have to scale the
> Kafka brokers as well. We want to reduce this dependency and find ways to
> use the changelog topic purely as a queue holding just 2 or 3 days of data,
> and to warm up the replicas from scratch in some other way.
> I have a few proposals in that respect.
> 1. Use a new Kafka topic related to each partition whi



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)