[ 
https://issues.apache.org/jira/browse/KAFKA-19225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christo Lolov updated KAFKA-19225:
----------------------------------
    Fix Version/s:     (was: 4.0.1)

> Tiered Storage Support for Active Log Segment
> ---------------------------------------------
>
>                 Key: KAFKA-19225
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19225
>             Project: Kafka
>          Issue Type: New Feature
>          Components: Tiered-Storage
>    Affects Versions: 4.0.0
>            Reporter: Henry Cai
>            Assignee: Henry Cai
>            Priority: Major
>
> This is the Jira for 
> [KIP-1176|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment].
> In KIP-405, the community proposed and implemented tiered storage for 
> old Kafka log segment files: when a log segment is older than 
> {_}local.retention.ms{_}, it becomes eligible to be uploaded to cloud 
> object storage and removed from local storage, thus reducing local storage 
> cost. KIP-405 uploads only older log segments, not the most recent active 
> log segments (write-ahead logs). Thus in a typical 3-way replicated Kafka 
> cluster, the 2 follower brokers would still need to replicate the active log 
> segments from the leader broker. It is common practice to set up the 3 
> brokers in three different AZs to improve the high availability of the 
> cluster. Replication between leader and follower brokers then crosses 
> AZs, which is a significant cost ([various 
> studies|https://www.confluent.io/blog/understanding-and-optimizing-your-kafka-costs-part-1-infrastructure/]
>  show that cross-AZ transfer typically accounts for 50%-60% of the total 
> cluster cost). Since all the active log segments are physically present on 
> three Kafka Brokers, they still comprise significant resource usage on the 
> brokers. Broker state also remains large during node replacement, 
> leading to longer node replacement times. 
> [KIP-1150|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics]
>  recently proposed diskless Kafka topics, but that approach increases latency and requires a 
> significant redesign. In comparison, this KIP maintains identical 
> performance for the acks=1 producer path, minimizes design changes to Kafka, and 
> still cuts cost by an estimated 43%.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
