[ https://issues.apache.org/jira/browse/KAFKA-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haifeng Chen updated KAFKA-20552:
---------------------------------
    Description: 
The {{log.segment.bytes}} broker config (and its topic-level synonym 
{{segment.bytes}}) is currently defined as {{ConfigDef.Type.INT}}, 
capping the maximum segment size at {{Integer.MAX_VALUE}} (2,147,483,647 bytes, 
~2 GB). Additionally, the {{.index}} file format stores physical file positions 
as 4-byte signed integers, which likewise cannot address beyond ~2 GB.
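
The position truncation can be sketched in plain Java. This is illustrative, not Kafka's actual {{OffsetIndex}} code; it assumes the documented 8-byte entry layout (4-byte relative offset followed by a 4-byte physical position):

```java
import java.nio.ByteBuffer;

// Sketch (assumption: 8-byte index entries, 4-byte relative offset +
// 4-byte position). Any file position past Integer.MAX_VALUE loses its
// high bits when narrowed to int, so it cannot be stored faithfully.
public class IndexPositionCap {
    public static void main(String[] args) {
        long positionIn4GbSegment = 3L * 1024 * 1024 * 1024; // 3 GiB, past the ~2 GiB cap

        ByteBuffer entry = ByteBuffer.allocate(8);
        entry.putInt(42);                          // 4-byte relative offset (arbitrary)
        entry.putInt((int) positionIn4GbSegment);  // 4-byte position: high bits lost
        entry.flip();

        entry.getInt();                            // skip the relative offset
        int storedPosition = entry.getInt();
        // 3221225472 comes back as a negative int, unusable as a file position.
        System.out.println("wanted " + positionIn4GbSegment + ", stored " + storedPosition);
    }
}
```

Widening the config to {{ConfigDef.Type.LONG}} alone would not be enough; the on-disk index position field would also need to grow past 4 bytes, which is why this is a file-format change and not just a config change.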

With modern storage hardware (multi-TB NVMe drives) and high-throughput 
workloads, the 2 GB cap is increasingly a problem:
 * *Excessive file handle usage*: each segment needs four files ({{.log}}, 
{{.index}}, {{.timeindex}}, {{.txnindex}}). A 10 TB partition with 
2 GB segments means ~20,000 open files.
 * *Frequent segment rolls*: a topic ingesting 500 MB/s rolls a new segment 
every ~4 seconds, amplifying index-build, flush, and cleaner overhead.
 * *More log cleaning / compaction work*: more segments mean more 
compaction passes, each over smaller groups of segments.
 * *Remote storage overhead*: each segment is an individual unit for 
tiered-storage copy/delete operations.

Allowing segments of 4 GB, 8 GB, or larger would significantly reduce these 
overheads for high-throughput, large-retention workloads.

KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1333%3A+Support+log+segments+larger+than+2+GB

> Support log segments larger than 2 GB
> -------------------------------------
>
>                 Key: KAFKA-20552
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20552
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Haifeng Chen
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
