[ 
https://issues.apache.org/jira/browse/KAFKA-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171293#comment-15171293
 ] 

ASF GitHub Bot commented on KAFKA-3300:
---------------------------------------

GitHub user becketqin opened a pull request:

    https://github.com/apache/kafka/pull/983

    KAFKA-3300: Avoid over allocating disk space and memory for index files.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/becketqin/kafka KAFKA-3300

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #983
    
----
commit b49a9af4c19513e458ced92ef49504f7a1c237df
Author: Jiangjie Qin <becket....@gmail.com>
Date:   2016-02-29T01:39:18Z

    KAFKA-3300: Avoid over allocating disk space and memory for index files.

----


> Calculate the initial/max size of offset index files and reduce the memory 
> footprint for memory mapped index files.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3300
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3300
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.1
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.10.0.0
>
>
> Currently the initial/max size of the offset index file is configured by 
> {{log.index.max.bytes}}. The offset index file of the active log segment 
> stays at this size until the segment rolls. 
> In theory, we can calculate the upper bound of the offset index size using 
> the following formula:
> {noformat}
> log.segment.bytes / index.interval.bytes * 8
> {noformat}
> With the default settings, the space needed for an offset index is 
> 1GB / 4K * 8 = 2MB, while the default log.index.max.bytes is 10MB.
> This means we are over-allocating at least 8MB on disk and mapping it into 
> memory.
> We can probably do the following:
> 1. When creating a new offset index, calculate the size using the above 
> formula.
> 2. If the result in (1) is greater than log.index.max.bytes, allocate 
> log.index.max.bytes instead.
> This should save a significant amount of memory when a broker hosts a lot 
> of partitions.
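
The two steps proposed in the description can be sketched as follows. This is
only an illustration of the arithmetic, not the code in the pull request: the
function name initial_index_size and its parameters are hypothetical, and the
8 bytes per index entry is taken from the formula quoted above.

```python
# Hypothetical sketch of the sizing rule described in the issue.
# Names are illustrative and do not come from the Kafka code base.

ENTRY_BYTES = 8  # bytes per offset index entry, as used in the issue's formula
MIB = 1024 * 1024


def initial_index_size(segment_bytes, index_interval_bytes, index_max_bytes):
    """Return the number of bytes to allocate for a new offset index."""
    # Step 1: upper bound from the formula
    # log.segment.bytes / index.interval.bytes * 8
    upper_bound = segment_bytes // index_interval_bytes * ENTRY_BYTES
    # Step 2: never allocate more than the configured maximum
    return min(upper_bound, index_max_bytes)


# Defaults mentioned in the issue: 1GB segments, 4K index interval, 10MB max.
size = initial_index_size(1024 * MIB, 4096, 10 * MIB)
print(size // MIB)              # 2
print((10 * MIB - size) // MIB) # 8, the per-index over-allocation
```

With the defaults, each active segment would allocate 2MB instead of 10MB, so
the saving grows linearly with the number of partitions on the broker.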



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
