[jira] [Commented] (KAFKA-3300) Calculate the initial/max size of offset index files and reduce the memory footprint for memory mapped index files.

2017-01-25 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838069#comment-15838069
 ] 

Ismael Juma commented on KAFKA-3300:


This was deemed not necessary if I recall correctly. If that's right, shall we 
close this JIRA?

> Calculate the initial/max size of offset index files and reduce the memory 
> footprint for memory mapped index files.
> ---
>
> Key: KAFKA-3300
> URL: https://issues.apache.org/jira/browse/KAFKA-3300
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 0.10.2.0
>
>
> Currently the initial/max size of the offset index file is configured by 
> {{log.index.max.bytes}}. This is the size of the offset index file for the 
> active log segment until the segment rolls. 
> Theoretically, we can calculate the upper bound of the offset index size 
> using the following formula:
> {noformat}
> log.segment.bytes / index.interval.bytes * 8
> {noformat}
> With the default settings, an offset index needs at most 1GB / 4K * 
> 8 = 2MB, while the default log.index.max.bytes is 10MB.
> This means we are over-allocating at least 8MB on disk for each such index 
> and mapping it to memory.
> We can probably do the following:
> 1. When creating a new offset index, calculate the size using the above 
> formula.
> 2. If the result in (1) is greater than log.index.max.bytes, allocate 
> log.index.max.bytes instead.
> This should save a significant amount of memory if a broker hosts a lot of 
> partitions.
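
The sizing rule proposed in the description can be sketched as follows. This is only an illustration of the arithmetic, not the code from the pull request: the class and method names are hypothetical, and the parameters stand in for the settings the ticket calls log.segment.bytes, index.interval.bytes and log.index.max.bytes. The constant 8 is taken from the formula in the ticket and is assumed here to be the size of one index entry in bytes.

```java
public final class OffsetIndexSizing {
    // Multiplier from the ticket's formula; assumed to be bytes per index entry.
    public static final int ENTRY_BYTES = 8;

    private OffsetIndexSizing() {}

    // Upper bound from the ticket: log.segment.bytes / index.interval.bytes * 8.
    public static long upperBoundBytes(long segmentBytes, long indexIntervalBytes) {
        if (segmentBytes < 0 || indexIntervalBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return segmentBytes / indexIntervalBytes * ENTRY_BYTES;
    }

    // Steps (1) and (2): use the computed bound, capped at the configured maximum.
    public static long initialIndexBytes(long segmentBytes,
                                         long indexIntervalBytes,
                                         long maxIndexBytes) {
        return Math.min(upperBoundBytes(segmentBytes, indexIntervalBytes), maxIndexBytes);
    }

    public static void main(String[] args) {
        long segmentBytes = 1L << 30;           // 1GB, the default cited in the ticket
        long indexIntervalBytes = 4096L;        // 4K, the default cited in the ticket
        long maxIndexBytes = 10L * 1024 * 1024; // 10MB, the default cited in the ticket

        long allocated = initialIndexBytes(segmentBytes, indexIntervalBytes, maxIndexBytes);
        System.out.println("index bytes to allocate: " + allocated);
        System.out.println("bytes saved vs. max:     " + (maxIndexBytes - allocated));
    }
}
```

With the defaults quoted in the ticket this yields 2097152 bytes (2MB) to allocate, 8388608 bytes (8MB) less than the configured maximum for each index file sized this way.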



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3300) Calculate the initial/max size of offset index files and reduce the memory footprint for memory mapped index files.

2016-02-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171293#comment-15171293
 ] 

ASF GitHub Bot commented on KAFKA-3300:
---

GitHub user becketqin opened a pull request:

https://github.com/apache/kafka/pull/983

KAFKA-3300: Avoid over allocating disk space and memory for index files.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/becketqin/kafka KAFKA-3300

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #983


commit b49a9af4c19513e458ced92ef49504f7a1c237df
Author: Jiangjie Qin 
Date:   2016-02-29T01:39:18Z

KAFKA-3300: Avoid over allocating disk space and memory for index files.




> Calculate the initial/max size of offset index files and reduce the memory 
> footprint for memory mapped index files.
> ---
>
> Key: KAFKA-3300
> URL: https://issues.apache.org/jira/browse/KAFKA-3300
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 0.10.0.0
>
>


