[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access

2018-03-06 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388714#comment-16388714
 ] 

Ivan Babrou commented on KAFKA-6172:


GitHub does not report this commit as included in 1.0.1:

* https://github.com/apache/kafka/commit/0c895706e8ab511efe352a824a0c9e2dab62499e

> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -
>
> Key: KAFKA-6172
> URL: https://issues.apache.org/jira/browse/KAFKA-6172
> Project: Kafka
> Issue Type: Improvement
> Reporter: Dong Lin
> Assignee: Dong Lin
> Priority: Major
> Fix For: 1.0.1
>
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a
> number of calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry()
> involves a disk seek because it reads the last few bytes of the index file
> on disk. This slows down the broker shutdown process.
> Here are the times taken by LogManager.shutdown() in various settings. In all
> of these tests, the broker has roughly 6k partitions and 20k segments.
> - If the broker does not have this patch and `log.dirs` is configured with 1
> JBOD log directory, LogManager.shutdown() takes 15 minutes (roughly 900 seconds).
> - If the broker does not have this patch and `log.dirs` is configured with 10
> JBOD log directories, LogManager.shutdown() takes 84 seconds.
> - If the broker has this patch and `log.dirs` is configured with 10 JBOD log
> directories, LogManager.shutdown() takes 24 seconds.
> Thus we expect this optimization to save about 71% of the time spent in
> LogManager.shutdown() (84 seconds down to 24 seconds with 10 log
> directories). This patch intends to reduce the broker shutdown time by
> caching the lastEntry in memory so that the broker does not always have to
> read from disk to get the lastEntry.
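
The caching idea described above can be illustrated with a minimal Java sketch. It is not the code from the patch: the class `CachedTimeIndexSketch`, its `Entry` type, the 12-byte entry layout, and the `bufferReads` counter are all hypothetical, and an in-memory `ByteBuffer` stands in for the index file on disk. The sketch only shows the pattern of refreshing a cached last entry on append so that later `lastEntry()` calls do not touch the file.

```java
import java.nio.ByteBuffer;

/** Hypothetical, simplified time index; NOT Kafka's actual TimeIndex. */
class CachedTimeIndexSketch {

    /** One index entry: a timestamp and the offset it maps to. */
    static final class Entry {
        final long timestamp;
        final int offset;

        Entry(long timestamp, int offset) {
            this.timestamp = timestamp;
            this.offset = offset;
        }
    }

    // Assumed layout for this sketch: 8-byte timestamp followed by 4-byte offset.
    private static final int ENTRY_SIZE = 12;

    private final ByteBuffer buffer;   // stands in for the index file
    private int entries = 0;
    private int bufferReads = 0;       // counts reads that would hit the file
    private Entry cachedLastEntry = new Entry(-1L, 0); // placeholder for an empty index

    CachedTimeIndexSketch(int maxEntries) {
        this.buffer = ByteBuffer.allocate(maxEntries * ENTRY_SIZE);
    }

    /** Uncached lookup: decodes the last entry from the buffer every time. */
    Entry lastEntryFromBuffer() {
        if (entries == 0) {
            return new Entry(-1L, 0);
        }
        bufferReads++;
        int pos = (entries - 1) * ENTRY_SIZE;
        return new Entry(buffer.getLong(pos), buffer.getInt(pos + 8));
    }

    /** Cached lookup: returns the in-memory copy without reading the buffer. */
    Entry lastEntry() {
        return cachedLastEntry;
    }

    /** Appends only if the timestamp is newer than the last entry, then refreshes the cache. */
    void maybeAppend(long timestamp, int offset) {
        if (timestamp <= lastEntry().timestamp) {
            return; // nothing newer to record
        }
        if (entries * ENTRY_SIZE >= buffer.capacity()) {
            throw new IllegalStateException("index is full");
        }
        int pos = entries * ENTRY_SIZE;
        buffer.putLong(pos, timestamp);
        buffer.putInt(pos + 8, offset);
        entries++;
        cachedLastEntry = new Entry(timestamp, offset);
    }

    int bufferReads() {
        return bufferReads;
    }

    public static void main(String[] args) {
        CachedTimeIndexSketch index = new CachedTimeIndexSketch(8);
        index.maybeAppend(1000L, 10);
        index.maybeAppend(2000L, 25);
        index.maybeAppend(1500L, 30); // skipped: older than the last entry
        Entry last = index.lastEntry();
        System.out.println(last.timestamp + " -> " + last.offset
                + ", buffer reads: " + index.bufferReads());
    }
}
```

In this sketch every check inside `maybeAppend` goes through the cached entry, so a close path that calls it repeatedly never decodes the last entry from the buffer; `lastEntryFromBuffer()` is kept only for comparison with the uncached behaviour.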



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access

2017-11-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240586#comment-16240586
 ] 

ASF GitHub Bot commented on KAFKA-6172:
---

GitHub user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4177







[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access

2017-11-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239428#comment-16239428
 ] 

ASF GitHub Bot commented on KAFKA-6172:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/4177

KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-6172

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4177


commit 6a413ce9f233c3554450e006da885e8435e56502
Author: Dong Lin 
Date:   2017-11-05T07:20:35Z

KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access




> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -
>
> Key: KAFKA-6172
> URL: https://issues.apache.org/jira/browse/KAFKA-6172
> Project: Kafka
> Issue Type: Improvement
> Reporter: Dong Lin
> Assignee: Dong Lin
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a
> number of calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry()
> involves a disk seek because it reads the last few bytes of the index file
> on disk. This slows down the broker shutdown process.
> For a given broker with 6k partitions and 19k segments, we find that
> LogManager.shutdown() takes 15 minutes. The broker is configured to use 10
> threads to close logs in parallel. According to a thread dump taken while
> the broker was in LogManager.shutdown(), roughly 5 of the 10 threads were in
> the RUNNABLE state at TimeIndex.lastEntry(). This suggests that
> TimeIndex.lastEntry() is very likely responsible for much of the shutdown time.
> This patch intends to reduce the broker shutdown time by caching the
> lastEntry in memory so that the broker does not always have to read from disk
> to get the lastEntry.
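
The thread-dump observation above can be approximated from inside a JVM. The following is a rough sketch only, not the tooling used in the report: the class `RunnableFrameCounter` and its method `countRunnableIn` are hypothetical names, and the sketch inspects the current JVM through `Thread.getAllStackTraces()` rather than parsing an external thread dump. It counts threads that are RUNNABLE and have a stack frame in a given class and method.

```java
import java.util.Map;

/** Hypothetical helper: counts RUNNABLE threads currently inside a given method. */
class RunnableFrameCounter {

    /**
     * Returns the number of live threads in this JVM that are RUNNABLE and have
     * at least one frame whose class name ends with classSuffix and whose
     * method name equals methodName.
     */
    static int countRunnableIn(String classSuffix, String methodName) {
        int count = 0;
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (e.getKey().getState() != Thread.State.RUNNABLE) {
                continue; // blocked, waiting, or terminated threads are not counted
            }
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getClassName().endsWith(classSuffix)
                        && frame.getMethodName().equals(methodName)) {
                    count++;
                    break; // count each thread at most once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Inside a broker JVM during shutdown, this would count threads that
        // have a TimeIndex.lastEntry frame somewhere on their stack.
        System.out.println(countRunnableIn("TimeIndex", "lastEntry"));
    }
}
```

A thread that is RUNNABLE in a read path may still be waiting on the operating system for disk I/O, which is why a count like this points at where time is going without measuring it directly.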


