[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access
[ https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388714#comment-16388714 ]

Ivan Babrou commented on KAFKA-6172:
------------------------------------

GitHub doesn't report this as included in 1.0.1:

* https://github.com/apache/kafka/commit/0c895706e8ab511efe352a824a0c9e2dab62499e

> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -------------------------------------------------------------
>
>                 Key: KAFKA-6172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6172
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>             Fix For: 1.0.1
>
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a
> number of calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry()
> involves a disk seek because it reads the last few bytes of the index file
> on disk. This slows down the broker shutdown process.
>
> Here is the time taken by LogManager.shutdown() in various settings. In all
> of these tests, the broker has roughly 6k partitions and 20k segments.
> - If the broker does not have this patch and `log.dirs` is configured with
> 1 JBOD log directory, LogManager.shutdown() takes 15 minutes (roughly 900
> seconds).
> - If the broker does not have this patch and `log.dirs` is configured with
> 10 JBOD log directories, LogManager.shutdown() takes 84 seconds.
> - If the broker has this patch and `log.dirs` is configured with 10 JBOD
> log directories, LogManager.shutdown() takes 24 seconds.
>
> Thus, with 10 log directories, we expect this optimization to cut the time
> spent in LogManager.shutdown() by about 71% (from 84 to 24 seconds). This
> patch intends to reduce the broker shutdown time by caching the lastEntry
> in memory so that the broker does not always have to read the disk to get
> the lastEntry.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
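
The 71% figure in the quoted description follows from the two timings taken with 10 log directories (84 seconds without the patch, 24 seconds with it). A minimal Java sketch of that arithmetic is below; the class and method names are made up for illustration and do not appear in the issue or in Kafka.

```java
// Hypothetical helper, for illustration only: reproduces the savings figure
// from the LogManager.shutdown() timings quoted in the issue.
final class ShutdownSavings {

    /** Fraction of time saved when a run drops from beforeSeconds to afterSeconds. */
    static double savedFraction(double beforeSeconds, double afterSeconds) {
        return (beforeSeconds - afterSeconds) / beforeSeconds;
    }

    public static void main(String[] args) {
        // 10 JBOD log directories: 84 s without the patch, 24 s with it.
        double saved = savedFraction(84, 24);
        System.out.println(String.format(java.util.Locale.ROOT, "saved: %.1f%%", 100 * saved));
        // prints "saved: 71.4%"
    }
}
```

Note that the 900-second measurement was taken with a single log directory and without the patch, so it is not part of this comparison.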
[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access
[ https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240586#comment-16240586 ]

ASF GitHub Bot commented on KAFKA-6172:
---------------------------------------

GitHub user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/4177

> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -------------------------------------------------------------
>
>                 Key: KAFKA-6172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6172
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>             Fix For: 1.0.1
>
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a
> number of calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry()
> involves a disk seek because it reads the last few bytes of the index file
> on disk. This slows down the broker shutdown process.
>
> Here is the time taken by LogManager.shutdown() in various settings. In all
> of these tests, the broker has roughly 6k partitions and 20k segments.
> - If the broker does not have this patch and `log.dirs` is configured with
> 1 JBOD log directory, LogManager.shutdown() takes 15 minutes (roughly 900
> seconds).
> - If the broker does not have this patch and `log.dirs` is configured with
> 10 JBOD log directories, LogManager.shutdown() takes 84 seconds.
> - If the broker has this patch and `log.dirs` is configured with 10 JBOD
> log directories, LogManager.shutdown() takes 24 seconds.
>
> Thus, with 10 log directories, we expect this optimization to cut the time
> spent in LogManager.shutdown() by about 71% (from 84 to 24 seconds). This
> patch intends to reduce the broker shutdown time by caching the lastEntry
> in memory so that the broker does not always have to read the disk to get
> the lastEntry.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-6172) Cache lastEntry in TimeIndex to avoid unnecessary disk access
[ https://issues.apache.org/jira/browse/KAFKA-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239428#comment-16239428 ]

ASF GitHub Bot commented on KAFKA-6172:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/4177

    KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-6172

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4177

commit 6a413ce9f233c3554450e006da885e8435e56502
Author: Dong Lin
Date:   2017-11-05T07:20:35Z

    KAFKA-6172; Cache lastEntry in TimeIndex to avoid unnecessary disk access

> Cache lastEntry in TimeIndex to avoid unnecessary disk access
> -------------------------------------------------------------
>
>                 Key: KAFKA-6172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6172
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> LogSegment.close() calls timeIndex.maybeAppend(...), which in turn makes a
> number of calls to timeIndex.lastEntry(). Currently timeIndex.lastEntry()
> involves a disk seek because it reads the last few bytes of the index file
> on disk. This slows down the broker shutdown process.
>
> For a broker with 6k partitions and 19k segments, we find that
> LogManager.shutdown() takes 15 minutes. The broker is configured to use 10
> threads to close logs in parallel. According to a thread dump taken while
> the broker was in LogManager.shutdown(), roughly 5 of the 10 threads were
> in the RUNNABLE state inside TimeIndex.lastEntry(). This suggests that
> TimeIndex.lastEntry() very likely accounts for much of the shutdown time.
>
> This patch intends to reduce the broker shutdown time by caching the
> lastEntry in memory so that the broker does not always have to read the
> disk to get the lastEntry.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
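
The caching idea described in the issue can be sketched roughly as follows. This is not Kafka's TimeIndex code and not the content of the pull request: `Entry`, `SlowStore` and `CachedTimeIndex` are hypothetical names, the in-memory array only stands in for the index file, the (-1, 0) sentinel for an empty index is an assumption, and the validation in `maybeAppend` is simplified. The sketch only shows the pattern of reading the last entry once and then keeping an in-memory copy up to date on every append.

```java
// Hypothetical sketch; these classes are illustrative and are not Kafka's TimeIndex.
final class LastEntryCacheSketch {

    /** A (timestamp, offset) pair, the kind of record a time index stores. */
    static final class Entry {
        final long timestamp;
        final long offset;

        Entry(long timestamp, long offset) {
            this.timestamp = timestamp;
            this.offset = offset;
        }
    }

    /** Stand-in for the index file; counts reads so the effect of caching is visible. */
    static final class SlowStore {
        private long[] data = new long[16];
        private int entries = 0;
        int reads = 0;

        void write(Entry e) {
            if (2 * entries + 2 > data.length) {
                data = java.util.Arrays.copyOf(data, data.length * 2);
            }
            data[2 * entries] = e.timestamp;
            data[2 * entries + 1] = e.offset;
            entries++;
        }

        Entry readLast() {
            reads++; // in a real index this is where the disk access would happen
            if (entries == 0) {
                return new Entry(-1L, 0L); // assumed sentinel for an empty index
            }
            return new Entry(data[2 * (entries - 1)], data[2 * (entries - 1) + 1]);
        }
    }

    /** Index wrapper that keeps the newest entry in memory. */
    static final class CachedTimeIndex {
        private final SlowStore store;
        private Entry lastEntry;

        CachedTimeIndex(SlowStore store) {
            this.store = store;
            this.lastEntry = store.readLast(); // single read when the index is opened
        }

        /** Served from memory; never touches the store. */
        Entry lastEntry() {
            return lastEntry;
        }

        /** Appends only when the timestamp advances; returns true if an entry was written. */
        boolean maybeAppend(long timestamp, long offset) {
            // Several lastEntry() calls per append, as in the access pattern the issue describes.
            if (timestamp < lastEntry().timestamp) {
                throw new IllegalArgumentException("timestamp " + timestamp + " is older than the last entry");
            }
            if (offset < lastEntry().offset) {
                throw new IllegalArgumentException("offset " + offset + " is smaller than the last entry's offset");
            }
            if (timestamp > lastEntry().timestamp) {
                Entry e = new Entry(timestamp, offset);
                store.write(e);
                lastEntry = e; // keep the cached copy in step with the store
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        SlowStore store = new SlowStore();
        CachedTimeIndex index = new CachedTimeIndex(store);
        index.maybeAppend(100L, 1L);
        index.maybeAppend(100L, 2L); // same timestamp, nothing is appended
        index.maybeAppend(200L, 3L);
        System.out.println("reads from the backing store: " + store.reads);
        // prints "reads from the backing store: 1"
    }
}
```

In this sketch the cached entry is updated only inside `maybeAppend`, so any other path that changes the stored entries (truncation or resizing, for example) would also need to refresh it.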