[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388713#comment-16388713 ]

Ivan Babrou commented on KAFKA-6175:

GitHub doesn't report this as included in 1.0.1:

* https://github.com/apache/kafka/commit/12af521c487a146456442f895b9fc99a45ed100f

> AbstractIndex should cache index file to avoid unnecessary disk access during resize()
>
>                 Key: KAFKA-6175
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6175
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>             Fix For: 1.0.1
>
> Currently, when we shut down a broker, we call AbstractIndex.resize() for
> all segments on the broker, regardless of whether the log segment is active
> or not. AbstractIndex.resize() incurs raf.setLength(), which is expensive
> because it accesses disks. If we take a thread dump during either
> LogManager.shutdown() or LogManager.loadLogs(), most threads are in RUNNABLE
> state at java.io.RandomAccessFile.setLength().
>
> This patch intends to speed up broker startup and shutdown time by skipping
> AbstractIndex.resize() for inactive log segments.
>
> Here is the time of LogManager.shutdown() in various settings. In all these
> tests, the broker has roughly 6k partitions and 19k segments.
> - If the broker has neither this patch nor KAFKA-6172, LogManager.shutdown()
>   takes 69 seconds.
> - If the broker has KAFKA-6172 but not this patch, LogManager.shutdown()
>   takes 21 seconds.
> - If the broker has KAFKA-6172 and this patch, LogManager.shutdown()
>   takes 1.6 seconds.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
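The idea in the issue title, caching what is known about the index file so that resize() can skip raf.setLength() when nothing would change, can be sketched roughly as follows. This is a minimal illustration in Java, not the code from the Kafka patch: the class name CachedLengthIndex, the cachedLength field, and the setLengthCalls counter are all hypothetical names introduced here, and the real AbstractIndex also manages a memory-mapped buffer and locking that are omitted.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: remembers the index file's length so that a resize
// to the same size does not touch the disk at all.
final class CachedLengthIndex {
    private final File file;
    private long cachedLength;      // last known file length (assumed name)
    private int setLengthCalls = 0; // counts real disk operations, for the demo

    CachedLengthIndex(File file) {
        this.file = file;
        this.cachedLength = file.length(); // read once, then kept in memory
    }

    // Returns true only if the file was actually resized on disk.
    boolean resize(long newSize) throws IOException {
        if (newSize == cachedLength) {
            return false; // nothing to do: skip RandomAccessFile.setLength()
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(newSize); // the expensive call named in the issue
        }
        cachedLength = newSize;
        setLengthCalls++;
        return true;
    }

    int setLengthCalls() {
        return setLengthCalls;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".index");
        f.deleteOnExit();
        CachedLengthIndex index = new CachedLengthIndex(f);
        System.out.println(index.resize(1024));     // first call resizes the file
        System.out.println(index.resize(1024));     // same size: skipped
        System.out.println(index.setLengthCalls()); // number of real setLength() calls
    }
}
```

In this sketch an inactive segment, whose index already has its final size, costs only an in-memory comparison on shutdown, which is the kind of saving the description attributes to the patch.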
[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241010#comment-16241010 ]

James Cheng commented on KAFKA-6175:

[~lindong] Wow. Great work. Looking forward to this.
[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240672#comment-16240672 ]

Dong Lin commented on KAFKA-6175:

[~wushujames] I have updated the JIRA with the performance numbers.
[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239900#comment-16239900 ]

Dong Lin commented on KAFKA-6175:

[~wushujames] I don't have numbers yet. I will provide them later.
[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239899#comment-16239899 ]

James Cheng commented on KAFKA-6175:

Do you have any estimates of how much time is saved, similar to your benchmarks in https://issues.apache.org/jira/browse/KAFKA-6172?
[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()
[ https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239734#comment-16239734 ]

ASF GitHub Bot commented on KAFKA-6175:

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/4179

    KAFKA-6175; AbstractIndex should cache index file to avoid unnecessary disk access during resize()

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-6175

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4179

commit c260ee2e7973bc4c9f3ec36d2215fc069b7619dc
Author: Dong Lin
Date:   2017-11-05T21:40:21Z

    KAFKA-6175; AbstractIndex should cache index file to avoid unnecessary disk access during resize()