[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2018-03-06 Thread Ivan Babrou (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388713#comment-16388713
 ] 

Ivan Babrou commented on KAFKA-6175:


GitHub doesn't report this as included in 1.0.1:

* https://github.com/apache/kafka/commit/12af521c487a146456442f895b9fc99a45ed100f

> AbstractIndex should cache index file to avoid unnecessary disk access during 
> resize()
> --
>
> Key: KAFKA-6175
> URL: https://issues.apache.org/jira/browse/KAFKA-6175
> Project: Kafka
> Issue Type: Improvement
> Reporter: Dong Lin
> Assignee: Dong Lin
> Priority: Major
> Fix For: 1.0.1
>
>
> Currently, when we shut down a broker, we call AbstractIndex.resize() for 
> all segments on the broker, regardless of whether the log segment is active 
> or not. AbstractIndex.resize() incurs raf.setLength(), which is expensive 
> because it accesses disks. If we take a thread dump during either 
> LogManager.shutdown() or LogManager.loadLogs(), most threads are in RUNNABLE 
> state at java.io.RandomAccessFile.setLength().
> This patch intends to speed up broker startup and shutdown time by skipping 
> AbstractIndex.resize() for inactive log segments.
> Here is the time of LogManager.shutdown() in various settings. In all these 
> tests, the broker has roughly 6k partitions and 19k segments.
> - If the broker has neither this patch nor KAFKA-6172, LogManager.shutdown() 
> takes 69 seconds.
> - If the broker has KAFKA-6172 but not this patch, LogManager.shutdown() takes 
> 21 seconds.
> - If the broker has KAFKA-6172 and this patch, LogManager.shutdown() takes 1.6 
> seconds.
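
The idea in the description above, avoiding raf.setLength() when the file 
already has the requested size, can be sketched roughly as follows. This is 
only an illustration under stated assumptions, not the code from the actual 
patch: the class name CachedLengthIndex, the cachedLength field, and the 
setLengthCalls counter are all hypothetical. Only java.io.RandomAccessFile 
and its setLength() and length() methods are real JDK APIs.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch, NOT Kafka's AbstractIndex: it remembers the file
// length in memory so that a resize to the same size never touches the disk.
class CachedLengthIndex {
    private final File file;
    private long cachedLength;      // last known file length, kept in memory
    private int setLengthCalls = 0; // illustration only: counts real resizes

    CachedLengthIndex(File file) throws IOException {
        this.file = file;
        // Read the length once, when the index is opened.
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            this.cachedLength = raf.length();
        }
    }

    // Returns true if the file was actually resized, false if skipped.
    boolean resize(long newSize) throws IOException {
        if (newSize == cachedLength) {
            return false; // nothing to do, so skip the expensive setLength()
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(newSize);
            setLengthCalls++;
        }
        cachedLength = newSize;
        return true;
    }

    int setLengthCalls() {
        return setLengthCalls;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("index-sketch", ".index");
        f.deleteOnExit();
        CachedLengthIndex idx = new CachedLengthIndex(f);
        System.out.println(idx.resize(1024)); // file grows from 0 bytes
        System.out.println(idx.resize(1024)); // same size, so it is skipped
    }
}
```

In this sketch an inactive segment whose index already has its final size 
costs one in-memory comparison per resize() call instead of a file system 
operation, which is the kind of saving the description attributes to the 
patch.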



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2017-11-06 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241010#comment-16241010
 ] 

James Cheng commented on KAFKA-6175:


[~lindong] Wow. Great work. Looking forward to this.




[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2017-11-06 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240672#comment-16240672
 ] 

Dong Lin commented on KAFKA-6175:
-

[~wushujames] I have updated the JIRA with the performance numbers.



[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2017-11-05 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239900#comment-16239900
 ] 

Dong Lin commented on KAFKA-6175:
-

[~wushujames] I don't have numbers yet. I will provide them later.



[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2017-11-05 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239899#comment-16239899
 ] 

James Cheng commented on KAFKA-6175:


Do you have any estimates of how much time is saved, similar to your benchmarks 
in https://issues.apache.org/jira/browse/KAFKA-6172?



[jira] [Commented] (KAFKA-6175) AbstractIndex should cache index file to avoid unnecessary disk access during resize()

2017-11-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239734#comment-16239734
 ] 

ASF GitHub Bot commented on KAFKA-6175:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/4179

KAFKA-6175; AbstractIndex should cache index file to avoid unnecessary disk 
access during resize()



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-6175

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4179


commit c260ee2e7973bc4c9f3ec36d2215fc069b7619dc
Author: Dong Lin 
Date:   2017-11-05T21:40:21Z

KAFKA-6175; AbstractIndex should cache index file to avoid unnecessary disk 
access during resize()



