[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593116#comment-14593116 ]

Honghai Chen edited comment on KAFKA-1646 at 6/19/15 6:57 AM:
-------------------------------------------------------------

Updated reviewboard https://reviews.apache.org/r/33204/diff/ against branch origin/trunk.

The latest code review: https://reviews.apache.org/r/33204/diff/8/
1. Fixed 2 test cases.
2. Fixed logCleaner: whenever a new file is created, the preallocate parameter should be set.
3. Merged to latest trunk.

Is it OK to go? [~junrao]

Pushing code to open source is even harder than playing the Shanghai A-share stock market, hehe.

was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/33204/diff/ against branch origin/trunk

> Improve consumer read performance for Windows
> ---------------------------------------------
>
>                 Key: KAFKA-1646
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1646
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.8.1.1
>         Environment: Windows
>            Reporter: xueqiang wang
>            Assignee: xueqiang wang
>              Labels: newbie, patch
>         Attachments: Improve consumer read performance for Windows.patch,
>                      KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch,
>                      KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch,
>                      KAFKA-1646_20150511_AddTestcases.patch,
>                      KAFKA-1646_20150609_MergeToLatestTrunk.patch,
>                      KAFKA-1646_20150616_FixFormat.patch, KAFKA-1646_20150618_235231.patch
>
>
> This patch is for the Windows platform only. On Windows, if more than one
> replica is writing to disk, the segment log files will not be contiguous on
> disk, and consumer read performance drops greatly. This fix allocates more
> disk space when rolling a new segment, which improves consumer read
> performance on the NTFS file system.
> This patch doesn't affect file allocation on other filesystems, since it only
> adds statements like 'if(Os.iswindow)' or adds methods used only on Windows.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
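The preallocation idea in the issue description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class `PreallocateSketch`, the methods `openSegment` and `isWindows`, and the `initFileSize` parameter are hypothetical names chosen here. Only `RandomAccessFile.setLength` and the `os.name` system property are standard Java.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class PreallocateSketch {
    /**
     * Opens a segment file for writing. When preallocate is true, the file is
     * extended to initFileSize bytes right away, so the file system can
     * reserve the space in one request instead of growing the file piecemeal.
     * (Hypothetical helper; names are assumptions for illustration.)
     */
    public static RandomAccessFile openSegment(File file, long initFileSize, boolean preallocate)
            throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        if (preallocate) {
            raf.setLength(initFileSize);
        }
        return raf;
    }

    /** Rough OS check, in the spirit of the 'if(Os.iswindow)' guard mentioned in the issue. */
    public static boolean isWindows() {
        return System.getProperty("os.name").toLowerCase().startsWith("windows");
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".log");
        f.deleteOnExit();
        // Preallocate 1 MiB only when running on Windows.
        try (RandomAccessFile raf = openSegment(f, 1024 * 1024, isWindows())) {
            System.out.println("length after open: " + raf.length());
        }
    }
}
```

Gating the call on the OS keeps file allocation on other platforms unchanged, which matches the stated intent of the patch.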
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587601#comment-14587601 ]

Honghai Chen edited comment on KAFKA-1646 at 6/16/15 7:38 AM:
-------------------------------------------------------------

Fixed all issues mentioned by [~junrao]: https://reviews.apache.org/r/33204/diff/5/
The latest patch is also attached. Can we ship it now?

was (Author: waldenchen):
Created reviewboard https://reviews.apache.org/r/35493/diff/ against branch origin/trunk
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578235#comment-14578235 ]

Honghai Chen edited comment on KAFKA-1646 at 6/9/15 3:07 AM:
------------------------------------------------------------

Merged to latest trunk; patch attached. The code review is unchanged: https://reviews.apache.org/r/33204/diff/4/

was (Author: waldenchen):
Created reviewboard against branch origin/trunk
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529694#comment-14529694 ]

Honghai Chen edited comment on KAFKA-1646 at 5/6/15 1:13 AM:
------------------------------------------------------------

When trying to add a test case for Log, I got failures in existing test cases on Windows: https://issues.apache.org/jira/browse/KAFKA-2170

was (Author: waldenchen):
When trying add test case for Log, got test failures for existing test cases in windows.
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506767#comment-14506767 ]

Honghai Chen edited comment on KAFKA-1646 at 4/22/15 10:15 AM:
--------------------------------------------------------------

New code review: https://reviews.apache.org/r/33204/diff/2/
The patch against trunk is also attached.

was (Author: waldenchen):
Created reviewboard against branch origin/trunk
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495592#comment-14495592 ]

Honghai Chen edited comment on KAFKA-1646 at 4/15/15 3:28 AM:
-------------------------------------------------------------

[~jkreps] The patch against trunk is attached, and the reviewboard against branch trunk is here: https://reviews.apache.org/r/33204/diff/

was (Author: waldenchen):
Created reviewboard against branch origin/trunk
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495541#comment-14495541 ]

Honghai Chen edited comment on KAFKA-1646 at 4/15/15 1:49 AM:
-------------------------------------------------------------

Updated the code so that it does not flush when trimming; it only flushes on close.
Updated the review against 0.8.1 (https://reviews.apache.org/r/29091/diff/11/). I will add a new review/patch against trunk soon.

was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/29091/diff/ against branch origin/0.8.1
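As a rough illustration of "trim without flushing, flush only on close", a preallocated file could be handled as in the sketch below. This is an assumption-laden sketch rather than the reviewed patch: the class `TrimOnCloseSketch` and its methods `append` and `trim` are hypothetical names, while `FileChannel.truncate` and `FileChannel.force` are standard `java.nio` calls.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class TrimOnCloseSketch implements AutoCloseable {
    private final RandomAccessFile raf;
    private final FileChannel channel;
    private long validBytes = 0L; // number of bytes actually written so far

    public TrimOnCloseSketch(File file, long preallocatedSize) throws IOException {
        raf = new RandomAccessFile(file, "rw");
        raf.setLength(preallocatedSize); // reserve the space up front
        channel = raf.getChannel();
    }

    /** Writes data at the end of the valid region, not at the end of the preallocated file. */
    public void append(byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            validBytes += channel.write(buf, validBytes);
        }
    }

    /** Cuts off the unused preallocated tail; deliberately does not flush. */
    public void trim() throws IOException {
        channel.truncate(validBytes);
    }

    /** Trims, then flushes once, then releases the file handle. */
    @Override
    public void close() throws IOException {
        trim();
        channel.force(true);
        raf.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".log");
        f.deleteOnExit();
        try (TrimOnCloseSketch segment = new TrimOnCloseSketch(f, 4096L)) {
            segment.append(new byte[] {1, 2, 3, 4, 5});
        }
        System.out.println("size after close: " + f.length());
    }
}
```

Keeping the single `force` call in `close` means a trim on its own never pays for a disk sync.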
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493899#comment-14493899 ]

Honghai Chen edited comment on KAFKA-1646 at 4/14/15 10:59 AM:
--------------------------------------------------------------

[~jkreps] All issues have been addressed. Please check the updated reviewboard https://reviews.apache.org/r/29091/diff/10/ against branch origin/0.8.1
Many thanks for your guidance; I really appreciate it.

was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/29091/diff/ against branch origin/0.8.1
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358096#comment-14358096 ]

Honghai Chen edited comment on KAFKA-1646 at 3/12/15 4:48 AM:
-------------------------------------------------------------

Hey [~jkreps], could you help check the review at https://reviews.apache.org/r/29091/diff/7/ ? I would really appreciate it, thanks.

was (Author: waldenchen):
Het, [~jkreps] Would you like help check the review at https://reviews.apache.org/r/29091/diff/7/ , really appreciate, thanks.
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126 ]

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 9:02 AM:
------------------------------------------------------------

Updated reviewboard against branch origin/0.8.1.
Hi [~jkreps] [~junrao] [~jghoman], please check the review at https://reviews.apache.org/r/29091/diff/7/
Much appreciated.

was (Author: waldenchen):
Updated reviewboard against branch origin/0.8.1
Hi, [~jkreps] [~junrao] [~jghoman]] please check the review at https://reviews.apache.org/r/29091/diff/7/ , appreciate.
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126 ]

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 9:01 AM:
------------------------------------------------------------

Updated reviewboard against branch origin/0.8.1.
Hi [~jkreps] [~junrao] [~jghoman]], please check the review at https://reviews.apache.org/r/29091/diff/7/
Much appreciated.

was (Author: waldenchen):
Updated reviewboard against branch origin/0.8.1
Please check the review athttps://reviews.apache.org/r/29091/diff/7/ [~jkreps]
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126 ]

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 8:59 AM:
------------------------------------------------------------

Updated reviewboard against branch origin/0.8.1.
Please check the review at https://reviews.apache.org/r/29091/diff/7/ [~jkreps]

was (Author: waldenchen):
Updated reviewboard against branch origin/0.8.1
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349621#comment-14349621 ]

Honghai Chen edited comment on KAFKA-1646 at 3/5/15 11:35 PM:
-------------------------------------------------------------

Or do you prefer the second option: add one more column to the file "recovery-point-offset-checkpoint". Currently it only records the offset, like below:
0
2
mvlogs 1 100
mvlogs 0 200

Change it to the following by adding one column, "recoverposition":
0
2
mvlogs 1 100 8000
mvlogs 0 200 16000

8000 is the start position in the data file of the message with offset 100, and 16000 is the start position in the data file of the message with offset 200.
Taking the first one as an example, what we need to do is:
1. Keep offset and position consistent, and regularly write them to the file "recovery-point-offset-checkpoint".
2. On clean shutdown, truncate the file to the "recoverposition".
3. On start, find the log segment related to the recovery point and truncate the file to the "recoverposition".
4. On start, if the OS is Windows, add one new segment.

But this change is big, since so many places use the variable recoveryPoint.

Which one do you recommend? I would really appreciate your guidance. [~jkreps][~nehanarkhede][~junrao]

was (Author: waldenchen):
Or do you prefer the second option:
Add one more column to file "recovery-point-offset-checkpoint", currently it only record offset, like below:
0
2
mvlogs 1 100
mvlogs 0 200
Change to below:
0
2
mvlogs 1 100 8000
mvlogs 0 200 16000
8000 is the start position of the data file for message with offset 100 . And 16000 is start position of the data file for message with offset 200.
Take first one as example, what we need do are:
1, keep offset and position consistent and regularly write to file "recovery-point-offset-checkpoint",
2, when in clean shutdown, truncate the file to the size.
3, when start, if the os is windows, add one new segment.
But this change is big, since so many places are using variable recoveryPoint.
Which one do you recommend?
Really appreciate for your guide. [~jkreps][~nehanarkhede][~junrao]
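To make the proposed checkpoint layout concrete, a data line with an optional fourth column could be parsed as in the sketch below. This is only an illustration of the format shown in the comment; the class `CheckpointLineSketch`, its nested `Entry` type, and the use of -1 for a missing position are assumptions made here, not Kafka's checkpoint code.

```java
public class CheckpointLineSketch {
    /** One data line of the checkpoint file: topic, partition, offset, optional position. */
    public static final class Entry {
        public final String topic;
        public final int partition;
        public final long offset;
        public final long recoverPosition; // -1 when the line has no fourth column

        Entry(String topic, int partition, long offset, long recoverPosition) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.recoverPosition = recoverPosition;
        }
    }

    /** Parses "topic partition offset [recoverposition]", accepting old and new layouts. */
    public static Entry parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3 && parts.length != 4) {
            throw new IllegalArgumentException("Expected 3 or 4 columns: " + line);
        }
        long position = parts.length == 4 ? Long.parseLong(parts[3]) : -1L;
        return new Entry(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]), position);
    }

    public static void main(String[] args) {
        String[] lines = {"mvlogs 1 100 8000", "mvlogs 0 200"};
        for (String line : lines) {
            Entry e = parse(line);
            System.out.println(e.topic + "-" + e.partition
                    + " offset=" + e.offset + " position=" + e.recoverPosition);
        }
    }
}
```

Accepting both three and four columns would let a broker read checkpoint files written before the extra column existed.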
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348401#comment-14348401 ]

Honghai Chen edited comment on KAFKA-1646 at 3/5/15 10:44 AM:
-------------------------------------------------------------

Hey [~jkreps], just to clarify: the 50MB/s you mentioned before is for the checksum calculation on the local machine, not for copying replica data from another machine, right?
If that's true, it seems we need to make 3 changes:
1. When logManager.shutdown is called and the OS is Windows, truncate the active segment.
2. On start, if the OS is Windows, add one new segment.
3. Remove the change "KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch" made previously, since it is unnecessary.

Does that make sense?

was (Author: waldenchen):
Actually we want to add one more column to file "recovery-point-offset-checkpoint", currently it only record offset, like below:
0
2
mvlogs 1 100
mvlogs 0 200
Change to below:
0
2
mvlogs 1 100 8000
mvlogs 0 200 16000
8000 is the start position of the data file for message with offset 100 . And 16000 is start position of the data file for message with offset 200.
Take first one as example, when recover the last segment (in function LogSegment.recover(maxMessageSize: Int) , ONLY recover file to min(validBytes, 8000) with offset 100 and rebuild index.
Make sense ? [~jkreps]
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179731#comment-14179731 ]

Jakob Homan edited comment on KAFKA-1646 at 2/4/15 8:36 PM:
-----------------------------------------------------------

Hey Jay, sorry for the late reply. I ran a test and found that there is no pause when rolling a new log segment. The time to create a 1G file and a 1K file is almost identical: about 1 ms each.
Here is the test code, which creates 10 files of 1G:
{code}
public static void main(String[] args) throws Exception {
    String filePre = "d:\\temp\\file";
    long startTime, elapsedTime;
    startTime = System.currentTimeMillis();
    try {
        // Initial size of each file in bytes (102400 bytes = 100 KB).
        long initFileSize = 102400L;
        for (int i = 0; i < 10; i++) {
            // Create the file and extend it to initFileSize immediately.
            RandomAccessFile randomAccessFile = new RandomAccessFile(filePre + i, "rw");
            randomAccessFile.setLength(initFileSize);
            randomAccessFile.getChannel();
        }
        elapsedTime = System.currentTimeMillis() - startTime;
        System.out.format("elapsedTime: %2d ms", elapsedTime);
    } catch (Exception exception) {
        // Errors are silently ignored here.
    }
}
{code}
The result is:
elapsedTime: 14 ms

was (Author: xueqiang):
Hey, Jay, sorry for late. I have done a test and find there is no pause when rolling new log segment. The time for creating an 1G file and 1K file is almost identical: all about 1ms.
Here is the test code which creating 10 files of 1G:
public static void main(String[] args) throws Exception {
    String filePre = "d:\\temp\\file";
    long startTime, elapsedTime;
    startTime = System.currentTimeMillis();
    try {
        long initFileSize = 102400l;
        for (int i = 0; i < 10; i++) {
            RandomAccessFile randomAccessFile = new RandomAccessFile(filePre + i, "rw");
            randomAccessFile.setLength(initFileSize);
            randomAccessFile.getChannel();
        }
        elapsedTime = System.currentTimeMillis() - startTime;
        System.out.format("elapsedTime: %2d ms", elapsedTime);
    } catch (Exception exception) {
    }
}
The result is:
elapsedTime: 14 ms
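For anyone wanting to repeat this kind of measurement, a variant with the file size as a parameter and with every file closed after creation might look like the sketch below. It is not the code that produced the 14 ms figure above: the class `PreallocateTimingSketch` and the method `timePreallocationMillis` are hypothetical names, and the 1 MiB default size is an arbitrary assumption. Timings depend heavily on the OS and file system, so no expected output is given.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreallocateTimingSketch {
    /**
     * Creates fileCount files named file0, file1, ... in dir, extends each to
     * fileSize bytes, and returns the total elapsed time in milliseconds.
     */
    public static long timePreallocationMillis(Path dir, int fileCount, long fileSize)
            throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < fileCount; i++) {
            Path file = dir.resolve("file" + i);
            // try-with-resources closes each file handle before the next one is opened.
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
                raf.setLength(fileSize);
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws IOException {
        // File size in bytes; pass 1073741824 as the first argument to test 1 GiB files.
        long fileSize = args.length > 0 ? Long.parseLong(args[0]) : 1024L * 1024L;
        int fileCount = 10;
        Path dir = Files.createTempDirectory("prealloc-test");
        long elapsed = timePreallocationMillis(dir, fileCount, fileSize);
        System.out.format("elapsedTime: %d ms for %d files of %d bytes%n",
                elapsed, fileCount, fileSize);
        // Clean up the test files and the temporary directory.
        for (int i = 0; i < fileCount; i++) {
            Files.deleteIfExists(dir.resolve("file" + i));
        }
        Files.deleteIfExists(dir);
    }
}
```

Printing the size next to the timing makes it easy to confirm which file size a given result refers to.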
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231258#comment-14231258 ]

Jakob Homan edited comment on KAFKA-1646 at 2/4/15 8:32 PM:
-----------------------------------------------------------

Hi Jun, I am from the same team as Xueqiang and Honghai.
As Jay mentioned before, recovering the log is also necessary when the broker is stopped gracefully. Since we only recover the activeSegment (a single LogSegment), it should not take much time. I just wrote some code to test the performance of LogSegment.recover():
{code}
public class TestLogSegment {
    public static void main(String[] args) throws Exception {
        String dirTemplate = args.length > 0 ? args[0] : "D:\\C\\scp2\\tachyon\\logs\\testbroker\\mvlogs-";
        int logFileNums = args.length > 1 ? Integer.parseInt(args[1]) : 10;
        long startOffset = 0L;
        int indexIntervalBytes = 4096;
        int maxIndexSize = 10485760;
        long initFileSize = 536870912L;
        int maxMessageSize = 100;
        long totalTime = 0L;
        for (int i=0; i
[...]
[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows
[ https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162106#comment-14162106 ]

Jay Kreps edited comment on KAFKA-1646 at 10/7/14 6:23 PM:
----------------------------------------------------------

Ah, you are saying Windows does a worse job of preallocation? How much does this help? Did you do any benchmarking on the performance improvement?

was (Author: jkreps):
Ah, you are saying Windows does a worse job of preallocation? Can you do some benchmark the performance improvement?