[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

Honghai Chen (JIRA) Thu, 05 Mar 2015 15:37:06 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349621#comment-14349621
 ]


Honghai Chen edited comment on KAFKA-1646 at 3/5/15 11:35 PM:
--------------------------------------------------------------

Or do you prefer the second option:
Add one more column to the file "recovery-point-offset-checkpoint". Currently it 
records only the offset, as below:
 0
 2
 mvlogs 1 100
 mvlogs 0 200
 Change it to the following by adding one column, "recoverposition":
 0
 2
 mvlogs 1 100 8000
 mvlogs 0 200 16000

8000 is the start position in the data file of the message with offset 100, and 
16000 is the start position in the data file of the message with offset 200.
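
To make the proposed layout concrete, here is a minimal sketch of a reader for it. It is illustrative only and is not Kafka code: the names `RecoveryPoint` and `parse_checkpoint` are hypothetical, and it assumes the first two lines of the file are a version number and an entry count, which matches the sample above but is not spelled out in this comment. Rows without the fourth column are accepted so that an existing file still loads.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class RecoveryPoint(NamedTuple):
    offset: int              # logical message offset (existing column)
    position: Optional[int]  # proposed "recoverposition"; None for old-format rows


def parse_checkpoint(text: str) -> Dict[Tuple[str, int], RecoveryPoint]:
    """Parse checkpoint text into {(topic, partition): RecoveryPoint}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("checkpoint needs a version line and a count line")
    int(lines[0])            # assumed version line; only checked to be an integer
    expected = int(lines[1])  # assumed entry-count line

    entries: Dict[Tuple[str, int], RecoveryPoint] = {}
    for ln in lines[2:]:
        fields = ln.split()
        if len(fields) not in (3, 4):
            raise ValueError("malformed checkpoint row: %r" % ln)
        topic, partition, offset = fields[0], int(fields[1]), int(fields[2])
        # The fourth column is the proposed byte position in the data file.
        position = int(fields[3]) if len(fields) == 4 else None
        entries[(topic, partition)] = RecoveryPoint(offset, position)

    if len(entries) != expected:
        raise ValueError("expected %d entries, found %d" % (expected, len(entries)))
    return entries
```

Keeping the position optional is one way to let old and new files coexist during an upgrade; whether that is desirable is a design question for the patch.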
 Taking the first entry as an example, what we need to do is:
1. Keep the offset and position consistent, and regularly write them to the file 
"recovery-point-offset-checkpoint".
2. On clean shutdown, truncate the file to the "recoverposition".
3. On startup, find the log segment that contains the recovery point and truncate 
that file to the "recoverposition".
4. On startup, if the OS is Windows, add one new segment.
But this is a big change, since so many places use the variable recoveryPoint.
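
The segment lookup and truncation in the steps above can be sketched roughly as follows. This is a hedged illustration, not the patch itself: `find_segment` and `truncate_segment` are hypothetical helpers, and the sketch assumes each segment is identified by its base offset, so that the segment holding the recovery point is the one with the greatest base offset not exceeding it. Adding a new segment on Windows is not covered.

```python
import bisect
import os
from typing import List


def find_segment(base_offsets: List[int], recovery_offset: int) -> int:
    """Return the base offset of the segment that contains recovery_offset.

    base_offsets must be sorted in ascending order.
    """
    i = bisect.bisect_right(base_offsets, recovery_offset) - 1
    if i < 0:
        raise ValueError("recovery offset precedes the first segment")
    return base_offsets[i]


def truncate_segment(path: str, recover_position: int) -> int:
    """Cut the data file back to recover_position and return its new size.

    The file is never extended: if it is already shorter than
    recover_position it is left untouched.
    """
    if recover_position < os.path.getsize(path):
        with open(path, "r+b") as f:
            f.truncate(recover_position)
    return os.path.getsize(path)
```

For the first sample row (offset 100, position 8000), a preallocated data file would be cut back to 8000 bytes, dropping whatever lies beyond that position.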

Which one do you recommend? I would really appreciate your guidance. 
[~jkreps][~nehanarkhede][~junrao] 




was (Author: waldenchen):
Or do you prefer the second option:
Add one more column to the file "recovery-point-offset-checkpoint". Currently it 
records only the offset, as below:
 0
 2
 mvlogs 1 100
 mvlogs 0 200
 Change it to the following:
 0
 2
 mvlogs 1 100 8000
 mvlogs 0 200 16000

8000 is the start position in the data file of the message with offset 100, and 
16000 is the start position in the data file of the message with offset 200.
 Taking the first entry as an example, what we need to do is:
1. Keep the offset and position consistent, and regularly write them to the file 
"recovery-point-offset-checkpoint".
2. On clean shutdown, truncate the file to the size.
3. On startup, if the OS is Windows, add one new segment.
But this is a big change, since so many places use the variable recoveryPoint.

Which one do you recommend? I would really appreciate your guidance. 
[~jkreps][~nehanarkhede][~junrao] 



> Improve consumer read performance for Windows
> ---------------------------------------------
>
>                 Key: KAFKA-1646
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1646
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.8.1.1
>         Environment: Windows
>            Reporter: xueqiang wang
>            Assignee: xueqiang wang
>              Labels: newbie, patch
>         Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch
>
>
> This patch is for the Windows platform only. On Windows, if more than one 
> replica is writing to disk, the segment log files will not be consistent on 
> disk, and consumer read performance will drop greatly. This fix allocates 
> more disk space when rolling a new segment, which improves consumer read 
> performance on the NTFS file system.
> This patch doesn't affect file allocation on other filesystems, because it only 
> adds statements like 'if(Os.iswindow)' or adds methods used on Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
