[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-06-18 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593116#comment-14593116
 ] 

Honghai Chen edited comment on KAFKA-1646 at 6/19/15 6:57 AM:
--

Updated reviewboard https://reviews.apache.org/r/33204/diff/
 against branch origin/trunk

The latest code review: https://reviews.apache.org/r/33204/diff/8/
1, fixed 2 test cases.
2, fixed logCleaner: whenever a new file is created, the preallocate 
parameter should be set.
3, merged to the latest trunk.

Is it OK to go?  [~junrao]

Trying to push some code to open source is even harder than playing the 
Shanghai A-share stock market, hehe.




was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/33204/diff/
 against branch origin/trunk

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150511_AddTestcases.patch, 
> KAFKA-1646_20150609_MergeToLatestTrunk.patch, 
> KAFKA-1646_20150616_FixFormat.patch, KAFKA-1646_20150618_235231.patch
>
>
> This patch is for the Windows platform only. On Windows, if more than one 
> replica is writing to disk, the segment log files will not be laid out 
> contiguously on disk, and consumer read performance drops greatly. This fix 
> allocates more disk space when rolling a new segment, which improves 
> consumer read performance on the NTFS file system.
> This patch doesn't affect file allocation on other filesystems, because it 
> only adds statements like 'if(Os.iswindow)' or adds methods used on Windows.
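
For illustration, the preallocation idea in the description can be sketched in Java roughly as follows. This is only a minimal sketch under stated assumptions, not the code from the attached patches: the class name SegmentPreallocationSketch, the methods isWindows and createSegment, and the 1 MB size are all hypothetical, and it assumes that extending the file with RandomAccessFile.setLength at roll time is how the space gets reserved up front.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Locale;

final class SegmentPreallocationSketch {

    // Hypothetical OS check, standing in for the 'if(Os.iswindow)' style
    // guard mentioned in the issue description.
    static boolean isWindows() {
        return System.getProperty("os.name", "")
                .toLowerCase(Locale.ROOT)
                .startsWith("windows");
    }

    // Opens (or creates) a segment file. When preallocate is true the file is
    // extended to initFileSize bytes in a single step; otherwise it is left
    // at its current length and grows as data is appended.
    static void createSegment(File file, boolean preallocate, long initFileSize)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            if (preallocate && raf.length() < initFileSize) {
                raf.setLength(initFileSize);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File segment = File.createTempFile("segment", ".log");
        segment.deleteOnExit();
        // Preallocate only on Windows, as the description proposes.
        createSegment(segment, isWindows(), 1024L * 1024L);
        System.out.println(segment.length() + " bytes on disk after roll");
    }
}
```

Keeping the OS check outside createSegment means the non-Windows path is unchanged, which matches the claim that other filesystems are not affected.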



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-06-16 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587601#comment-14587601
 ] 

Honghai Chen edited comment on KAFKA-1646 at 6/16/15 7:38 AM:
--

Fixed all issues mentioned by [~junrao]:
https://reviews.apache.org/r/33204/diff/5/
The latest patch is also attached. Can we ship it now?


was (Author: waldenchen):
Created reviewboard https://reviews.apache.org/r/35493/diff/
 against branch origin/trunk

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150511_AddTestcases.patch, 
> KAFKA-1646_20150609_MergeToLatestTrunk.patch, 
> KAFKA-1646_20150616_FixFormat.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-06-08 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578235#comment-14578235
 ] 

Honghai Chen edited comment on KAFKA-1646 at 6/9/15 3:07 AM:
-

Merged to the latest trunk; patch attached.

No change to the code review: https://reviews.apache.org/r/33204/diff/4/



was (Author: waldenchen):
Created reviewboard  against branch origin/trunk

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150511_AddTestcases.patch, 
> KAFKA-1646_20150609_MergeToLatestTrunk.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-05-05 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529694#comment-14529694
 ] 

Honghai Chen edited comment on KAFKA-1646 at 5/6/15 1:13 AM:
-

When trying to add test cases for Log, I got test failures in existing test 
cases on Windows:
https://issues.apache.org/jira/browse/KAFKA-2170


was (Author: waldenchen):
When trying add test case for Log, got test failures for existing test cases in 
windows.

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646.patch, KAFKA-1646_20141216_163008.patch, 
> KAFKA-1646_20150306_005526.patch, KAFKA-1646_20150312_200352.patch, 
> KAFKA-1646_20150414_035415.patch, KAFKA-1646_20150414_184503.patch, 
> KAFKA-1646_20150422.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-04-22 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506767#comment-14506767
 ] 

Honghai Chen edited comment on KAFKA-1646 at 4/22/15 10:15 AM:
---

New code review board: https://reviews.apache.org/r/33204/diff/2/
The patch against trunk is also attached.


was (Author: waldenchen):
Created reviewboard  against branch origin/trunk

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646.patch, KAFKA-1646.patch, KAFKA-1646_20141216_163008.patch, 
> KAFKA-1646_20150306_005526.patch, KAFKA-1646_20150312_200352.patch, 
> KAFKA-1646_20150414_035415.patch, KAFKA-1646_20150414_184503.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-04-14 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495592#comment-14495592
 ] 

Honghai Chen edited comment on KAFKA-1646 at 4/15/15 3:28 AM:
--

[~jkreps] The patch against trunk is attached.
The reviewboard against branch trunk is here: 
https://reviews.apache.org/r/33204/diff/ 


was (Author: waldenchen):
Created reviewboard  against branch origin/trunk

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646.patch, KAFKA-1646_20141216_163008.patch, 
> KAFKA-1646_20150306_005526.patch, KAFKA-1646_20150312_200352.patch, 
> KAFKA-1646_20150414_035415.patch, KAFKA-1646_20150414_184503.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-04-14 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495541#comment-14495541
 ] 

Honghai Chen edited comment on KAFKA-1646 at 4/15/15 1:49 AM:
--

Updated the code so that it does not flush when trimming; it only flushes on 
close. Updated the review against 0.8.1 
(https://reviews.apache.org/r/29091/diff/11/).

I will add a new review/patch against trunk soon.


was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/29091/diff/
 against branch origin/0.8.1

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150312_200352.patch, KAFKA-1646_20150414_035415.patch, 
> KAFKA-1646_20150414_184503.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-04-14 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493899#comment-14493899
 ] 

Honghai Chen edited comment on KAFKA-1646 at 4/14/15 10:59 AM:
---

[~jkreps] All issues have been addressed. Please check the updated reviewboard 
https://reviews.apache.org/r/29091/diff/10/ against branch origin/0.8.1.
Many thanks for your guidance; I really appreciate it.


was (Author: waldenchen):
Updated reviewboard https://reviews.apache.org/r/29091/diff/
 against branch origin/0.8.1

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch, 
> KAFKA-1646_20150312_200352.patch, KAFKA-1646_20150414_035415.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-11 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358096#comment-14358096
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/12/15 4:48 AM:
--

Hey, [~jkreps], would you please help check the review at 
https://reviews.apache.org/r/29091/diff/7/ ? I would really appreciate it, thanks.


was (Author: waldenchen):
Het, [~jkreps] Would you like help check the review at 
https://reviews.apache.org/r/29091/diff/7/  , really appreciate, thanks.

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-06 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 9:02 AM:
-

Updated reviewboard against branch origin/0.8.1
Hi, [~jkreps] [~junrao] [~jghoman], please check the review at
https://reviews.apache.org/r/29091/diff/7/ . Much appreciated.


was (Author: waldenchen):
Updated reviewboard  against branch origin/0.8.1
Hi,  [~jkreps] [~junrao] [~jghoman]] please check the review at
https://reviews.apache.org/r/29091/diff/7/  , appreciate.

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-06 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 9:01 AM:
-

Updated reviewboard  against branch origin/0.8.1
Hi,  [~jkreps] [~junrao] [~jghoman]] please check the review at
https://reviews.apache.org/r/29091/diff/7/  , appreciate.


was (Author: waldenchen):
Updated reviewboard  against branch origin/0.8.1
Please check the review athttps://reviews.apache.org/r/29091/diff/7/  
[~jkreps]

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-06 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350126#comment-14350126
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/6/15 8:59 AM:
-

Updated reviewboard  against branch origin/0.8.1
Please check the review athttps://reviews.apache.org/r/29091/diff/7/  
[~jkreps]


was (Author: waldenchen):
Updated reviewboard  against branch origin/0.8.1

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch, KAFKA-1646_20150306_005526.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-05 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349621#comment-14349621
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/5/15 11:35 PM:
--

Or do you prefer the second option?
Add one more column to the file "recovery-point-offset-checkpoint". Currently 
it only records the offset, like below:
 0
 2
 mvlogs 1 100
 mvlogs 0 200
 Change it to the following by adding one column, "recoverposition":
 0
 2
 mvlogs 1 100 8000
 mvlogs 0 200 16000

8000 is the start position in the data file of the message with offset 100, 
and 16000 is the start position in the data file of the message with offset 200.
 Taking the first one as an example, what we need to do is:
1, keep the offset and position consistent and regularly write them to the 
file "recovery-point-offset-checkpoint".
2, on clean shutdown, truncate the file to the "recoverposition".
3, on startup, find the log segment related to the recovery point and 
truncate the file to the "recoverposition".
4, on startup, if the OS is Windows, add one new segment.
But this change is big, since so many places use the variable recoveryPoint.
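
To make the proposed file layout concrete, here is a rough Java sketch of reading the four-column form. It is only an illustration under assumptions: the class CheckpointLineSketch, its Entry type and the parse method are hypothetical names, and it assumes the first two lines are a version number and an entry count, with each remaining line holding topic, partition, offset and "recoverposition" separated by whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CheckpointLineSketch {

    // One line of the proposed format: topic, partition, offset, recoverposition.
    static final class Entry {
        final String topic;
        final int partition;
        final long offset;
        final long recoverPosition; // byte position in the data file for 'offset'

        Entry(String topic, int partition, long offset, long recoverPosition) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.recoverPosition = recoverPosition;
        }
    }

    // Parses the whole file content, given as one string per line.
    // Line 0 is assumed to be a version, line 1 the number of entries.
    static List<Entry> parse(List<String> lines) {
        if (lines.size() < 2) {
            throw new IllegalArgumentException("missing version/count header lines");
        }
        int expected = Integer.parseInt(lines.get(1).trim());
        List<Entry> entries = new ArrayList<>();
        for (String line : lines.subList(2, lines.size())) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] cols = trimmed.split("\\s+");
            if (cols.length != 4) {
                throw new IllegalArgumentException("expected 4 columns: " + line);
            }
            entries.add(new Entry(cols[0], Integer.parseInt(cols[1]),
                    Long.parseLong(cols[2]), Long.parseLong(cols[3])));
        }
        if (entries.size() != expected) {
            throw new IllegalArgumentException(
                    "header says " + expected + " entries, found " + entries.size());
        }
        return entries;
    }

    public static void main(String[] args) {
        List<Entry> entries = parse(Arrays.asList(
                "0", "2", " mvlogs 1 100 8000", " mvlogs 0 200 16000"));
        for (Entry e : entries) {
            System.out.println(e.topic + "-" + e.partition
                    + " offset=" + e.offset + " recoverposition=" + e.recoverPosition);
        }
    }
}
```

A reader written this way rejects the existing three-column lines outright, which is one reason the change touches many places: every writer and reader of the checkpoint file would have to move to the new layout together.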

Which one do you recommend? I would really appreciate your guidance. 
[~jkreps] [~nehanarkhede] [~junrao] 




was (Author: waldenchen):
Or do you prefer the second option:
Add one more column to file "recovery-point-offset-checkpoint", currently it 
only record offset, like below:
 0
 2
 mvlogs 1 100
 mvlogs 0 200
 Change to below:
 0
 2
 mvlogs 1 100 8000
 mvlogs 0 200 16000

8000 is the start position of the data file for message with offset 100 . And 
16000 is start position of the data file for message with offset 200.
 Take first one as example, what we need do are:
1, keep offset and position consistent  and regularly write to file 
"recovery-point-offset-checkpoint", 
2, when in clean shutdown,  truncate the file to the size.
3, when start, if the os is windows, add one new segment.
But this change is big, since so many places are using variable recoveryPoint.

Which one do you recommend?  Really appreciate for your guide. 
[~jkreps][~nehanarkhede][~junrao] 



> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-03-05 Thread Honghai Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348401#comment-14348401
 ] 

Honghai Chen edited comment on KAFKA-1646 at 3/5/15 10:44 AM:
--

Hey, [~jkreps], just to clarify: the 50MB/s you mentioned before is the 
checksum calculation on the local machine, not copying replica data from 
another machine, right?

If that's true, it seems we need to make 3 changes:
1, when logManager.shutdown is called and the OS is Windows, truncate the 
active segment.
2, on startup, if the OS is Windows, add one new segment.
3, remove the change 
"KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch" made 
previously, since it's unnecessary.
Does that make sense?
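
As a rough illustration of change 1, truncating the active segment on a clean shutdown could look like the Java sketch below. The class SegmentTrimSketch, the method trimTo and the validBytes parameter are hypothetical names for this example only, and it assumes the caller already knows how many bytes of the preallocated file hold real messages.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SegmentTrimSketch {

    // Cuts a preallocated segment file back to the bytes that hold real
    // messages, dropping the unused tail. A file that is already that short
    // (or shorter) is left untouched.
    static void trimTo(Path segment, long validBytes) throws IOException {
        if (validBytes < 0) {
            throw new IllegalArgumentException("validBytes must be >= 0");
        }
        try (FileChannel channel = FileChannel.open(segment, StandardOpenOption.WRITE)) {
            if (validBytes < channel.size()) {
                channel.truncate(validBytes);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("segment", ".log");
        segment.toFile().deleteOnExit();
        Files.write(segment, new byte[4096]); // stand-in for a preallocated file
        trimTo(segment, 1000L);               // pretend 1000 bytes hold messages
        System.out.println("size after trim: " + Files.size(segment));
    }
}
```

Change 2 (adding a fresh segment on startup) is not shown here; with the old active segment already trimmed, appends would simply go to the new file.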




was (Author: waldenchen):
Actually we want to add one more column to file  
"recovery-point-offset-checkpoint", currently it only record offset, like below:
0
2
mvlogs 1 100
mvlogs 0 200
Change to below:
0
2
mvlogs 1 100 8000
mvlogs 0 200 16000

8000 is the start position of the data file for message with offset 100 . And 
16000 is start position of the data file for message with offset 200.
Take first one as example, when recover the last segment (in function 
LogSegment.recover(maxMessageSize: Int) ,  ONLY recover  file to 
min(validBytes, 8000)  with offset 100 and rebuild index.   Make sense ?  
[~jkreps]



> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-02-04 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179731#comment-14179731
 ] 

Jakob Homan edited comment on KAFKA-1646 at 2/4/15 8:36 PM:


Hey, Jay, sorry for the late reply. I ran a test and found there is no pause 
when rolling a new log segment. The time to create a 1G file and a 1K file is 
almost identical: about 1ms each.
Here is the test code, which creates 10 files of 1G:
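{code}
import java.io.RandomAccessFile;

// Wrapper class added so the snippet compiles on its own; the name is illustrative.
public class CreateFilesTest {
    public static void main(String[] args) throws Exception {
        String filePre = "d:\\temp\\file";
        // Value as posted: 102400 bytes (100 KB), although the text above says 1G.
        long initFileSize = 102400L;
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            // try-with-resources closes each file instead of leaking the handle
            try (RandomAccessFile randomAccessFile =
                     new RandomAccessFile(filePre + i, "rw")) {
                randomAccessFile.setLength(initFileSize);
                randomAccessFile.getChannel();
            }
        }
        // Failures now propagate instead of being swallowed by an empty catch.
        long elapsedTime = System.currentTimeMillis() - startTime;
        System.out.format("elapsedTime: %2d ms", elapsedTime);
    }
}
{code}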
The result is: elapsedTime: 14 ms



was (Author: xueqiang):
Hey, Jay, sorry for late. I have done a test and find there is no pause when 
rolling new log segment. The time for creating an 1G file and 1K file is almost 
identical: all about 1ms.
Here is the test code which creating 10 files of 1G:
public static void main(String[] args) throws Exception  {
String filePre = "d:\\temp\\file";
long startTime, elapsedTime;
startTime = System.currentTimeMillis();
try {
long initFileSize = 102400l;
for (int i = 0; i < 10; i++) {
RandomAccessFile randomAccessFile = new 
RandomAccessFile(filePre + i, "rw");
randomAccessFile.setLength(initFileSize);
randomAccessFile.getChannel();
}
elapsedTime = System.currentTimeMillis() - startTime;
System.out.format("elapsedTime: %2d ms", elapsedTime);
} catch (Exception exception) { }
}

The result is: elapsedTime: 14 ms


> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2015-02-04 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231258#comment-14231258
 ] 

Jakob Homan edited comment on KAFKA-1646 at 2/4/15 8:32 PM:


Hi, Jun. 
I am on the same team as Xueqiang & Honghai. 
Just as Jay mentioned before, recovering the log is also necessary when the 
broker is stopped gracefully. Since we only recover the activeSegment (a single 
LogSegment), it should not cost a lot of time. 
I also wrote some code to test the performance of LogSegment.recover():
{code}
public class TestLogSegment {
    public static void main(String[] args) throws Exception {
        String dirTemplate = args.length > 0? args[0] : "D:\\C\\scp2\\tachyon\\logs\\testbroker\\mvlogs-";
        int logFileNums = args.length > 1? Integer.parseInt(args[1]) : 10;

        long startOffset = 0L;
        int indexIntervalBytes = 4096;
        int maxIndexSize = 10485760;
        long initFileSize = 536870912L;
        int maxMessageSize = 100;

        long totalTime = 0L;
        for (int i=0; i [...]
{code}

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>Assignee: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch, 
> KAFKA-1646-truncate-off-trailing-zeros-on-broker-restart-if-bro.patch, 
> KAFKA-1646_20141216_163008.patch


[jira] [Comment Edited] (KAFKA-1646) Improve consumer read performance for Windows

2014-10-07 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162106#comment-14162106
 ] 

Jay Kreps edited comment on KAFKA-1646 at 10/7/14 6:23 PM:
---

Ah, you are saying Windows does a worse job of preallocation? How much does 
this help? Did you do any benchmarking on the performance improvement?


was (Author: jkreps):
Ah, you are saying Windows does a worse job of preallocation? Can you do some 
benchmark the performance improvement?

> Improve consumer read performance for Windows
> -
>
> Key: KAFKA-1646
> URL: https://issues.apache.org/jira/browse/KAFKA-1646
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Affects Versions: 0.8.1.1
> Environment: Windows
>Reporter: xueqiang wang
>  Labels: newbie, patch
> Attachments: Improve consumer read performance for Windows.patch