[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610899#comment-15610899 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 14s | trunk passed |
| +1 | compile | 8m 22s | trunk passed |
| +1 | checkstyle | 0m 26s | trunk passed |
| +1 | mvnsite | 0m 56s | trunk passed |
| +1 | mvneclipse | 0m 12s | trunk passed |
| +1 | findbugs | 1m 20s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
| +1 | mvninstall | 0m 36s | the patch passed |
| +1 | compile | 6m 51s | the patch passed |
| -1 | javac | 6m 51s | root generated 2 new + 700 unchanged - 3 fixed = 702 total (was 703) |
| -0 | checkstyle | 0m 27s | hadoop-common-project/hadoop-common: The patch generated 5 new + 266 unchanged - 22 fixed = 271 total (was 288) |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 31s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
| +1 | unit | 8m 50s | hadoop-common in the patch passed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 42m 24s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-1381 |
| GITHUB PR | https://github.com/apache/hadoop/pull/147 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 68fda58a8023 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9f32364 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| javac | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/artifact/patchprocess/diff-compile-javac-root.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608637#comment-15608637 ]

ASF GitHub Bot commented on HADOOP-1381:
----------------------------------------

GitHub user QwertyManiac opened a pull request:

    https://github.com/apache/hadoop/pull/147

    HADOOP-1381. The distance between sync blocks in SequenceFiles should…

    … be configurable rather than hard coded to 2000 bytes.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QwertyManiac/hadoop HADOOP-1381

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #147

commit dbfd4090c2d97f6dfd984c3d77ed9b78b7ea1a93
Author: Harsh J
Date: 2016-10-26T14:34:33Z

    HADOOP-1381. The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes.

> The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
> ---
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
> Issue Type: Improvement
> Components: io
> Affects Versions: 2.0.0-alpha
> Reporter: Owen O'Malley
> Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, HADOOP-1381.r5.diff
>
> Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much better if it was configurable with a much higher default (1mb or so?).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1340#comment-1340 ]

Harsh J commented on HADOOP-1381:
---------------------------------

Owen/Todd/Others,

Since I've addressed the comments offered here, I'm going to commit this in by Monday EOD unless there are any further comments.

bq. Otherwise looks good.

Thanks!
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284364#comment-13284364 ]

Radim Kolar commented on HADOOP-1381:
-------------------------------------

default 1 meg interval should be fine

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, HADOOP-1381.r5.diff
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284290#comment-13284290 ]

Harsh J commented on HADOOP-1381:
---------------------------------

I do think this would be useful to be able to control, especially given that use of sequence files is already prevalent.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273911#comment-13273911 ]

Harsh J commented on HADOOP-1381:
---------------------------------

Any further comments on the addition?
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269152#comment-13269152 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12525761/HADOOP-1381.r5.diff
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 2 new or modified test files.

    -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/944//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/944//console

This message is automatically generated.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269383#comment-13269383 ]

Harsh J commented on HADOOP-1381:
---------------------------------

bq. -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

This is HADOOP-8359, not this patch.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237257#comment-13237257 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12507830/HADOOP-1381.r5.diff
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 7 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed unit tests in .

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/769//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/769//console

This message is automatically generated.

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.24.0
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188908#comment-13188908 ]

Harsh J commented on HADOOP-1381:
---------------------------------

Todd/others, are there any other comments you'd like me to address?
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156660#comment-13156660 ]

Harsh J commented on HADOOP-1381:
---------------------------------

Todd,

- The sync _interval_ can be arbitrary, I think; it can even be 0. It should not be negative, so I'll add a check for that instead. Or do you think it's better if we limit the interval to a minimum? Writer tests pass with 0, no problem.
- SYNC_INTERVAL is being used by MAPREDUCE right now, so renaming it would have to be carried out as a cross-project JIRA+patch.

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.24.0
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156857#comment-13156857 ]

Todd Lipcon commented on HADOOP-1381:
-------------------------------------

hm, if you set it to 0, it will write a sync marker between every record?

Don't worry about renaming the variable, seems like too much effort for little gain.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156954#comment-13156954 ]

Harsh J commented on HADOOP-1381:
---------------------------------

Yes, it would end up writing a marker after each record, as sync-writing condition is checked after every record append.
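
The behaviour described in this comment can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not Hadoop's SequenceFile.Writer: the class name ToySyncWriter, the MARKER_SIZE value, and the exact placement of the check are all hypothetical, chosen to mirror the description that the sync condition is evaluated after every record append.

```java
/** Toy model of a record writer with a configurable sync interval (hypothetical, not Hadoop code). */
final class ToySyncWriter {
    // Hypothetical marker size in bytes, used only to advance the position in this model.
    static final int MARKER_SIZE = 20;

    private final int syncInterval; // minimum bytes between two sync markers
    private long position = 0;      // current write offset
    private long lastSyncPos = 0;   // offset just after the most recent marker
    private int syncCount = 0;      // number of markers written so far

    ToySyncWriter(int syncInterval) {
        if (syncInterval < 0) {
            throw new IllegalArgumentException("syncInterval must not be negative: " + syncInterval);
        }
        this.syncInterval = syncInterval;
    }

    /** Appends a record of the given length, then checks whether a marker is due. */
    void append(int recordLength) {
        position += recordLength;
        // With syncInterval == 0 this condition holds after every append.
        if (position >= lastSyncPos + syncInterval) {
            syncCount++;
            position += MARKER_SIZE;
            lastSyncPos = position;
        }
    }

    int syncCount() {
        return syncCount;
    }

    public static void main(String[] args) {
        ToySyncWriter everyRecord = new ToySyncWriter(0);
        ToySyncWriter sparse = new ToySyncWriter(1000);
        for (int i = 0; i < 10; i++) {
            everyRecord.append(100);
            sparse.append(100);
        }
        System.out.println("interval 0:    " + everyRecord.syncCount() + " markers");
        System.out.println("interval 1000: " + sparse.syncCount() + " markers");
    }
}
```

In this model ten 100-byte records produce ten markers with an interval of 0 and a single marker with an interval of 1000, which is the overhead trade-off the thread is weighing.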
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101565#comment-13101565 ]

Todd Lipcon commented on HADOOP-1381:
-------------------------------------

Two minor nits:
- Can you add a check in the Writer constructor that the syncInterval option is valid? I think the minimum value would be SYNC_SIZE?
- Can you rename SYNC_INTERVAL to DEFAULT_SYNC_INTERVAL or SYNC_INTERVAL_DEFAULT? Even though it's currently public, I don't think this would be considered a public API, so changing it seems alright.

Otherwise looks good.

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.23.0
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff
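
The constructor check requested in the first nit could look roughly like the following. This is a minimal sketch, not the code from any of the attached patches: the class and method names are hypothetical, and ASSUMED_SYNC_SIZE is a placeholder value standing in for SequenceFile's SYNC_SIZE. The minimum is passed as a parameter because the thread considers two policies, a floor of SYNC_SIZE and a simple non-negative check.

```java
/** Hypothetical validation helper for a sync interval option (illustration only). */
final class SyncIntervalCheck {
    // Placeholder for SequenceFile's SYNC_SIZE; the value here is an assumption.
    static final int ASSUMED_SYNC_SIZE = 20;

    private SyncIntervalCheck() {
    }

    /** Returns syncInterval unchanged if it is at least minimum, otherwise throws. */
    static int requireAtLeast(int syncInterval, int minimum) {
        if (syncInterval < minimum) {
            throw new IllegalArgumentException(
                "syncInterval must be >= " + minimum + ", got " + syncInterval);
        }
        return syncInterval;
    }

    public static void main(String[] args) {
        // Policy 1: the interval may not be smaller than a sync marker.
        System.out.println(requireAtLeast(100 * 1024, ASSUMED_SYNC_SIZE));
        // Policy 2: any non-negative interval is accepted, including 0.
        System.out.println(requireAtLeast(0, 0));
    }
}
```

A writer constructor would call such a helper once, before storing the interval, so that an invalid option fails at construction time rather than during the first append.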
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096615#comment-13096615 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12492885/HADOOP-1381.r4.diff
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 7 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed unit tests in hadoop-common-project.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-auth.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-auth-examples.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/131//console

This message is automatically generated.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082160#comment-13082160 ]

Owen O'Malley commented on HADOOP-1381:
---------------------------------------

I agree with Todd's points. 100mb is too big. Please use something between 100kb and 1mb as the default.

You should derive from IntegerOption for SyncIntervalOption.

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.23.0
Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff
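
The shape of this suggestion can be sketched without depending on Hadoop. Everything below is a stand-in: IntegerOptionSketch only imitates the idea of an integer-valued option base class, SyncIntervalOptionSketch is a hypothetical subclass, and the 100 KB default is simply one value inside the 100kb to 1mb range requested above, not necessarily the value that was committed.

```java
/** Stand-in for an integer-valued option base class (hypothetical, not Hadoop's class). */
abstract class IntegerOptionSketch {
    private final int value; // primitive int, so no boxing and no null to handle

    protected IntegerOptionSketch(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }
}

/** Hypothetical sync interval option deriving from the integer option base. */
final class SyncIntervalOptionSketch extends IntegerOptionSketch {
    // Assumed default, picked from the 100kb to 1mb range suggested in the review.
    static final int DEFAULT_SYNC_INTERVAL = 100 * 1024;

    SyncIntervalOptionSketch(int value) {
        super(value);
    }

    /** Convenience factory that carries the assumed default interval. */
    static SyncIntervalOptionSketch withDefault() {
        return new SyncIntervalOptionSketch(DEFAULT_SYNC_INTERVAL);
    }

    public static void main(String[] args) {
        System.out.println(withDefault().getValue());
        System.out.println(new SyncIntervalOptionSketch(1 << 20).getValue());
    }
}
```

Deriving from a shared integer base keeps the subclass down to a constructor, which is presumably the point of the review comment: the value handling lives in one place and each option type only adds its name.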
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070773#comment-13070773 ]

Todd Lipcon commented on HADOOP-1381:
-------------------------------------

- why use a boxed Integer for the sync interval instead of a normal int?
- the javadoc says 100KB, but the constant value seems to be set to 100MB
- do we have to change the other constructors? seems like unrelated cleanup, probably best to leave them alone since they're deprecated anyway
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066556#comment-13066556 ]

Harsh J commented on HADOOP-1381:
---------------------------------

(Avro datafiles have the ability of interval configuration as well, if you look at AVRO-719 and related issues)

Key: HADOOP-1381
URL: https://issues.apache.org/jira/browse/HADOOP-1381
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
Fix For: 0.23.0
Attachments: HADOOP-1381.r1.diff
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066560#comment-13066560 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12486743/HADOOP-1381.r1.diff
  against trunk revision 1147317.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 7 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these core unit tests:
                  org.apache.hadoop.io.TestSequenceFileSync

    +1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/735//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/735//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/735//console

This message is automatically generated.
[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes
[ https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066576#comment-13066576 ]

Hadoop QA commented on HADOOP-1381:
-----------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12486748/HADOOP-1381.r2.diff
  against trunk revision 1147317.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 7 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/738//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/738//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/738//console

This message is automatically generated.