[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2016-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610899#comment-15610899
 ] 

Hadoop QA commented on HADOOP-1381:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 14s | trunk passed |
| +1 | compile | 8m 22s | trunk passed |
| +1 | checkstyle | 0m 26s | trunk passed |
| +1 | mvnsite | 0m 56s | trunk passed |
| +1 | mvneclipse | 0m 12s | trunk passed |
| +1 | findbugs | 1m 20s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
| +1 | mvninstall | 0m 36s | the patch passed |
| +1 | compile | 6m 51s | the patch passed |
| -1 | javac | 6m 51s | root generated 2 new + 700 unchanged - 3 fixed = 702 total (was 703) |
| -0 | checkstyle | 0m 27s | hadoop-common-project/hadoop-common: The patch generated 5 new + 266 unchanged - 22 fixed = 271 total (was 288) |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 31s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
| +1 | unit | 8m 50s | hadoop-common in the patch passed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 42m 24s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HADOOP-1381 |
| GITHUB PR | https://github.com/apache/hadoop/pull/147 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 68fda58a8023 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9f32364 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| javac | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/artifact/patchprocess/diff-compile-javac-root.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10905/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608637#comment-15608637
 ] 

ASF GitHub Bot commented on HADOOP-1381:


GitHub user QwertyManiac opened a pull request:

https://github.com/apache/hadoop/pull/147

HADOOP-1381. The distance between sync blocks in SequenceFiles should…

… be configurable rather than hard coded to 2000 bytes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QwertyManiac/hadoop HADOOP-1381

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #147


commit dbfd4090c2d97f6dfd984c3d77ed9b78b7ea1a93
Author: Harsh J 
Date:   2016-10-26T14:34:33Z

HADOOP-1381. The distance between sync blocks in SequenceFiles should be 
configurable rather than hard coded to 2000 bytes.




> The distance between sync blocks in SequenceFiles should be configurable 
> rather than hard coded to 2000 bytes
> -
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
> Issue Type: Improvement
> Components: io
> Affects Versions: 2.0.0-alpha
> Reporter: Owen O'Malley
> Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
> better if it was configurable with a much higher default (1mb or so?).
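
For a rough sense of scale, the sketch below estimates how many sync markers a file would carry at the 2000-byte interval versus a 1 MiB interval. It assumes a 20-byte marker (a 4-byte escape plus a 16-byte hash) and approximates one marker per interval of data. The class and method names are illustrative only and are not part of Hadoop.

```java
// Back-of-the-envelope model, not Hadoop code. All names are hypothetical.
final class SyncOverhead {
    // Assumption: a sync marker is a 4-byte escape plus a 16-byte hash.
    static final int SYNC_SIZE = 4 + 16;

    // Approximate number of markers: one per syncInterval bytes of data.
    static long markersFor(long dataBytes, long syncInterval) {
        return dataBytes / syncInterval;
    }

    // Approximate bytes spent on markers for the given amount of data.
    static long markerBytes(long dataBytes, long syncInterval) {
        return markersFor(dataBytes, syncInterval) * SYNC_SIZE;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        System.out.println("2000-byte interval: " + markersFor(oneGiB, 2000) + " markers");
        System.out.println("1 MiB interval: " + markersFor(oneGiB, 1L << 20) + " markers");
    }
}
```

Under these assumptions the marker count scales inversely with the interval, which is the motivation for making it configurable.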



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-06-23 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1340#comment-1340
 ] 

Harsh J commented on HADOOP-1381:
-

Owen/Todd/Others,

Since I've addressed the comments offered here, I'm going to commit this in by 
Monday EOD unless there are any further comments.

bq. Otherwise looks good.

Thanks!


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-05-28 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284364#comment-13284364
 ] 

Radim Kolar commented on HADOOP-1381:
-

default 1 meg interval should be fine

 The distance between sync blocks in SequenceFiles should be configurable 
 rather than hard coded to 2000 bytes
 -

 Key: HADOOP-1381
 URL: https://issues.apache.org/jira/browse/HADOOP-1381
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
 HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
 HADOOP-1381.r5.diff


 Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
 better if it was configurable with a much higher default (1mb or so?).





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-05-27 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284290#comment-13284290
 ] 

Harsh J commented on HADOOP-1381:
-

I do think this would be useful to be able to control, especially given that 
use of sequence files is already prevalent.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-05-12 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273911#comment-13273911
 ] 

Harsh J commented on HADOOP-1381:
-

Any further comments on the addition?





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-05-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269152#comment-13269152
 ] 

Hadoop QA commented on HADOOP-1381:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12525761/HADOOP-1381.r5.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

-1 javadoc.  The javadoc tool appears to have generated 2 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/944//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/944//console

This message is automatically generated.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-05-06 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269383#comment-13269383
 ] 

Harsh J commented on HADOOP-1381:
-

bq. -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

This is HADOOP-8359, not this patch.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-03-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237257#comment-13237257
 ] 

Hadoop QA commented on HADOOP-1381:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507830/HADOOP-1381.r5.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/769//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/769//console

This message is automatically generated.

 The distance between sync blocks in SequenceFiles should be configurable 
 rather than hard coded to 2000 bytes
 -

 Key: HADOOP-1381
 URL: https://issues.apache.org/jira/browse/HADOOP-1381
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
 Fix For: 0.24.0

 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
 HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff


 Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
 better if it was configurable with a much higher default (1mb or so?).





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2012-01-18 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188908#comment-13188908
 ] 

Harsh J commented on HADOOP-1381:
-

Todd/others, are there any other comments you'd like me to address?





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-11-24 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156660#comment-13156660
 ] 

Harsh J commented on HADOOP-1381:
-

Todd,

- The sync _interval_ can be arbitrary I think, can even be 0. Should not be 
negative, so I'll add a check for that instead. Or do you think it's better if 
we limit the interval to a minimum? Writer tests pass with 0 no problem.
- SYNC_INTERVAL is being used by MAPREDUCE right now, and I'll have to carry 
this out as a cross-project JIRA+patch.
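
A minimal sketch of the non-negative check described above, assuming it would simply reject negative values and accept 0. The class and method names are hypothetical and are not taken from the attached patches.

```java
// Illustrative only: a standalone validation helper, not the patch's code.
final class SyncIntervalCheck {
    // Accepts any interval >= 0 (0 is allowed per the comment above);
    // rejects negative values with an IllegalArgumentException.
    static int validate(int syncInterval) {
        if (syncInterval < 0) {
            throw new IllegalArgumentException(
                "Sync interval must be non-negative, got " + syncInterval);
        }
        return syncInterval;
    }

    public static void main(String[] args) {
        System.out.println(validate(0));
        System.out.println(validate(100 * 1024));
    }
}
```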

 The distance between sync blocks in SequenceFiles should be configurable 
 rather than hard coded to 2000 bytes
 -

 Key: HADOOP-1381
 URL: https://issues.apache.org/jira/browse/HADOOP-1381
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
 Fix For: 0.24.0

 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
 HADOOP-1381.r3.diff, HADOOP-1381.r4.diff


 Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
 better if it was configurable with a much higher default (1mb or so?).





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-11-24 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156857#comment-13156857
 ] 

Todd Lipcon commented on HADOOP-1381:
-

hm, if you set it to 0, it will write a sync marker between every record?

Don't worry about renaming the variable, seems like too much effort for little 
gain.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-11-24 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156954#comment-13156954
 ] 

Harsh J commented on HADOOP-1381:
-

Yes, it would end up writing a marker after each record, as the sync-writing 
condition is checked after every record append.
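
The behaviour described above can be modelled with a small simulation. This is a simplified stand-in for the writer, assuming a 20-byte marker and a check of the form "position >= last sync position + interval" after each append. The names are hypothetical, and the real writer may differ in detail.

```java
// Simplified model of the check-after-append behaviour; not Hadoop code.
final class SyncWriterModel {
    // Assumption: each sync marker occupies 20 bytes in the file.
    static final int SYNC_SIZE = 20;

    // Counts the markers written while appending records of the given sizes.
    static int countMarkers(int[] recordSizes, int syncInterval) {
        long pos = 0;          // current write position
        long lastSyncPos = 0;  // position just after the last marker
        int markers = 0;
        for (int size : recordSizes) {
            pos += size;                              // append the record
            if (pos >= lastSyncPos + syncInterval) {  // condition checked after append
                pos += SYNC_SIZE;                     // write the marker
                lastSyncPos = pos;
                markers++;
            }
        }
        return markers;
    }

    public static void main(String[] args) {
        int[] records = {10, 10, 10};
        System.out.println(countMarkers(records, 0));
        System.out.println(countMarkers(records, 2000));
    }
}
```

With an interval of 0 the condition holds after every append in this model, so the marker count equals the record count.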





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-09-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101565#comment-13101565
 ] 

Todd Lipcon commented on HADOOP-1381:
-

Two minor nits:
- Can you add a check in the Writer constructor that the syncInterval option is 
valid? I think the minimum value would be SYNC_SIZE?
- Can you rename SYNC_INTERVAL to DEFAULT_SYNC_INTERVAL or 
SYNC_INTERVAL_DEFAULT? Even though it's currently public, I don't think this 
would be considered a public API, so changing it seems alright.

Otherwise looks good.

 The distance between sync blocks in SequenceFiles should be configurable 
 rather than hard coded to 2000 bytes
 -

 Key: HADOOP-1381
 URL: https://issues.apache.org/jira/browse/HADOOP-1381
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
 Fix For: 0.23.0

 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
 HADOOP-1381.r3.diff, HADOOP-1381.r4.diff


 Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
 better if it was configurable with a much higher default (1mb or so?).





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-09-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096615#comment-13096615
 ] 

Hadoop QA commented on HADOOP-1381:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492885/HADOOP-1381.r4.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in hadoop-common-project.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-auth.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-auth-examples.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/131//console

This message is automatically generated.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-08-10 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082160#comment-13082160
 ] 

Owen O'Malley commented on HADOOP-1381:
---

I agree with Todd's points.

100mb is too big. Please use something between 100kb and 1mb as the default.

You should derive from IntegerOption for SyncIntervalOption.
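
As an illustration of the suggestion above, the following self-contained sketch shows an int-valued option deriving from a shared base class, with a default in the 100 KB range. Every type here is a simplified stand-in defined inside the example. None of them is Hadoop's actual class, and the 100 KB default is only an assumption picked from the suggested range.

```java
// Self-contained sketch; these types are stand-ins, not Hadoop's classes.
final class OptionSketch {
    /** Marker interface for writer options (hypothetical). */
    interface Option {}

    /** Minimal stand-in for a reusable int-valued option base class. */
    abstract static class IntegerOption {
        private final int value;
        protected IntegerOption(int value) { this.value = value; }
        int getValue() { return value; }
    }

    /** The sync interval option adds only a type; storage lives in the base. */
    static final class SyncIntervalOption extends IntegerOption implements Option {
        SyncIntervalOption(int value) { super(value); }
    }

    static Option syncInterval(int value) {
        return new SyncIntervalOption(value);
    }

    // Returns the first sync interval option found, else the default.
    static int resolveSyncInterval(int defaultValue, Option... options) {
        for (Option o : options) {
            if (o instanceof SyncIntervalOption) {
                return ((SyncIntervalOption) o).getValue();
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        int assumedDefault = 100 * 1024; // assumption: 100 KB, within the suggested range
        System.out.println(resolveSyncInterval(assumedDefault));
        System.out.println(resolveSyncInterval(assumedDefault, syncInterval(4096)));
    }
}
```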



 The distance between sync blocks in SequenceFiles should be configurable 
 rather than hard coded to 2000 bytes
 -

 Key: HADOOP-1381
 URL: https://issues.apache.org/jira/browse/HADOOP-1381
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 0.22.0
Reporter: Owen O'Malley
Assignee: Harsh J
 Fix For: 0.23.0

 Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff


 Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
 better if it was configurable with a much higher default (1mb or so?).





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-07-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070773#comment-13070773
 ] 

Todd Lipcon commented on HADOOP-1381:
-

- why use a boxed Integer for the sync interval instead of a normal int?
- the javadoc says 100KB, but the constant value seems to be set to 100MB
- do we have to change the other constructors? seems like unrelated cleanup, 
probably best to leave them alone since they're deprecated anyway
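
The first two points can be made concrete with a short sketch. The constants and helper below are hypothetical and are not taken from the patch. They only show how far apart 100 KB and 100 MB are, and the extra null handling a boxed Integer needs compared with a plain int.

```java
// Illustrative constants and helper; not taken from the patch under review.
final class IntervalUnits {
    static final int KB = 1024;
    static final int HUNDRED_KB = 100 * KB;       // 102,400 bytes
    static final int HUNDRED_MB = 100 * KB * KB;  // 104,857,600 bytes

    // A boxed Integer can be null, so every read needs a fallback;
    // a primitive int field avoids this case entirely.
    static int unboxOrDefault(Integer boxed, int fallback) {
        return boxed == null ? fallback : boxed;
    }

    public static void main(String[] args) {
        System.out.println("100 KB = " + HUNDRED_KB + " bytes");
        System.out.println("100 MB = " + HUNDRED_MB + " bytes");
        System.out.println("ratio  = " + (HUNDRED_MB / HUNDRED_KB));
    }
}
```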





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-07-16 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066556#comment-13066556
 ] 

Harsh J commented on HADOOP-1381:
-

(Avro datafiles have the ability of interval configuration as well, if you look 
at AVRO-719 and related issues)





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-07-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066560#comment-13066560
 ] 

Hadoop QA commented on HADOOP-1381:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12486743/HADOOP-1381.r1.diff
  against trunk revision 1147317.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.io.TestSequenceFileSync

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/735//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/735//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/735//console

This message is automatically generated.





[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2011-07-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066576#comment-13066576
 ] 

Hadoop QA commented on HADOOP-1381:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12486748/HADOOP-1381.r2.diff
  against trunk revision 1147317.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/738//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/738//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/738//console

This message is automatically generated.
