[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-16 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285025#comment-15285025
 ] 

Chris Nauroth commented on HADOOP-13028:


HADOOP-13158 has a small follow-up bug fix.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-012.patch, HADOOP-13028-013.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> HADOOP-13028-branch-2-012.patch, HADOOP-13028-branch-2-013.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281909#comment-15281909
 ] 

Hudson commented on HADOOP-13028:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #9753 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9753/])
HADOOP-13028 add low level counter metrics for S3A; use in read (stevel: rev 
27c4e90efce04e1b1302f668b5eb22412e00d033)
* 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/S3AScaleTestBase.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AOutputStream.java
* 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3AInputStreamPerformance.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
* hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
* hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AnonymousAWSCredentialsProvider.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/BasicAWSCredentialsProvider.java
* hadoop-tools/hadoop-aws/src/test/resources/log4j.properties
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricStringBuilder.java
* hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3/FileSystemStore.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
* 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFastOutputStream.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3/S3Credentials.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableCounterLong.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInstrumentation.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileStatus.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java


> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-012.patch, HADOOP-13028-013.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> HADOOP-13028-branch-2-012.patch, HADOOP-13028-branch-2-013.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281704#comment-15281704
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

bq. This is something to bring up on the dev list, as it is something we 
essentially missed. Colin Patrick McCabe: would you care for the honour?

Sure.  I started a thread on common-dev.

bq. Steve has \[added the stability comment\] in patch v011.

Great.

Here is my +1 as well.  Thanks again, guys.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-012.patch, HADOOP-13028-013.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> HADOOP-13028-branch-2-012.patch, HADOOP-13028-branch-2-013.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281632#comment-15281632
 ] 

Steve Loughran commented on HADOOP-13028:
-

Patches working happily against branch-2 and trunk. Given patrick gave the +1 
before this final checkstyle and resync, I consider this patch voted in.

However, I'll give everyone a couple of hours to veto me from pushing it out. 

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-012.patch, HADOOP-13028-013.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> HADOOP-13028-branch-2-012.patch, HADOOP-13028-branch-2-013.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281595#comment-15281595
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 27s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 54s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s 
{color} | {color:red} root: The patch generated 41 new + 42 unchanged - 51 
fixed = 83 total (was 93) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 35s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 45s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281578#comment-15281578
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 9m 2s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 45s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
6s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 33s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 14s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
27s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
18s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
16s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 10s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 22s 
{color} | {color:red} root: The patch generated 40 new + 46 unchanged - 55 
fixed = 86 total (was 101) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 49 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 36s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 11s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 46s 
{color} | {color:green} hadoop-common in the 

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281521#comment-15281521
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 50s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 44s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 26s 
{color} | {color:red} root: The patch generated 41 new + 43 unchanged - 51 
fixed = 84 total (was 94) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 38s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 48s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95.

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281459#comment-15281459
 ] 

Steve Loughran commented on HADOOP-13028:
-

Actually this has thrown up something more fundamental: we do need an explicit 
compatibility guideline & style rule about toString, something like


# toString: no guarantees. They are for logging and diagostics for people, not 
for parsing by machines. They may also nest toString values of contained 
objects, from hadoop and other libraries, neither of which contain any 
guarantees either.
# code which generates command line output for machine parsing MUST NOT use the 
toString() value to generate output which is then expected to be stable. 
# A specific method, such as {{toStringStable()}} must be used to generate the 
strings in such a situation, ideally with a regression test.

This is something to bring up on the dev list, as it is something we 
essentially missed. [~cmccabe]: would you care for the honour?

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> HADOOP-13028-branch-2-012.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281421#comment-15281421
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 32s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 30s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
5s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 15s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 16s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
26s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
15s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
17s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 16s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 22s 
{color} | {color:red} root: The patch generated 41 new + 45 unchanged - 56 
fixed = 86 total (was 101) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 50 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 52s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 13s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 6s 
{color} | {color:green} hadoop-common in the patch p

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-12 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281362#comment-15281362
 ] 

Steve Loughran commented on HADOOP-13028:
-

# this output is not indented for machine parsing: it's for the logs. If we 
cannot add toString() information which provides informative details in logs, 
well, I may as well never override a toString() method with meaningful 
diagnostics information ever again.
# These are the stats of that specific stream
# I don't want the spark code to do reflection, as I don't want that code to be 
looking at the stats programmatically *at all*. This is for me and anyone else 
playing with options to see how to make code working with S3A better.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281212#comment-15281212
 ] 

Chris Nauroth commented on HADOOP-13028:


[~cmccabe], thank you for your review.

bq. Can we add a comment to toString stating that this output is not stable API 
and should not be parsed?

Steve has done this in patch v011.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281102#comment-15281102
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

That's a good point, [~cnauroth].  I guess as long as people don't start 
treating this output as a stable API, it's reasonable to have debugging 
information there.  Can we add a comment to toString stating that this output 
is not stable API and should not be parsed?  +1 once that is done.

Thanks for working on this, [~steve_l]... it's going to be very helpful for 
running Hadoop on s3.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280983#comment-15280983
 ] 

Chris Nauroth commented on HADOOP-13028:


[~ste...@apache.org], thank you for patch v011.  That addressed my feedback.  
There is a new JavaDoc warning on {{S3AInputStream#close}}.  I'd be +1 after a 
clean-up of that and providing a patch that applies to trunk.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280865#comment-15280865
 ] 

Chris Nauroth commented on HADOOP-13028:


I'm in favor of including the stream statistics in {{S3AInputStream#toString}}. 
 This is an extension of the stream state already provided.  I would like us to 
have the ability to evolve {{toString}} output for improved diagnostics like 
this.

Typical Java best practices advise using {{toString}} output as a debugging 
aid, not as a stable format suitable for UI display or object serialization.  
HDFS-9732 is an example of a patch where I have advised against using 
{{toString}} as a serialization format and recommended migrating to a different 
method that can provide a stability guarantee.  In the future, I will strongly 
consider -1'ing patches that introduce these kinds of dependencies on 
{{toString}} output.

While reflection-based approaches are viable, especially with some helpful 
libraries, I've never heard of those projects' contributors saying that they 
like writing their code that way.  Instead, I tend to hear that it makes their 
code more awkward or introduces potential performance risks for the extra 
indirection.

Another consideration is integration with logging.  SLF4J makes it easy to pass 
along template arguments, and then SLF4J will lazily call {{toString}} based on 
the configured logging level.  If the output is hidden behind a different 
method, or even requires reflection to access it, then applications will have 
to go back to coding their own conditional checks on the log level to avoid 
potentially costly method calls.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280782#comment-15280782
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

In the past I've written code for Spark that used reflection to make use of 
APIs that may or may not be present in Hadoop.  HBase often does this as well, 
so that it can use multiple versions of Hadoop.  It seems like this wouldn't be 
a lot of code.  Is that feasible in this case?

I just find the argument that we should overload an existing unrelated API to 
output statistics very off-putting.  It's like saying we should override 
hashCode to output the number of times the user called {{seek()}} on the 
stream.  I also find it concerning that this would be something unique to s3a 
and not present in the toString methods of any other filesystem (including the 
other s3 ones).  It feels like a gross hack.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280079#comment-15280079
 ] 

Steve Loughran commented on HADOOP-13028:
-

For some more detail, here's a spark-cloud module (WiP) test run against 2.7.1; 
duration measured in tests, stream info printed as test goes alone. There's no 
meaningful string value.
{code}
= TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek 
and ReadFully' =

2016-05-11 13:54:44,462 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of stat = 189,933,000 ns
2016-05-11 13:54:44,652 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of open = 189,144,000 ns
2016-05-11 13:54:44,652 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:45,099 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of read() [pos = 0] = 446,564,000 ns
2016-05-11 13:54:45,100 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:45,101 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:46,052 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of seek(256) [pos = 1] = 950,677,000 ns
2016-05-11 13:54:46,053 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,053 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:46,054 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of seek(256) [pos = 256] = 22,000 ns
2016-05-11 13:54:46,054 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,055 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:47,010 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of seek(EOF-2) [pos = 256] = 954,645,000 ns
2016-05-11 13:54:47,010 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,011 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:47,012 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of read() [pos = 21203389] = 397,000 ns
2016-05-11 13:54:47,012 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,013 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:49,213 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(1, byte[1]) [pos = 21203390] = 2,199,571,000 ns
2016-05-11 13:54:49,213 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:49,214 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:52,487 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(1, byte[256]) [pos = 21203390] = 3,272,746,000 ns
2016-05-11 13:54:52,487 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:52,488 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:55,092 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(260, byte[256]) [pos = 21203390] = 2,604,062,000 ns
2016-05-11 13:54:55,092 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:55,093 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:56,825 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(1024, byte[256]) [pos = 21203390] = 1,731,421,000 ns
2016-05-11 13:54:56,825 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:56,825 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:54:58,486 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(1536, byte[256]) [pos = 21203390] = 1,660,882,000 ns
2016-05-11 13:54:58,487 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:58,487 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:55:00,635 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(8192, byte[1024]) [pos = 21203390] = 2,147,589,000 ns
2016-05-11 13:55:00,635 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:00,636 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:55:02,333 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Duration of readFully(9728, byte[1024]) [pos = 21203390] = 1,697,169,000 ns
2016-05-11 13:55:02,334 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) -   
org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:02,334 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
2016-05-11 13:55:02,334 INFO  s3.S3aIOSuite (Logging.scala:logInfo(54)) - 
Du

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279905#comment-15279905
 ] 

Steve Loughran commented on HADOOP-13028:
-

Because one place I'm using this to look at the logs and see how to tune the 
performance is in spark code which doesn't have access to those internals and 
is built against Hadoop 2.6.x anyway. It lets me have code which can be run 
with -Dhadoop.version=2.7.1 and -Dhadoop.version=2.8.0-SNAPSHOT, I can not only 
measure the duration in the spark code itself, I can see the logged info and 
see what's been happening —where things can be improved futher.

We cannot do this if the way to log this data is via a class which is package 
private and in Hadoop 2.8+ only. As requested, I've scoped that statistics 
class so that the only way to get at it is to inject code into the 
org.apache.hadoop.fs.s3a package. Do you really, really, want me to do that in 
spark code? And use introspection to get at a class it can't compile against.

Please, give me the string: it'll be better for all of us. As and when your 
colleagues sit down to look at Parquet performance on S3, they'll appreciate it.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278800#comment-15278800
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

bq. Patrick: regarding fs.s3a.readahead.range versus calling it 
fs.s3a.readahead.default, I think "default" could be a bit confusing too. How 
about I make it clear that the if setReadahead() is set, then it supercedes any 
previous value?

Sure.

bq. I absolutely need that printing in there, otherwise the value of this patch 
is significantly reduced. If you want me to add a line like "WARNING: UNSTABLE" 
or something to that string value, I'm happy to do so. Or the output is 
published in a way that is deliberately hard to parse by machine but which we 
humans can read. But without that information, we can't so easily tell which

Perhaps I'm missing something, but why not just do this in 
{{S3AInstrumentation#InputStreamStatistics#toString}}?  I don't see why this is 
"absolutely needed" in {{S3AInputStream#toString}}.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278473#comment-15278473
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 59s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
42s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 8s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
25s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 5s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 22s 
{color} | {color:red} root: The patch generated 40 new + 46 unchanged - 55 
fixed = 86 total (was 101) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 50 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
29s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 1 new + 0 unchanged - 8 fixed = 1 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 29s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 0s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_101. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 13s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278242#comment-15278242
 ] 

Steve Loughran commented on HADOOP-13028:
-

Patch 011

- address Chris's last comments and some of Patrick's.
- The test {{TestS3ABlocksize}} is passing.
- made clear in uses of {{S3AInputStream.closed}} what the concurrency 
expectations are.
- I'm still publishing the stats in the {{toString()}} values, I've just 
highlit in the javadocs that they are unstable. and even added as an @ 
attribute.
- emphasised in {{core-default.xml}} and {{aws/index.md}} that the readahead 
range field is overwritten by any {{setReadahead()}} value.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278168#comment-15278168
 ] 

Steve Loughran commented on HADOOP-13028:
-

Patrick: regarding {{fs.s3a.readahead.range}} versus calling it 
{{fs.s3a.readahead.default}}, I think "default" could be a bit confusing too. 
How about I make it clear that the if {{setReadahead()}} is set, then it 
supercedes any previous value?



> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278142#comment-15278142
 ] 

Steve Loughran commented on HADOOP-13028:
-

# I'll comment in the close().
# we should add a compatibility statement to string values "no guarantees at 
all". There's one for token printing, HDFS-9732, where we've explicitly added a 
stable string value, {{toStringStable()}} so that a CLI command gets the same 
output as before —but that was for the specific case "output of a command line 
tool". Maybe we should standardise that method with an interface and a 
guarantee "this method doesn't change, provided libraries and the JDK doesn't 
change its output underneath"
# as it stands, it's useful today as I've been looking at the printed logs in 
test runs downstream; no attempt to parse in software. Where it's invaluable 
here is: that downstream code doesn't need to be built exclusively against 
Hadoop 2.8+, or get access to an API we've agreed to hide. For example: 
SPARK-7481.

I absolutely need that printing in there, otherwise the value of this patch is 
significantly reduced. If you want me to add a line like "WARNING: UNSTABLE" or 
something to that string value, I'm happy to do so. Or the output is published 
in a way that is deliberately hard to parse by machine but which we humans can 
read. But without that information, we can't so easily tell which

If you do insist on that string being pulled, then I'm going to convert the 
statistics to being a globally accessible object instead, albeit tagged as 
@Unstable and LimitedPrivate("Testing"). 

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276778#comment-15276778
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

bq. I think this is OK. The whole close method is synchronized, so we won't 
have two threads concurrently doing the actual close. Almost all other accesses 
of closed are within synchronized methods too. It's marked volatile to help 
with one unsynchronized access from readFully, calling into checkNotClosed. 
That's only a read, not an update, so volatile is sufficient.

Thanks for the explanation.  I missed the interaction between {{synchronized}} 
and the assignment.  Suggest adding a comment to the assignment in {{close()}} 
explaining why this is atomic, or simply using AtomicBoolean to future-proof 
this against later code changes.

bq. I'd like to keep \[the toString changes\]. It's very convenient for 
logging. TestS3AInputStreamPerformance uses it for both logging output and 
detailed assertion messages. It's poor practice to rely on a Java object's 
toString output as a stable, parseable format. This is something that I'd like 
to see clarified in our compatibility documentation.

The problem is, this is not consistent with how {{toString}} operates in other 
FS streams.  We also don't have anything in our compatibility documentation 
stating that the output of {{toString}} is not a stable, parseable format.  
We've had many, many JIRAs to "make toString act like some previous behavior" 
for various Hadoop classes.  I think we need to accept that currently the 
stream's {{toString}} method is viewed as a public, stable API whether we like 
it or not.

How about just adding this information to the {{toString}} method of the stream 
statistics object?  That makes more sense anyway.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274739#comment-15274739
 ] 

Chris Nauroth commented on HADOOP-13028:


bq. {{S3AInputStream#closed}}: it seems like this should be an 
{{AtomicBoolean}}. Otherwise two threads could both enter this code block, 
right?

I think this is OK.  The whole {{close}} method is {{synchronized}}, so we 
won't have two threads concurrently doing the actual close.  Almost all other 
accesses of {{closed}} are within {{synchronized}} methods too.  It's marked 
{{volatile}} to help with one unsynchronized access from {{readFully}}, calling 
into {{checkNotClosed}}.  That's only a read, not an update, so {{volatile}} is 
sufficient.

I don't object to using {{AtomicBoolean}}, but I don't think it's necessary.

bq. Is it really necessary to put statistics information into the toString 
methods of the streams?

I'd like to keep this.  It's very convenient for logging.  
{{TestS3AInputStreamPerformance}} uses it for both logging output and detailed 
assertion messages.  It's poor practice to rely on a Java object's {{toString}} 
output as a stable, parseable format.  This is something that I'd like to see 
clarified in our compatibility documentation.

I don't have a strong opinion on the naming questions.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274641#comment-15274641
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

{code}
926 
927   fs.s3a.readahead.range
928   65536
929   Bytes to read ahead during a seek() before closing and
930   re-opening the S3 HTTP connection.
931 
{code}
Hmm, should this be {{fs.s3a.readahead.default}}?  It seems like this is the 
default if the user doesn't call {{FSDataInputStream#setReadahead}},

{{S3AInputStream#closed}}: it seems like this should be an {{AtomicBoolean}}.  
Otherwise two threads could both enter this code block, right?
{code}
362 if (!closed) {
363   closed = true;
364   super.close();
365   closeStream("close() operation", this.contentLength);
366   streamStatistics.close();
367 }
{code}

{code}
  public S3AInstrumentation.InputStreamStatistics getStreamStatistics() {
{code}
Maybe should be called {{getS3StreamStatistics}}, reflecting the fact that this 
API is s3-specific?

Is it really necessary to put statistics information into the {{toString}} 
methods of the streams?  It seems like this could lead to compatibility woes, 
and we have the API described above to provide this information anyway.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274562#comment-15274562
 ] 

Chris Nauroth commented on HADOOP-13028:


Patch v010 addresses all of my prior feedback.

It's fine to keep {{S3AInstrumentation#gauge}}.

I agree that "must be private and have accessor methods" is pedantic.  I think 
we can control this with the Checkstyle 
[HiddenField|http://checkstyle.sourceforge.net/config_coding.html#HiddenField] 
rule settings.  Let's not bother for this patch though.  You've already done a 
lot to clean up Checkstyle warnings.

I agree with updating the JIRA title and closing down the other ones that were 
folded into this patch.

I'm now seeing a failure in {{TestS3ABlocksize#testBlockSize}}:
{code}
testBlockSize(org.apache.hadoop.fs.s3a.TestS3ABlocksize)  Time elapsed: 3.17 
sec  <<< FAILURE!
java.lang.AssertionError: Double default block size in stat(): 
S3AFileStatus{path=s3a://cnauroth-test-aws-s3a/test/testBlockSize/file; 
isDirectory=false; length=1024; replication=1; blocksize=33554432; 
modification_time=1462559933000; access_time=0; owner=; group=; 
permission=rw-rw-rw-; isSymlink=false} expected:<67108864> but was:<33554432>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.fs.s3a.TestS3ABlocksize.testBlockSize(TestS3ABlocksize.java:65)
{code}

It looks like the test has an expectation that you can update block size in the 
{{Configuration}} of an already-initialized {{S3AFileSystem}}, and it will 
start using the new value.  After this patch, the block size is read from 
{{Configuration}} once and cached.  I think the expectation of the test is a 
little dubious, but that's the behavior that shipped in 2.7.0.  It's probably 
safest to remove this part of the patch for now.

I just have one more request.  Please remove the unused import of 
{{Configuration}} in {{S3AInstrumentation}}.

I'll be +1 after that.  We'll also need a separate trunk patch.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274197#comment-15274197
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 44s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 59s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
10s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 47s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 27s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
14s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
15s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
30s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 46s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 7s 
{color} | {color:red} root: The patch generated 21 new + 42 unchanged - 51 
fixed = 63 total (was 93) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 50 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 
generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 44s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 33s 
{color} | {color:green} hadoop-common in the

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274014#comment-15274014
 ] 

Steve Loughran commented on HADOOP-13028:
-

patch 010, addressing chris's previous comments:



1. visibility: done.Added interface scope & stability attributes to all the S3a 
classes. Everything is marked as {{@Private}}; stability either {{@Evolving}} 
for subclasses of standard classes; {{@Unstable}} for all the metrics, and 
{{Stable}} for those classes which are used in the AWS toolkit. 

The {{org.apache.hadoop.fs.s3a.Constants}} is tagged as {{Stable, Evolving}} as 
it is intended for public use and must only add new attributes, not remove or 
change existing key names. 

2. {{initMultipartUploads}} error ignored increment: done. I also renamed the 
operation {{errorIgnored()}} to give it more of a verb-style use.
3. Rename {{catch}} handlers. These are all 404 handling, as far as I can see, 
and really part of the legit workflow. What 
I have done is made sure those handlers log at debug.
4. done in both places.
5. done. Also looked at the others, added it everywhere, removed the 
{{LOG.isDebug}} wrappers for all the low-cost statements.
6. done
7. done. Also moved the blocksize get into a field; print that out too, fail 
init if the block is ever zero or less
8. OK: removed
9. done: setReadahead(null) -> default.
10. that came from the Azure stuff, but as its not used here (no looking at 
rolling window stuff) I've cut it.
11. If I could find a useful gauge, I would use it, so can we keep it in there 
until needed? Or cull it and re-implement as required?
12. done

In {{copyFromLocalFile}} I do the  {{statistics.incrementWriteOps(1)}} ahead of 
calling the transfer. That way, if the call failed, the counter still goes up.

Note that in {{copyFile}} the counter is actually going up twice, as the 
progress listener incs it too. I think we should pull the {{incrementWriteOps}} 
call up ahead of {{transfers.copy}} and remove the one from the progress 
listener.

maybe also in {{createEmptyObject}}


Checkstyle is whining about the inner stats fields not having accessors; I'm 
ignoring that on a point of principle: it's being over fussy.
To keep it happy I went and fixed up line width across much of 
{{S3AFilesystem}}. That'll keep the number down.

Added validation of minimum values to all the various arguments 
(buffer/partition sizes, timeouts, etc), so that
negative values -> {{IllegalArgumentException}}. Really we should add that 
option to {{Configuration}}, so that people don't
forget to add those checks in their own code (and how much of Hadoop doesn't 
...). Any (shipped) option where a too-low value
is fixed to be a minimum is left alone for compatibility. Some of the options 
were checked downstream (e.g some of the socket options);
now the exception raised includes the configuration key at fault.

Fixed all javadoc errors. JDK8 fails on any error, so I've touched some files 
{{S3AFastOutputStream}}, {{S3Credentials}} 
{{org.apache.hadoop.fs.s3.FileSystemStore}} because of the javadoc problem.

Note that the title of this JIRA is getting less accurate, more now: instrument 
S3a, implement readahead, validate input params and wrap up other issues. Those 
other JIRAs contained could be closed as fixed for explicitness.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> HADOOP-13028-branch-2-010.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273974#comment-15273974
 ] 

Steve Loughran commented on HADOOP-13028:
-

sorry, missed all those. working on them and the various patch complaints

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-05 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273296#comment-15273296
 ] 

Chris Nauroth commented on HADOOP-13028:


[~ste...@apache.org], the changes in patch v009 look good to me.  I think this 
is close to being complete.

There was an earlier round of feedback from me that has not yet been addressed. 
 This was small nitpicky stuff, nothing as tricky as the actual seek logic.  
Here is a direct link to that comment.

https://issues.apache.org/jira/browse/HADOOP-13028?focusedCommentId=15267400&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15267400

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272880#comment-15272880
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
37s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 36s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 25s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
5s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
0s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 4s 
{color} | {color:red} root: The patch generated 16 new + 49 unchanged - 40 
fixed = 65 total (was 89) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 50 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
31s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-aws in the patch failed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 14s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 30s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 29s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 13s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.7.0_95

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272509#comment-15272509
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} 
| {color:red} HADOOP-13028 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802441/HADOOP-13028-009.patch
 |
| JIRA Issue | HADOOP-13028 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9291/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-05 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272259#comment-15272259
 ] 

Steve Loughran commented on HADOOP-13028:
-

OK, so we can just get away with {{Long#MAX_VALUE}} as the len? That's a nice 
trick to consider in future *as it would avoid us having to know the length of 
the blob up front, which could potentially eliminate the up-front check until 
that first read (which is the lazy-open idea I've mentioned elsewhere)

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-04 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271057#comment-15271057
 ] 

Chris Nauroth commented on HADOOP-13028:


bq. Really, we should be asking for the whole thing, shouldn't we?

That's exactly what I was thinking.  If we might later decide to keep reading 
forward, possibly to any arbitrary point, then there should be no need for a 
complex calculation of the endpoint.

bq. I think the http content-range call does require you to specify a limit, so 
file-len is always required, but that can be enough

It does seem to be required.  The master branch of the AWS SDK has a new 
single-arg {{setRange}} method that just accepts the beginning point.  This 
isn't available in our current dependency version.  I see that the 
implementation just maps this to {{Long#MAX_VALUE}} as the endpoint.

https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/GetObjectRequest.java#L426-L428


> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-04 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270986#comment-15270986
 ] 

Steve Loughran commented on HADOOP-13028:
-

1. fixed

3. counting backwards seeks I'm just ignoring that value for now. left it in in 
case we ever wanted to start tracking these things. One interesting question 
about all seek + read stats, is really histograms of requests would be the best 
metric; not just the aggregates.

2. let me review that code. In fact, maybe I should factor it out for some 
independent checks. Really, we should be asking for the whole thing, shouldn't 
we? Because even irrespective of the amount you want in the current read() 
call, you don't want to have to re-open just because you didn't know the 
initial amount, do you? 

I think the http content-range call does require you to specify a limit, so 
file-len is always required, but that can be enough



> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267729#comment-15267729
 ] 

Chris Nauroth commented on HADOOP-13028:


[~ste...@apache.org], I've spent more time reading the seek code changes, and 
I'm pretty confident that they're correct overall, but I have a few more 
comments.

# {{S3AInputStream#closeStream}} has the following log message.  The text of 
the message indicates that it's logging {{contentLength}}, but really it's 
logging {{length}}.  I imagine {{length}} is really the more interesting thing 
here, and the message text should be changed?
{code}
  LOG.debug("Stream {} {}: {}; streamPos={}, nextReadPos={}," +
  " contentLength={}",
  uri, (shouldAbort ? "aborted":"closed"), reason, pos, nextReadPos,
  length);
{code}
# Actually, that makes me realize I am unclear about a change made in 
HADOOP-12444.  {{S3AInputStream#reopen}} has a stream length calculation that 
gets passed into the range request.
{code}
requestedStreamLen = (length < 0) ? this.contentLength :
Math.max(this.contentLength, (CLOSE_THRESHOLD + (targetPos + length)));
...
GetObjectRequest request = new GetObjectRequest(bucket, key)
.withRange(targetPos, requestedStreamLen);
{code}
Please tell me if I'm misunderstanding something, but I believe this 
calculation always results in an upper bound on the range that effectively 
means "get the whole thing."  That {{Math.max}} call guarantees that the value 
is always at least {{contentLength}}, which is the whole file length.  Is this 
a bug in the HADOOP-12444 patch?
# {{InputStreamStatistics#seekBackwards}} accepts {{offset}} as an argument but 
doesn't use it.  Is there supposed to be another counter for back-skipped 
bytes?  At the call site within {{S3AInputStream#seekInStream}}, the value it 
passes would be negative, so we'd need to be careful of that.


> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267400#comment-15267400
 ] 

Chris Nauroth commented on HADOOP-13028:


Hello [~ste...@apache.org].  I'm still digging into the changes in the seek 
code and the tests, but I'd like to share the feedback I have so far for patch 
v008.

# Let's put visibility annotations on {{MetricsRecordBuilder}}.  
Public/Evolving for agreement with the {{MetricsRecordBuilder}} base class?
# Should {{initMultipartUploads}} increment the ignored error counter?
# {{rename}} has several {{catch}} blocks that don't propagate an exception.  
Should these increment the ignored error counter?
# In the following exception message, I think we need an extra space before the 
"to".  There are 2 different call sites that produce this message, so 2 spots 
to fix.
{code}
  throw new InterruptedIOException("Interrupted copying " + src
  + "to "  + dst + ", cancelling");
{code}
# Can you please include {{src}} and {{dst}} in this log message from 
{{rename}}?
{code}
  LOG.debug("rename: src or dst are empty");
{code}
# Can you please include {{key}} in this log message from {{delete}}?
{code}
LOG.debug("Deleting fake empty directory");
{code}
# Should {{S3AFileSystem#toString}} also include {{maxKeys}}, {{cannedACL}} and 
{{readAhead}}?
# In {{S3AInputStream#reopen}}, is the following log message redundant, 
considering the call to {{closeStream}} will do its own logging?
{code}
  LOG.debug("Closing the previous stream");
  closeStream("reopen(" + reason + ")", requestedStreamLen);
{code}
# {{S3AInputStream#setReadahead}} doesn't exactly match the specification 
defined in {{CanSetReadahead}}.  The interface says that {{null}} means to use 
the default, but the implementation here rejects {{null}}.  This could be 
problematic for more complex use cases, such as someone wanting to 
programmatically control the amount of readahead.  If they called 
{{setReadahead}} with a custom value, then I think ideally we should allow them 
to call it with {{null}} later, and restore back to the default from 
configuration.  (I admit this is an edge case, but a {{DFSInputStream}} does 
allow this behavior.)
# {{S3AInstrumentation}} receives a {{Configuration}} in its constructor but 
doesn't use it.  Can it be removed?
# {{S3AInstrumentation#gauge}} appears to be unused.
# {{InputStreamStatistics#toString}} does not include {{readFullyOperations}}.

It looks like there are some CheckStyle and JavaDoc things to follow up on from 
that last pre-commit run.  The test failure is unrelated.


> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265911#comment-15265911
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 26s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 4m 44s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
48s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 12s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 5s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
21s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
17s {color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} branch-2 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 2s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 2s 
{color} | {color:red} root: The patch generated 15 new + 49 unchanged - 40 
fixed = 64 total (was 89) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 50 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
31s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-aws in the patch failed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 14s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 
generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 46s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 28s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265815#comment-15265815
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HADOOP-13028 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12801655/HADOOP-13028-008.patch
 |
| JIRA Issue | HADOOP-13028 |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/9243/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265742#comment-15265742
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 0s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 36s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 7s 
{color} | {color:red} root: The patch generated 32 new + 53 unchanged - 4 fixed 
= 85 total (was 57) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 46s 
{color} | {color:red} hadoop-tools/hadoop-aws generated 15 new + 0 unchanged - 
0 fixed = 15 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-aws in the patch failed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 15s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 
generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 51s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_92. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 24s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-aws in the 

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-05-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265701#comment-15265701
 ] 

Steve Loughran commented on HADOOP-13028:
-

+there's another contained patch, HADOOP-13058 , about enabling multipart 
upload failing against R/O buckets. This was blocking test runs against a read 
only AWS dataset, fixed by catching and downgrading the fault to a warn. You 
can't upload to a read only bucket anyway, after all

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265416#comment-15265416
 ] 

Steve Loughran commented on HADOOP-13028:
-

Chris, I'm going to submit a patch with the logging enhanced.

It's not going to cover robustness of removeKeys as that turns out be 
complicated enough to merit its own JIRA (HADOOP-12844), it's own patch, review 
*and real test*.

in particular, as a way to create the problem is clearly "delete one of the 
keys", we'll need the deletion code to not only skip over the failure of 
individual files, —but to not treat the absence of a file as an error. That 
means a test to create the failure condition on both single and multidelete, 
see what comes back and then add the code to handle it (if there isn't enough 
information in the exception error code, that means probing for each key's 
existence).

Then the method needs to handle the situation "other failures", which could be 
handled by rethrowing the exception (or one of the exceptions, for 
single-file-delete failures, or otherwise reporting the failures to the caller 
(e.g: return a list of nondeleted files). And that caller code needs to decide 
what to do, which could vary between {{rename()}} and {{delete()}}

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265383#comment-15265383
 ] 

Steve Loughran commented on HADOOP-13028:
-

regarding the contained patches, the IOE handling patch HADOOP-12844, is a 
direct precursor, I just optimised the implementation by moving the exiting 
handler for socket exceptions to after the EOF handler, and expanded the check 
to all IOEs. You can look at the patch there and think "would that work?" We 
don't have any test checking this failure path (who fancies writing some fault 
injection mocking?), so a review matters there.

The forward seek buffering code is very different; this is the code to 
consider. it does a lot of thinking about how far to seek

# if the forward length is in the {{available()}} range, that is already 
received, *always read forwards*. That's irrespective of requested range.
# otherwise, min of (bytes-remaining, buffer size)
# with counters of times of forward/backward seeks, and how many bytes were 
skipped during forward seeks
# there are tests

So: review this code directly.

I'll look at the logging and remove keys code. There's already an open JIRA on 
failures of deletes after a rename, which I was hoping to have addressed 
elsewhere.


> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-29 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263762#comment-15263762
 ] 

Steve Loughran commented on HADOOP-13028:
-

I'm only looking at input, initially

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-28 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263297#comment-15263297
 ] 

Chris Nauroth commented on HADOOP-13028:


[~ste...@apache.org], also, any instrumentation planned on the output stream 
side, or do you want to keep scope focused on the input stream side here?

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-28 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263292#comment-15263292
 ] 

Chris Nauroth commented on HADOOP-13028:


Hello [~ste...@apache.org].  This looks very useful overall.

I'm a bit confused, because it seems different iterations of the patch have 
folded in fixes from other JIRAs.  Can you please clarify for reviewers if we 
should be reviewing other patches first?

Since the patch is touching some {{LOG.debug}} statements, would it be helpful 
to include {{src}} and {{dst}} in those log message?

{{S3AFileSystem#removeKeys}} appears to have some subtle bugs.  This is not 
entirely related to your patch.  The multi-delete might fail with some objects 
successfully deleted but others remaining.  However, the stats only increment 
if the whole multi-delete succeeded.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html#deleteObjects(com.amazonaws.services.s3.model.DeleteObjectsRequest)

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/MultiObjectDeleteException.html

Similarly, if multi-delete is disabled, then any individual delete in the loop 
might throw an exception and skip the stats increments.

I'll wait for clarification on the question on pre-requisite patches before I 
take this for a test run myself.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-27 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261354#comment-15261354
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

Thanks, [~steve_l].  I withdraw my -1, provided we don't add any new public 
APIs in this patch.

I'm out tomorrow and Friday but hopefully I'll have a chance to review it next 
week (if someone doesn't review it first).

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260128#comment-15260128
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 6s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 2s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 36s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 5s 
{color} | {color:red} root: The patch generated 34 new + 55 unchanged - 2 fixed 
= 89 total (was 57) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s 
{color} | {color:red} hadoop-tools/hadoop-aws generated 15 new + 0 unchanged - 
0 fixed = 15 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 12s 
{color} | {color:red} hadoop-aws in the patch failed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 15s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 
generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 39s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_92. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 12s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_92. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 50s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green}

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-27 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260052#comment-15260052
 ] 

Steve Loughran commented on HADOOP-13028:
-

that was {{fs.list("/")}} before JIRA decides that ( + / + ) was a tick mark

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-27 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259943#comment-15259943
 ] 

Steve Loughran commented on HADOOP-13028:
-

Noted. 

In the meantime, if you have colleagues who work on S3 related deployments, 
perhaps they could have a look and review the patch overall, that is, 
irrespective of API: does it work

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258713#comment-15258713
 ] 

Colin Patrick McCabe commented on HADOOP-13028:
---

It looks really good, [~steve_l].

Just to avoid misunderstandings, I'll drop a -1 here until we finish discussing 
what the interface should be... 
I look forward to giving this a review as soon as we figure that out.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258651#comment-15258651
 ] 

Steve Loughran commented on HADOOP-13028:
-

Note the actual fix is to force a {{fs.list(/)}} after creating the FS; the 
failure of the first operation is viewed as transient and downgraded to a 
warning.

This is an interesting problem: we really only want to swallow transient 
network failures, not other issues. But any other issues will surface the next 
time someone tries to use the instance; by moving the checks out of the 
{{initialize()}} method we stop the FS setup itself breaking.



> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258265#comment-15258265
 ] 

Steve Loughran commented on HADOOP-13028:
-

This patch (the HADOOP-13059 robust init) bit breaks the proxy tests in 
{{TestS3AConfiguration}}; looks like those tests failed because the bucket 
check triggered a failure on proxy invocation

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257830#comment-15257830
 ] 

Steve Loughran commented on HADOOP-13028:
-

checkstyle are all about code going ++ on volatiles. The InputStream API says 
"single thread only", and while we know HBase ignores that, we also know that 
HBase cannot ever work on S3, and also that these are just little counters, 
nothing critical...if someone does break the threading rules, well, the 
counters will end up inaccurate.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257096#comment-15257096
 ] 

Hadoop QA commented on HADOOP-13028:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
1s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_92 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 54s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 3s 
{color} | {color:red} root: patch generated 44 new + 53 unchanged - 2 fixed = 
97 total (was 55) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s 
{color} | {color:red} hadoop-tools/hadoop-aws generated 12 new + 0 unchanged - 
0 fixed = 12 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-aws in the patch failed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 3m 50s 
{color} | {color:red} hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 
generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 56s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_92. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s 
{color} | {color:green} hadoop-aws in the patch passed with JDK v1.8.0_92. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 10s {color} 
| {color:red} hadoop-common in the patch failed with

[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests

2016-04-25 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256884#comment-15256884
 ] 

Steve Loughran commented on HADOOP-13028:
-

Patch -004

includes HADOOP-13047; forward read range is configurable. Default is 64K; 
we'll need tests to work out what is good in different deployments (in-EC2; 
remote). For my tests, 640K looks right.

There's a lot of tests for the seek behaviour; seeks with no read to verify 
lazy seek, then some seek+read sequences to see how things slow down on 
different readahead values.

BTW, the readahead can be set on an open stream via 
{{CanSetReadahead.setReadahead(Long)}}; this could enable some code to 
dynamically tune things if it really knew what it was doing. I'm using it in 
the tests to simplify their setup.

> add low level counter metrics for S3A; use in read performance tests
> 
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, metrics
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)