[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285025#comment-15285025 ]

Chris Nauroth commented on HADOOP-13028:

HADOOP-13158 has a small follow-up bug fix.

> add low level counter metrics for S3A; use in read performance tests
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, metrics
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch,
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch,
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch,
> HADOOP-13028-012.patch, HADOOP-13028-013.patch,
> HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch,
> HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch,
> HADOOP-13028-branch-2-012.patch, HADOOP-13028-branch-2-013.patch,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
> against S3 (and other object stores), opening connections can be expensive,
> closing connections may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of
> open/close/failure+reconnect operations, timers of how long things take. This
> can be used downstream to measure efficiency of the code (how often
> connections are being made), connection reliability, etc.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281909#comment-15281909 ]

Hudson commented on HADOOP-13028:

SUCCESS: Integrated in Hadoop-trunk-Commit #9753 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9753/])
HADOOP-13028 add low level counter metrics for S3A; use in read (stevel: rev 27c4e90efce04e1b1302f668b5eb22412e00d033)
* hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/S3AScaleTestBase.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AOutputStream.java
* hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3AInputStreamPerformance.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
* hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
* hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AnonymousAWSCredentialsProvider.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/BasicAWSCredentialsProvider.java
* hadoop-tools/hadoop-aws/src/test/resources/log4j.properties
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricStringBuilder.java
* hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3/FileSystemStore.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
* hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/TestS3ADeleteManyFiles.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFastOutputStream.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3/S3Credentials.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableCounterLong.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInstrumentation.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileStatus.java
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281704#comment-15281704 ]

Colin Patrick McCabe commented on HADOOP-13028:

bq. This is something to bring up on the dev list, as it is something we essentially missed. Colin Patrick McCabe: would you care for the honour?

Sure. I started a thread on common-dev.

bq. Steve has [added the stability comment] in patch v011.

Great. Here is my +1 as well. Thanks again, guys.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281632#comment-15281632 ]

Steve Loughran commented on HADOOP-13028:

The patches are working happily against branch-2 and trunk. Given that Colin Patrick McCabe gave the +1 before this final checkstyle and resync, I consider this patch voted in. However, I'll give everyone a couple of hours to veto me from pushing it out.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281595#comment-15281595 ]

Hadoop QA commented on HADOOP-13028:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 1m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 17s | trunk passed |
| +1 | compile | 6m 9s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 6m 49s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 24s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | mvneclipse | 0m 28s | trunk passed |
| +1 | findbugs | 2m 7s | trunk passed |
| +1 | javadoc | 1m 7s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 23s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 5m 49s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 49s | the patch passed |
| +1 | compile | 6m 54s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 54s | the patch passed |
| -1 | checkstyle | 1m 28s | root: The patch generated 41 new + 42 unchanged - 51 fixed = 83 total (was 93) |
| +1 | mvnsite | 1m 15s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 35s | the patch passed |
| +1 | javadoc | 0m 54s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | javadoc | 0m 13s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) |
| +1 | javadoc | 1m 24s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 7m 35s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 7m 45s | hadoop-common in the patch passed with JDK v1.7.0_95.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281578#comment-15281578 ]

Hadoop QA commented on HADOOP-13028:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 9m 2s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 3m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 6s | branch-2 passed |
| +1 | compile | 5m 33s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 14s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 27s | branch-2 passed |
| +1 | mvnsite | 1m 18s | branch-2 passed |
| +1 | mvneclipse | 1m 18s | branch-2 passed |
| +1 | findbugs | 2m 16s | branch-2 passed |
| +1 | javadoc | 1m 6s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 18s | branch-2 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 5m 19s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 19s | the patch passed |
| +1 | compile | 6m 10s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 6m 10s | the patch passed |
| -1 | checkstyle | 1m 22s | root: The patch generated 40 new + 46 unchanged - 55 fixed = 86 total (was 101) |
| +1 | mvnsite | 1m 14s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 49 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 33s | the patch passed |
| +1 | javadoc | 0m 53s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | javadoc | 0m 11s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) |
| +1 | javadoc | 1m 17s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 7m 36s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 11s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 7m 46s | hadoop-common in the
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281521#comment-15281521 ]

Hadoop QA commented on HADOOP-13028:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 38s | trunk passed |
| +1 | compile | 5m 50s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 6m 46s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 27s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | mvneclipse | 0m 27s | trunk passed |
| +1 | findbugs | 2m 7s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 20s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 5m 52s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 52s | the patch passed |
| +1 | compile | 6m 44s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 44s | the patch passed |
| -1 | checkstyle | 1m 26s | root: The patch generated 41 new + 43 unchanged - 51 fixed = 84 total (was 94) |
| +1 | mvnsite | 1m 15s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 30s | the patch passed |
| +1 | javadoc | 0m 55s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | javadoc | 0m 12s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) |
| +1 | javadoc | 1m 19s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 7m 38s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 7m 48s | hadoop-common in the patch passed with JDK v1.7.0_95.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281459#comment-15281459 ]

Steve Loughran commented on HADOOP-13028:

Actually this has thrown up something more fundamental: we do need an explicit compatibility guideline & style rule about toString, something like

# toString: no guarantees. These values are for logging and diagnostics for people, not for parsing by machines. They may also nest toString values of contained objects, from Hadoop and other libraries, neither of which makes any guarantees either.
# Code which generates command line output for machine parsing MUST NOT use the toString() value to generate output which is then expected to be stable.
# A specific method, such as {{toStringStable()}}, must be used to generate the strings in such a situation, ideally with a regression test.

This is something to bring up on the dev list, as it is something we essentially missed. [~cmccabe]: would you care for the honour?
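The three rules proposed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name {{StreamStats}}, its fields, and the exact output formats are hypothetical and are not taken from the patch. Only the method name {{toStringStable()}} comes from the comment itself.

```java
import java.util.Locale;

/** Hypothetical statistics holder used only for illustration; not a Hadoop class. */
final class StreamStats {
    private final long opened;
    private final long closed;
    private final long bytesRead;

    StreamStats(long opened, long closed, long bytesRead) {
        this.opened = opened;
        this.closed = closed;
        this.bytesRead = bytesRead;
    }

    /** Rule 1: for logs and people only; the format carries no guarantees. */
    @Override
    public String toString() {
        return "StreamStats{opened=" + opened
            + ", closed=" + closed
            + ", bytesRead=" + bytesRead + "}";
    }

    /**
     * Rule 3: a separate method whose format is treated as a contract.
     * Locale.ROOT keeps the digits independent of the default locale.
     */
    String toStringStable() {
        return String.format(Locale.ROOT,
            "opened=%d;closed=%d;bytesRead=%d", opened, closed, bytesRead);
    }

    public static void main(String[] args) {
        StreamStats stats = new StreamStats(3, 3, 4096);
        System.out.println(stats);                  // diagnostics, may change
        System.out.println(stats.toStringStable()); // safe for tools to parse
    }
}
```

A regression test would then pin only the {{toStringStable()}} format, leaving {{toString()}} free to gain or lose fields as diagnostics needs change.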
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281421#comment-15281421 ]

Hadoop QA commented on HADOOP-13028:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 32s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 3m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 5s | branch-2 passed |
| +1 | compile | 5m 15s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 16s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 26s | branch-2 passed |
| +1 | mvnsite | 1m 17s | branch-2 passed |
| +1 | mvneclipse | 1m 15s | branch-2 passed |
| +1 | findbugs | 2m 17s | branch-2 passed |
| +1 | javadoc | 1m 6s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 20s | branch-2 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 5m 16s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 16s | the patch passed |
| +1 | compile | 6m 16s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 6m 16s | the patch passed |
| -1 | checkstyle | 1m 22s | root: The patch generated 41 new + 45 unchanged - 56 fixed = 86 total (was 101) |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 31s | the patch passed |
| +1 | javadoc | 0m 56s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | javadoc | 0m 14s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) |
| +1 | javadoc | 1m 22s | the patch passed with JDK v1.7.0_101 |
| -1 | unit | 19m 52s | hadoop-common in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 0m 13s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 8m 6s | hadoop-common in the patch p
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281362#comment-15281362 ] Steve Loughran commented on HADOOP-13028: - # This output is not intended for machine parsing: it's for the logs. If we cannot add toString() information which provides informative details in logs, well, I may as well never override a toString() method with meaningful diagnostics information ever again. # These are the stats of that specific stream. # I don't want the Spark code to do reflection, as I don't want that code to be looking at the stats programmatically *at all*. This is for me and anyone else playing with options to see how to make code working with S3A better. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281212#comment-15281212 ] Chris Nauroth commented on HADOOP-13028: [~cmccabe], thank you for your review. bq. Can we add a comment to toString stating that this output is not stable API and should not be parsed? Steve has done this in patch v011. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281102#comment-15281102 ] Colin Patrick McCabe commented on HADOOP-13028: --- That's a good point, [~cnauroth]. I guess as long as people don't start treating this output as a stable API, it's reasonable to have debugging information there. Can we add a comment to toString stating that this output is not stable API and should not be parsed? +1 once that is done. Thanks for working on this, [~steve_l]... it's going to be very helpful for running Hadoop on s3. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. 
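The request above, per-stream counters surfaced through toString() with an explicit "do not parse" note, could look roughly like the following minimal sketch. The class and method names (StreamStatsSketch, streamOpened, and so on) are hypothetical and chosen for illustration; they are not the names used in the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-stream counters; illustrative only, not the S3A classes. */
final class StreamStatsSketch {
  private final AtomicLong openOperations = new AtomicLong();
  private final AtomicLong closeOperations = new AtomicLong();
  private final AtomicLong seekOperations = new AtomicLong();

  void streamOpened() { openOperations.incrementAndGet(); }
  void streamClosed() { closeOperations.incrementAndGet(); }
  void seeked() { seekOperations.incrementAndGet(); }

  long getOpenOperations() { return openOperations.get(); }
  long getCloseOperations() { return closeOperations.get(); }
  long getSeekOperations() { return seekOperations.get(); }

  /**
   * Diagnostics only. The format of this string is unstable: it may
   * change between releases and must not be parsed by software.
   */
  @Override
  public String toString() {
    return "StreamStatsSketch{open=" + openOperations.get()
        + ", close=" + closeOperations.get()
        + ", seek=" + seekOperations.get() + "}";
  }
}

/** Small demo that prints the diagnostics string after a few operations. */
final class StatsToStringDemo {
  public static void main(String[] args) {
    StreamStatsSketch stats = new StreamStatsSketch();
    stats.streamOpened();
    stats.seeked();
    stats.streamClosed();
    System.out.println(stats);
  }
}
```

The javadoc on toString() is where the stability caveat lives, so the warning travels with the method rather than with any particular caller.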
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280983#comment-15280983 ] Chris Nauroth commented on HADOOP-13028: [~ste...@apache.org], thank you for patch v011. That addressed my feedback. There is a new JavaDoc warning on {{S3AInputStream#close}}. I'd be +1 after a clean-up of that and providing a patch that applies to trunk. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280865#comment-15280865 ] Chris Nauroth commented on HADOOP-13028: I'm in favor of including the stream statistics in {{S3AInputStream#toString}}. This is an extension of the stream state already provided. I would like us to have the ability to evolve {{toString}} output for improved diagnostics like this. Typical Java best practices advise using {{toString}} output as a debugging aid, not as a stable format suitable for UI display or object serialization. HDFS-9732 is an example of a patch where I have advised against using {{toString}} as a serialization format and recommended migrating to a different method that can provide a stability guarantee. In the future, I will strongly consider -1'ing patches that introduce these kinds of dependencies on {{toString}} output. While reflection-based approaches are viable, especially with some helpful libraries, I've never heard of those projects' contributors saying that they like writing their code that way. Instead, I tend to hear that it makes their code more awkward or introduces potential performance risks for the extra indirection. Another consideration is integration with logging. SLF4J makes it easy to pass along template arguments, and then SLF4J will lazily call {{toString}} based on the configured logging level. If the output is hidden behind a different method, or even requires reflection to access it, then applications will have to go back to coding their own conditional checks on the log level to avoid potentially costly method calls. 
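To illustrate the logging point, here is a small self-contained sketch of deferred toString() evaluation. TinyLogger is a hypothetical stand-in written for this example; it is not SLF4J, it only mimics the idea that a parameterized log call formats its argument solely when the level is enabled.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Tiny stand-in for a parameterized logger; this is NOT SLF4J. */
final class TinyLogger {
  private final boolean debugEnabled;
  private final StringBuilder out = new StringBuilder();

  TinyLogger(boolean debugEnabled) {
    this.debugEnabled = debugEnabled;
  }

  /** Formats the message only when debug is enabled, deferring arg.toString(). */
  void debug(String template, Object arg) {
    if (!debugEnabled) {
      return; // toString() is never invoked on the argument
    }
    out.append(template.replace("{}", String.valueOf(arg))).append('\n');
  }

  String output() {
    return out.toString();
  }
}

/** Object that counts how often its (potentially costly) toString() runs. */
final class CountingStream {
  final AtomicInteger toStringCalls = new AtomicInteger();

  @Override
  public String toString() {
    toStringCalls.incrementAndGet();
    return "CountingStream{stats omitted}";
  }
}

/** Demo: the disabled logger never pays for building the string. */
final class LazyLoggingDemo {
  public static void main(String[] args) {
    CountingStream stream = new CountingStream();
    new TinyLogger(false).debug("stream state: {}", stream);
    System.out.println("toString calls with debug off: " + stream.toStringCalls.get());
  }
}
```

If the diagnostics were exposed through some other method, the caller would have to write the level check itself before invoking it, which is the extra burden described above.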
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280782#comment-15280782 ] Colin Patrick McCabe commented on HADOOP-13028: --- In the past I've written code for Spark that used reflection to make use of APIs that may or may not be present in Hadoop. HBase often does this as well, so that it can use multiple versions of Hadoop. It seems like this wouldn't be a lot of code. Is that feasible in this case? I just find the argument that we should overload an existing unrelated API to output statistics very off-putting. It's like saying we should override hashCode to output the number of times the user called {{seek()}} on the stream. I also find it concerning that this would be something unique to s3a and not present in the toString methods of any other filesystem (including the other s3 ones). It feels like a gross hack. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
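The reflection approach mentioned in this comment can be sketched as follows. This is a generic illustration using only java.lang.reflect; the helper name invokeIfPresent is hypothetical, and the example probes a method on String rather than any Hadoop class.

```java
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical helper for calling an API that may not exist at runtime. */
final class OptionalApi {
  private OptionalApi() {
  }

  /** Invokes a public no-arg method by name if the target's class has it. */
  static Optional<Object> invokeIfPresent(Object target, String methodName) {
    try {
      Method m = target.getClass().getMethod(methodName);
      return Optional.ofNullable(m.invoke(target));
    } catch (NoSuchMethodException e) {
      // Older library version on the classpath: the method is missing.
      return Optional.empty();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("call to " + methodName + " failed", e);
    }
  }
}

/** Demo: one method that exists, one that does not. */
final class ReflectionDemo {
  public static void main(String[] args) {
    System.out.println(OptionalApi.invokeIfPresent("abc", "length"));
    System.out.println(OptionalApi.invokeIfPresent("abc", "getStreamStatistics"));
  }
}
```

The caller compiles against no version-specific type, at the cost of losing compile-time checking and paying for the lookup on each call unless the Method is cached.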
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280079#comment-15280079 ] Steve Loughran commented on HADOOP-13028: - For some more detail, here's a spark-cloud module (WiP) test run against 2.7.1; duration measured in tests, stream info printed as the test goes along. There's no meaningful string value.
{code}
= TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek and ReadFully' =
2016-05-11 13:54:44,462 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of stat = 189,933,000 ns
2016-05-11 13:54:44,652 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of open = 189,144,000 ns
2016-05-11 13:54:44,652 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:45,099 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 0] = 446,564,000 ns
2016-05-11 13:54:45,100 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:45,101 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:46,052 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 1] = 950,677,000 ns
2016-05-11 13:54:46,053 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,053 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:46,054 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 256] = 22,000 ns
2016-05-11 13:54:46,054 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:46,055 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:47,010 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(EOF-2) [pos = 256] = 954,645,000 ns
2016-05-11 13:54:47,010 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,011 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:47,012 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 21203389] = 397,000 ns
2016-05-11 13:54:47,012 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:47,013 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:49,213 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1, byte[1]) [pos = 21203390] = 2,199,571,000 ns
2016-05-11 13:54:49,213 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:49,214 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:52,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1, byte[256]) [pos = 21203390] = 3,272,746,000 ns
2016-05-11 13:54:52,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:52,488 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:55,092 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(260, byte[256]) [pos = 21203390] = 2,604,062,000 ns
2016-05-11 13:54:55,092 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:55,093 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1024, byte[256]) [pos = 21203390] = 1,731,421,000 ns
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:56,825 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:54:58,486 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1536, byte[256]) [pos = 21203390] = 1,660,882,000 ns
2016-05-11 13:54:58,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:54:58,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:00,635 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(8192, byte[1024]) [pos = 21203390] = 2,147,589,000 ns
2016-05-11 13:55:00,635 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:00,636 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:02,333 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(9728, byte[1024]) [pos = 21203390] = 1,697,169,000 ns
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - org.apache.hadoop.fs.s3a.S3AInputStream@3d85fdbe
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) -
2016-05-11 13:55:02,334 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Du
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279905#comment-15279905 ] Steve Loughran commented on HADOOP-13028: - Because one place where I'm using this, looking at the logs to see how to tune performance, is in Spark code which doesn't have access to those internals and is built against Hadoop 2.6.x anyway. It lets me have code which can be run with -Dhadoop.version=2.7.1 and -Dhadoop.version=2.8.0-SNAPSHOT: I can not only measure the duration in the Spark code itself, I can also read the logged info to see what's been happening and where things can be improved further. We cannot do this if the way to log this data is via a class which is package private and in Hadoop 2.8+ only. As requested, I've scoped that statistics class so that the only way to get at it is to inject code into the org.apache.hadoop.fs.s3a package. Do you really, really, want me to do that in Spark code, and use introspection to get at a class it can't compile against? Please, give me the string: it'll be better for all of us. As and when your colleagues sit down to look at Parquet performance on S3, they'll appreciate it. 
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278800#comment-15278800 ] Colin Patrick McCabe commented on HADOOP-13028: --- bq. Patrick: regarding fs.s3a.readahead.range versus calling it fs.s3a.readahead.default, I think "default" could be a bit confusing too. How about I make it clear that the if setReadahead() is set, then it supercedes any previous value? Sure. bq. I absolutely need that printing in there, otherwise the value of this patch is significantly reduced. If you want me to add a line like "WARNING: UNSTABLE" or something to that string value, I'm happy to do so. Or the output is published in a way that is deliberately hard to parse by machine but which we humans can read. But without that information, we can't so easily tell which Perhaps I'm missing something, but why not just do this in {{S3AInstrumentation#InputStreamStatistics#toString}}? I don't see why this is "absolutely needed" in {{S3AInputStream#toString}}. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278473#comment-15278473 ] Hadoop QA commented on HADOOP-13028:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 59s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 42s | branch-2 passed |
| +1 | compile | 5m 55s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 8s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 25s | branch-2 passed |
| +1 | mvnsite | 1m 13s | branch-2 passed |
| +1 | mvneclipse | 0m 28s | branch-2 passed |
| +1 | findbugs | 2m 3s | branch-2 passed |
| +1 | javadoc | 1m 6s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 19s | branch-2 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 5m 13s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 13s | the patch passed |
| +1 | compile | 6m 5s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 6m 5s | the patch passed |
| -1 | checkstyle | 1m 22s | root: The patch generated 40 new + 46 unchanged - 55 fixed = 86 total (was 101) |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 29s | the patch passed |
| -1 | javadoc | 0m 12s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 1 new + 0 unchanged - 8 fixed = 1 total (was 8) |
| +1 | javadoc | 1m 18s | the patch passed with JDK v1.7.0_101 |
| -1 | unit | 7m 29s | hadoop-common in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 8m 0s | hadoop-common in the patch passed with JDK v1.7.0_101. |
| +1 | unit | 0m 13s | hadoop-aws in the patch passed with JDK v1.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278242#comment-15278242 ] Steve Loughran commented on HADOOP-13028: - Patch 011 - Addresses Chris's last comments and some of Patrick's. - The test {{TestS3ABlocksize}} is passing. - Made clear in uses of {{S3AInputStream.closed}} what the concurrency expectations are. - I'm still publishing the stats in the {{toString()}} values; I've just highlighted in the javadocs that they are unstable, and even added that as an @ attribute. - Emphasised in {{core-default.xml}} and {{aws/index.md}} that the readahead range field is overwritten by any {{setReadahead()}} value. > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, HADOOP-13028-branch-2-011.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. 
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278168#comment-15278168 ] Steve Loughran commented on HADOOP-13028: - Patrick: regarding {{fs.s3a.readahead.range}} versus calling it {{fs.s3a.readahead.default}}, I think "default" could be a bit confusing too. How about I make it clear that if {{setReadahead()}} is called, its value supersedes any previous value? > add low level counter metrics for S3A; use in read performance tests > > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, > HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, > HADOOP-13028-007.patch, HADOOP-13028-008.patch, HADOOP-13028-009.patch, > HADOOP-13028-branch-2-008.patch, HADOOP-13028-branch-2-009.patch, > HADOOP-13028-branch-2-010.patch, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, > org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
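The precedence rule proposed here can be shown with a toy model. ReadaheadSketch is a hypothetical class written for this example and is not the S3A implementation; it only assumes that a stream starts from the configured range and that a later setter call replaces it.

```java
/** Toy model of readahead precedence; not the S3A implementation. */
final class ReadaheadSketch {
  private long readahead;

  /** Starts from the value that would come from fs.s3a.readahead.range. */
  ReadaheadSketch(long configuredRange) {
    this.readahead = checkNonNegative(configuredRange);
  }

  /** A per-stream call supersedes whatever the configuration supplied. */
  void setReadahead(long bytes) {
    this.readahead = checkNonNegative(bytes);
  }

  long getReadahead() {
    return readahead;
  }

  private static long checkNonNegative(long value) {
    if (value < 0) {
      throw new IllegalArgumentException("Negative readahead: " + value);
    }
    return value;
  }
}

/** Demo: the configured value applies until the setter is called. */
final class ReadaheadDemo {
  public static void main(String[] args) {
    ReadaheadSketch stream = new ReadaheadSketch(65536L);
    System.out.println("from configuration: " + stream.getReadahead());
    stream.setReadahead(1048576L);
    System.out.println("after setReadahead: " + stream.getReadahead());
  }
}
```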
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278142#comment-15278142 ]

Steve Loughran commented on HADOOP-13028:

# I'll comment in the close().
# We should add a compatibility statement for string values: "no guarantees at all". There's one for token printing, HDFS-9732, where we've explicitly added a stable string value, {{toStringStable()}}, so that a CLI command gets the same output as before, but that was for the specific case "output of a command line tool". Maybe we should standardise that method with an interface and a guarantee: "this method doesn't change, provided libraries and the JDK don't change their output underneath".
# As it stands, it's useful today, as I've been looking at the printed logs in test runs downstream; no attempt to parse in software. Where it's invaluable here is that downstream code doesn't need to be built exclusively against Hadoop 2.8+, or get access to an API we've agreed to hide. For example: SPARK-7481.

I absolutely need that printing in there, otherwise the value of this patch is significantly reduced. If you want me to add a line like "WARNING: UNSTABLE" or something to that string value, I'm happy to do so. Or the output is published in a way that is deliberately hard to parse by machine but which we humans can read. But without that information, we can't so easily tell which

If you do insist on that string being pulled, then I'm going to convert the statistics to being a globally accessible object instead, albeit tagged as @Unstable and LimitedPrivate("Testing").
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276778#comment-15276778 ]

Colin Patrick McCabe commented on HADOOP-13028:

bq. I think this is OK. The whole close method is synchronized, so we won't have two threads concurrently doing the actual close. Almost all other accesses of closed are within synchronized methods too. It's marked volatile to help with one unsynchronized access from readFully, calling into checkNotClosed. That's only a read, not an update, so volatile is sufficient.

Thanks for the explanation. I missed the interaction between {{synchronized}} and the assignment. Suggest adding a comment to the assignment in {{close()}} explaining why this is atomic, or simply using {{AtomicBoolean}} to future-proof this against later code changes.

bq. I'd like to keep \[the toString changes\]. It's very convenient for logging. TestS3AInputStreamPerformance uses it for both logging output and detailed assertion messages. It's poor practice to rely on a Java object's toString output as a stable, parseable format. This is something that I'd like to see clarified in our compatibility documentation.

The problem is, this is not consistent with how {{toString}} operates in other FS streams. We also don't have anything in our compatibility documentation stating that the output of {{toString}} is not a stable, parseable format. We've had many, many JIRAs to "make toString act like some previous behavior" for various Hadoop classes. I think we need to accept that currently the stream's {{toString}} method is viewed as a public, stable API whether we like it or not.

How about just adding this information to the {{toString}} method of the stream statistics object? That makes more sense anyway.
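The suggestion that closes the comment above, keeping the stream's own {{toString}} fixed and putting the counters in the {{toString}} of a separate statistics object, might look something like this. It is a sketch under stated assumptions: both class names, the counter names and the output format are invented for illustration and are not the ones in the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical statistics holder; not the S3AInstrumentation class from the patch. */
final class StreamStatisticsSketch {
  private final AtomicLong openOperations = new AtomicLong();
  private final AtomicLong closeOperations = new AtomicLong();
  private final AtomicLong bytesRead = new AtomicLong();

  void streamOpened() { openOperations.incrementAndGet(); }
  void streamClosed() { closeOperations.incrementAndGet(); }
  void bytesRead(long count) { bytesRead.addAndGet(count); }

  /** For logs and assertion messages only; the format is deliberately not a contract. */
  @Override
  public String toString() {
    return "StreamStatistics{UNSTABLE: opens=" + openOperations.get()
        + ", closes=" + closeOperations.get()
        + ", bytesRead=" + bytesRead.get() + "}";
  }
}

/** Hypothetical stream whose own toString never includes the counters. */
final class InputStreamSketch {
  private final String uri;
  private final StreamStatisticsSketch statistics = new StreamStatisticsSketch();

  InputStreamSketch(String uri) { this.uri = uri; }

  /** Callers that want the numbers ask for the statistics object explicitly. */
  StreamStatisticsSketch getStatistics() { return statistics; }

  @Override
  public String toString() { return "InputStreamSketch{" + uri + "}"; }
}
```

A test could then log {{stream.getStatistics()}} for diagnostics while anything that already depends on the stream's string form sees no change.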
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274739#comment-15274739 ]

Chris Nauroth commented on HADOOP-13028:

bq. {{S3AInputStream#closed}}: it seems like this should be an {{AtomicBoolean}}. Otherwise two threads could both enter this code block, right?

I think this is OK. The whole {{close}} method is {{synchronized}}, so we won't have two threads concurrently doing the actual close. Almost all other accesses of {{closed}} are within {{synchronized}} methods too. It's marked {{volatile}} to help with one unsynchronized access from {{readFully}}, calling into {{checkNotClosed}}. That's only a read, not an update, so {{volatile}} is sufficient. I don't object to using {{AtomicBoolean}}, but I don't think it's necessary.

bq. Is it really necessary to put statistics information into the toString methods of the streams?

I'd like to keep this. It's very convenient for logging. {{TestS3AInputStreamPerformance}} uses it for both logging output and detailed assertion messages. It's poor practice to rely on a Java object's {{toString}} output as a stable, parseable format. This is something that I'd like to see clarified in our compatibility documentation.

I don't have a strong opinion on the naming questions.
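The locking argument in the comment above can be reduced to a small sketch. {{SynchronizedCloseSketch}} is a hypothetical class written for this note, not the real {{S3AInputStream}}; the counter exists only so the "runs once" property can be observed.

```java
import java.io.IOException;

/** Hypothetical reduction of the pattern: synchronized writer, volatile flag, lock-free reader. */
final class SynchronizedCloseSketch {
  /** volatile so the unsynchronized read in checkNotClosed() sees the latest value. */
  private volatile boolean closed;
  /** Guarded by this; counts how often the body of close() actually ran. */
  private int closeCount;

  /** The check and the assignment happen under one lock, so only one caller runs the body. */
  synchronized void close() {
    if (!closed) {
      closed = true;
      closeCount++;
    }
  }

  /** Read-only access without the lock; safe because the flag is volatile. */
  void checkNotClosed() throws IOException {
    if (closed) {
      throw new IOException("Stream is closed");
    }
  }

  synchronized int closeCount() {
    return closeCount;
  }
}
```

The pattern stays correct only while every write to the flag happens inside a {{synchronized}} method, which is the future-proofing concern raised elsewhere in the thread.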
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274641#comment-15274641 ]

Colin Patrick McCabe commented on HADOOP-13028:

{code}
<property>
  <name>fs.s3a.readahead.range</name>
  <value>65536</value>
  <description>Bytes to read ahead during a seek() before closing and
  re-opening the S3 HTTP connection.</description>
</property>
{code}
Hmm, should this be {{fs.s3a.readahead.default}}? It seems like this is the default if the user doesn't call {{FSDataInputStream#setReadahead}}.

{{S3AInputStream#closed}}: it seems like this should be an {{AtomicBoolean}}. Otherwise two threads could both enter this code block, right?
{code}
if (!closed) {
  closed = true;
  super.close();
  closeStream("close() operation", this.contentLength);
  streamStatistics.close();
}
{code}

{code}
public S3AInstrumentation.InputStreamStatistics getStreamStatistics() {
{code}
Maybe should be called {{getS3StreamStatistics}}, reflecting the fact that this API is s3-specific?

Is it really necessary to put statistics information into the {{toString}} methods of the streams? It seems like this could lead to compatibility woes, and we have the API described above to provide this information anyway.
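For comparison, the {{AtomicBoolean}} alternative raised in the comment above could be sketched as below. {{AtomicCloseSketch}} is a hypothetical class, and the release counter stands in for the real cleanup work; nothing here is taken from the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical close-once guard that does not depend on the method being synchronized. */
final class AtomicCloseSketch {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  /** Stand-in for the real cleanup; counts how many callers got past the guard. */
  private final AtomicInteger releases = new AtomicInteger();

  void close() {
    // compareAndSet flips false to true for exactly one caller, even under contention.
    if (closed.compareAndSet(false, true)) {
      releases.incrementAndGet();
    }
  }

  boolean isClosed() {
    return closed.get();
  }

  int releases() {
    return releases.get();
  }
}
```

The guard is self-contained, so it keeps working if someone later removes {{synchronized}} from {{close()}}; the cost is one more object per stream.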
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274562#comment-15274562 ]

Chris Nauroth commented on HADOOP-13028:

Patch v010 addresses all of my prior feedback.

It's fine to keep {{S3AInstrumentation#gauge}}.

I agree that "must be private and have accessor methods" is pedantic. I think we can control this with the Checkstyle [HiddenField|http://checkstyle.sourceforge.net/config_coding.html#HiddenField] rule settings. Let's not bother for this patch though. You've already done a lot to clean up Checkstyle warnings.

I agree with updating the JIRA title and closing down the other ones that were folded into this patch.

I'm now seeing a failure in {{TestS3ABlocksize#testBlockSize}}:
{code}
testBlockSize(org.apache.hadoop.fs.s3a.TestS3ABlocksize)  Time elapsed: 3.17 sec  <<< FAILURE!
java.lang.AssertionError: Double default block size in stat(): S3AFileStatus{path=s3a://cnauroth-test-aws-s3a/test/testBlockSize/file; isDirectory=false; length=1024; replication=1; blocksize=33554432; modification_time=1462559933000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} expected:<67108864> but was:<33554432>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:555)
	at org.apache.hadoop.fs.s3a.TestS3ABlocksize.testBlockSize(TestS3ABlocksize.java:65)
{code}
It looks like the test has an expectation that you can update block size in the {{Configuration}} of an already-initialized {{S3AFileSystem}}, and it will start using the new value. After this patch, the block size is read from {{Configuration}} once and cached. I think the expectation of the test is a little dubious, but that's the behavior that shipped in 2.7.0. It's probably safest to remove this part of the patch for now.

I just have one more request. Please remove the unused import of {{Configuration}} in {{S3AInstrumentation}}.

I'll be +1 after that. We'll also need a separate trunk patch.
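The behavioural difference behind the {{testBlockSize}} failure described above can be illustrated with a small sketch. It uses a plain {{Map}} as a stand-in for Hadoop's {{Configuration}}; the class name is hypothetical, and the key name and 32 MB default are assumptions chosen to match the numbers in the failure message.

```java
import java.util.Map;

/** Hypothetical filesystem stub contrasting a cached setting with a live lookup. */
final class BlockSizeSketch {
  /** Assumed key name and default for this sketch. */
  static final String BLOCK_SIZE_KEY = "fs.s3a.block.size";
  static final long DEFAULT_BLOCK_SIZE = 32L * 1024 * 1024;

  private final Map<String, String> conf;
  /** Read once at construction; later changes to conf are not seen. */
  private final long cachedBlockSize;

  BlockSizeSketch(Map<String, String> conf) {
    this.conf = conf;
    this.cachedBlockSize = lookup();
  }

  private long lookup() {
    String raw = conf.get(BLOCK_SIZE_KEY);
    return raw == null ? DEFAULT_BLOCK_SIZE : Long.parseLong(raw);
  }

  /** Behaviour after the patch: the value captured at initialization. */
  long cachedBlockSize() {
    return cachedBlockSize;
  }

  /** Behaviour the test expects: re-read the configuration on every call. */
  long liveBlockSize() {
    return lookup();
  }
}
```

If the configuration is doubled after construction, the cached accessor still reports the original 33554432 while the live one reports 67108864, which is the mismatch in the assertion.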
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274197#comment-15274197 ]

Hadoop QA commented on HADOOP-13028:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 11m 44s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 3m 59s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 10s | branch-2 passed |
| +1 | compile | 5m 47s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 27s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 14s | branch-2 passed |
| +1 | mvnsite | 1m 25s | branch-2 passed |
| +1 | mvneclipse | 1m 15s | branch-2 passed |
| +1 | findbugs | 2m 30s | branch-2 passed |
| +1 | javadoc | 1m 10s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 24s | branch-2 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 6m 12s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 12s | the patch passed |
| +1 | compile | 6m 46s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 6m 46s | the patch passed |
| -1 | checkstyle | 1m 7s | root: The patch generated 21 new + 42 unchanged - 51 fixed = 63 total (was 93) |
| +1 | mvnsite | 1m 22s | the patch passed |
| +1 | mvneclipse | 0m 28s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 52s | the patch passed |
| +1 | javadoc | 1m 0s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | javadoc | 0m 12s | hadoop-tools_hadoop-aws-jdk1.8.0_91 with JDK v1.8.0_91 generated 0 new + 0 unchanged - 8 fixed = 0 total (was 8) |
| +1 | javadoc | 1m 21s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 8m 44s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 8m 33s | hadoop-common in the
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274014#comment-15274014 ]

Steve Loughran commented on HADOOP-13028:

Patch 010, addressing Chris's previous comments:

1. visibility: done. Added interface scope & stability attributes to all the S3A classes. Everything is marked as {{@Private}}; stability is either {{@Evolving}} for subclasses of standard classes, {{@Unstable}} for all the metrics, or {{@Stable}} for those classes which are used in the AWS toolkit. The {{org.apache.hadoop.fs.s3a.Constants}} class is tagged as {{Stable, Evolving}} as it is intended for public use and must only add new attributes, not remove or change existing key names.
2. {{initMultipartUploads}} error ignored increment: done. I also renamed the operation {{errorIgnored()}} to give it more of a verb-style use.
3. Rename {{catch}} handlers. These are all 404 handling, as far as I can see, and really part of the legit workflow. What I have done is made sure those handlers log at debug.
4. done in both places.
5. done. Also looked at the others, added it everywhere, removed the {{LOG.isDebug}} wrappers for all the low-cost statements.
6. done
7. done. Also moved the blocksize get into a field; print that out too, fail init if the block is ever zero or less.
8. OK: removed
9. done: setReadahead(null) -> default.
10. that came from the Azure stuff, but as it's not used here (no looking at rolling window stuff) I've cut it.
11. If I could find a useful gauge, I would use it, so can we keep it in there until needed? Or cull it and re-implement as required?
12. done

In {{copyFromLocalFile}} I do the {{statistics.incrementWriteOps(1)}} ahead of calling the transfer. That way, if the call failed, the counter still goes up. Note that in {{copyFile}} the counter is actually going up twice, as the progress listener increments it too. I think we should pull the {{incrementWriteOps}} call up ahead of {{transfers.copy}} and remove the one from the progress listener. Maybe also in {{createEmptyObject}}.

Checkstyle is whining about the inner stats fields not having accessors; I'm ignoring that on a point of principle: it's being over fussy. To keep it happy I went and fixed up line width across much of {{S3AFileSystem}}. That'll keep the number down.

Added validation of minimum values to all the various arguments (buffer/partition sizes, timeouts, etc.), so that negative values -> {{IllegalArgumentException}}. Really we should add that option to {{Configuration}}, so that people don't forget to add those checks in their own code (and how much of Hadoop doesn't ...). Any (shipped) option where a too-low value is fixed to be a minimum is left alone for compatibility. Some of the options were checked downstream (e.g. some of the socket options); now the exception raised includes the configuration key at fault.

Fixed all javadoc errors. JDK8 fails on any error, so I've touched some files ({{S3AFastOutputStream}}, {{S3Credentials}}, {{org.apache.hadoop.fs.s3.FileSystemStore}}) because of the javadoc problem.

Note that the title of this JIRA is getting less accurate; it is more now: instrument S3A, implement readahead, validate input params and wrap up other issues. Those other JIRAs contained could be closed as fixed for explicitness.
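The minimum-value validation described in the comment above, with the offending configuration key named in the exception, might be sketched like this. A plain {{Map}} stands in for Hadoop's {{Configuration}}, and the class, method name and message wording are all hypothetical rather than taken from the patch.

```java
import java.util.Map;

/** Hypothetical helper: read a numeric option and reject values below a minimum. */
final class ConfValidationSketch {
  private ConfValidationSketch() {
  }

  /**
   * Returns the value stored under key, or defaultValue if the key is absent.
   * Throws IllegalArgumentException naming the key when the value is below min.
   */
  static long longOption(Map<String, String> conf, String key,
      long defaultValue, long min) {
    String raw = conf.get(key);
    long value = raw == null ? defaultValue : Long.parseLong(raw.trim());
    if (value < min) {
      throw new IllegalArgumentException(String.format(
          "Value of %s: %d is below the minimum value %d", key, value, min));
    }
    return value;
  }
}
```

Failing at initialization with the key in the message points the user at the setting to fix, instead of surfacing a less specific error later from the code that consumes the value.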
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273974#comment-15273974 ]

Steve Loughran commented on HADOOP-13028:

sorry, missed all those. working on them and the various patch complaints
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273296#comment-15273296 ]

Chris Nauroth commented on HADOOP-13028:

[~ste...@apache.org], the changes in patch v009 look good to me. I think this is close to being complete.

There was an earlier round of feedback from me that has not yet been addressed. This was small nitpicky stuff, nothing as tricky as the actual seek logic. Here is a direct link to that comment.

https://issues.apache.org/jira/browse/HADOOP-13028?focusedCommentId=15267400&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15267400
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272880#comment-15272880 ]

Hadoop QA commented on HADOOP-13028:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 1m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 37s | branch-2 passed |
| +1 | compile | 6m 36s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 25s | branch-2 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 5s | branch-2 passed |
| +1 | mvnsite | 1m 15s | branch-2 passed |
| +1 | mvneclipse | 0m 27s | branch-2 passed |
| +1 | findbugs | 2m 4s | branch-2 passed |
| +1 | javadoc | 1m 6s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 19s | branch-2 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 6m 14s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 14s | the patch passed |
| +1 | compile | 6m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 27s | the patch passed |
| -1 | checkstyle | 1m 4s | root: The patch generated 16 new + 49 unchanged - 40 fixed = 65 total (was 89) |
| +1 | mvnsite | 1m 11s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 31s | the patch passed |
| -1 | javadoc | 0m 12s | hadoop-aws in the patch failed with JDK v1.8.0_91. |
| -1 | javadoc | 0m 14s | hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | unit | 8m 30s | hadoop-common in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| -1 | unit | 8m 29s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 0m 13s | hadoop-aws in the patch passed with JDK v1.7.0_95
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272509#comment-15272509 ]
Hadoop QA commented on HADOOP-13028:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | HADOOP-13028 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802441/HADOOP-13028-009.patch |
| JIRA Issue | HADOOP-13028 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/9291/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272259#comment-15272259 ]
Steve Loughran commented on HADOOP-13028:
OK, so we can just get away with {{Long#MAX_VALUE}} as the len? That's a nice trick to consider in future, as it would avoid us having to know the length of the blob up front. That could potentially defer the up-front check until the first read (which is the lazy-open idea I've mentioned elsewhere).
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271057#comment-15271057 ]
Chris Nauroth commented on HADOOP-13028:
bq. Really, we should be asking for the whole thing, shouldn't we?
That's exactly what I was thinking. If we might later decide to keep reading forward, possibly to any arbitrary point, then there should be no need for a complex calculation of the endpoint.
bq. I think the http content-range call does require you to specify a limit, so file-len is always required, but that can be enough
It does seem to be required. The master branch of the AWS SDK has a new single-arg {{setRange}} method that just accepts the beginning point. This isn't available in our current dependency version. I see that the implementation just maps this to {{Long#MAX_VALUE}} as the endpoint.
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/GetObjectRequest.java#L426-L428
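As a rough illustration of the open-ended range idea in this comment, here is a minimal sketch that renders byte ranges as HTTP {{Range}} header values. It deliberately avoids the AWS SDK: {{RangeSketch}}, {{rangeHeader}} and {{openEndedRange}} are hypothetical names, and mapping the open end to {{Long.MAX_VALUE}} simply follows what the comment reports about the SDK rather than anything verified here. HTTP byte-range semantics (RFC 7233) treat a last-byte position at or beyond the end of the representation as "the remainder", which is why an oversized end should be harmless in principle.

```java
/** Hypothetical helper for illustration only; not part of S3A or the AWS SDK. */
final class RangeSketch {
  private RangeSketch() {
  }

  /** Render an inclusive byte range as an HTTP Range header value. */
  static String rangeHeader(long start, long end) {
    if (start < 0 || end < start) {
      throw new IllegalArgumentException(
          "Invalid range: " + start + "-" + end);
    }
    return "bytes=" + start + "-" + end;
  }

  /**
   * "Everything from start onwards": the end is mapped to Long.MAX_VALUE,
   * as the comment above describes, so the object length need not be known.
   */
  static String openEndedRange(long start) {
    return rangeHeader(start, Long.MAX_VALUE);
  }
}
```

With this shape, the caller never needs the object length just to build the request; whether a given store accepts such a range would still have to be tested.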
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270986#comment-15270986 ]
Steve Loughran commented on HADOOP-13028:
1. Fixed.
3. Counting backwards seeks: I'm just ignoring that value for now. I left it in in case we ever want to start tracking these things. One interesting question about all the seek + read stats is whether histograms of requests would be the best metric, not just the aggregates.
2. Let me review that code. In fact, maybe I should factor it out for some independent checks. Really, we should be asking for the whole thing, shouldn't we? Because irrespective of the amount you want in the current read() call, you don't want to have to re-open just because you didn't know the initial amount, do you? I think the http content-range call does require you to specify a limit, so file-len is always required, but that can be enough
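To make the "histograms rather than aggregates" idea concrete, here is a minimal sketch of a power-of-two histogram of seek distances. Everything in it is an assumption for illustration: {{SeekHistogram}} is a hypothetical class, not part of {{S3AInstrumentation}}, and the bucketing scheme is just one simple choice.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch; not part of the S3A instrumentation. */
final class SeekHistogram {
  // Bucket 0 holds distance 0; bucket i (i >= 1) holds magnitudes in
  // [2^(i-1), 2^i). 64 buckets cover every magnitude up to Long.MAX_VALUE.
  private final AtomicLongArray buckets = new AtomicLongArray(64);

  /** Map a signed seek distance to its bucket, using the magnitude only. */
  static int bucketFor(long distance) {
    long magnitude = (distance == Long.MIN_VALUE)
        ? Long.MAX_VALUE            // Math.abs(Long.MIN_VALUE) overflows
        : Math.abs(distance);
    return 64 - Long.numberOfLeadingZeros(magnitude);
  }

  /** Record one seek of the given signed distance in bytes. */
  void record(long distance) {
    buckets.incrementAndGet(bucketFor(distance));
  }

  /** Number of seeks recorded in the given bucket. */
  long count(int bucket) {
    return buckets.get(bucket);
  }
}
```

A histogram like this would show whether seeks cluster at small distances (cheap to skip over) or large ones (likely to need a reopen), which a single aggregate counter cannot.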
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267729#comment-15267729 ]
Chris Nauroth commented on HADOOP-13028:
[~ste...@apache.org], I've spent more time reading the seek code changes, and I'm pretty confident that they're correct overall, but I have a few more comments.
# {{S3AInputStream#closeStream}} has the following log message. The text of the message indicates that it's logging {{contentLength}}, but really it's logging {{length}}. I imagine {{length}} is really the more interesting thing here, and the message text should be changed?
{code}
LOG.debug("Stream {} {}: {}; streamPos={}, nextReadPos={}," +
    " contentLength={}", uri, (shouldAbort ? "aborted":"closed"), reason, pos,
    nextReadPos, length);
{code}
# Actually, that makes me realize I am unclear about a change made in HADOOP-12444. {{S3AInputStream#reopen}} has a stream length calculation that gets passed into the range request.
{code}
requestedStreamLen = (length < 0) ? this.contentLength :
    Math.max(this.contentLength, (CLOSE_THRESHOLD + (targetPos + length)));
...
GetObjectRequest request = new GetObjectRequest(bucket, key)
    .withRange(targetPos, requestedStreamLen);
{code}
Please tell me if I'm misunderstanding something, but I believe this calculation always results in an upper bound on the range that effectively means "get the whole thing." That {{Math.max}} call guarantees that the value is always at least {{contentLength}}, which is the whole file length. Is this a bug in the HADOOP-12444 patch?
# {{InputStreamStatistics#seekBackwards}} accepts {{offset}} as an argument but doesn't use it. Is there supposed to be another counter for back-skipped bytes? At the call site within {{S3AInputStream#seekInStream}}, the value it passes would be negative, so we'd need to be careful of that.
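The {{Math.max}} observation in the second point can be checked with a small standalone sketch. The {{quoted}} method mirrors the snippet cited in the comment; the {{bounded}} variant using {{Math.min}} is only one hypothetical alternative, not what any patch does, and the {{CLOSE_THRESHOLD}} value here is assumed for illustration.

```java
/** Hypothetical sketch of the range-end calculation under discussion. */
final class RangeEndSketch {
  // Assumed value, for illustration only.
  static final long CLOSE_THRESHOLD = 4096;

  private RangeEndSketch() {
  }

  /** The calculation as quoted in the review comment. */
  static long quoted(long contentLength, long targetPos, long length) {
    return (length < 0)
        ? contentLength
        : Math.max(contentLength, CLOSE_THRESHOLD + (targetPos + length));
  }

  /**
   * A bounded variant (an assumption, not the patch): request only what
   * the read needs plus the threshold, capped at the file length.
   */
  static long bounded(long contentLength, long targetPos, long length) {
    return (length < 0)
        ? contentLength
        : Math.min(contentLength, CLOSE_THRESHOLD + (targetPos + length));
  }
}
```

For a 1,000,000-byte object, a 100-byte read at position 0 gives {{quoted}} = 1,000,000 (the whole file), while {{bounded}} gives 4,196. Near the end of a small file, {{quoted}} can even exceed the file length.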
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267400#comment-15267400 ]
Chris Nauroth commented on HADOOP-13028:
Hello [~ste...@apache.org]. I'm still digging into the changes in the seek code and the tests, but I'd like to share the feedback I have so far for patch v008.
# Let's put visibility annotations on {{MetricsRecordBuilder}}. Public/Evolving for agreement with the {{MetricsRecordBuilder}} base class?
# Should {{initMultipartUploads}} increment the ignored error counter?
# {{rename}} has several {{catch}} blocks that don't propagate an exception. Should these increment the ignored error counter?
# In the following exception message, I think we need an extra space before the "to". There are 2 different call sites that produce this message, so 2 spots to fix.
{code}
throw new InterruptedIOException("Interrupted copying " + src + "to "
    + dst + ", cancelling");
{code}
# Can you please include {{src}} and {{dst}} in this log message from {{rename}}?
{code}
LOG.debug("rename: src or dst are empty");
{code}
# Can you please include {{key}} in this log message from {{delete}}?
{code}
LOG.debug("Deleting fake empty directory");
{code}
# Should {{S3AFileSystem#toString}} also include {{maxKeys}}, {{cannedACL}} and {{readAhead}}?
# In {{S3AInputStream#reopen}}, is the following log message redundant, considering the call to {{closeStream}} will do its own logging?
{code}
LOG.debug("Closing the previous stream");
closeStream("reopen(" + reason + ")", requestedStreamLen);
{code}
# {{S3AInputStream#setReadahead}} doesn't exactly match the specification defined in {{CanSetReadahead}}. The interface says that {{null}} means to use the default, but the implementation here rejects {{null}}. This could be problematic for more complex use cases, such as someone wanting to programmatically control the amount of readahead. If they called {{setReadahead}} with a custom value, then I think ideally we should allow them to call it with {{null}} later, and restore back to the default from configuration. (I admit this is an edge case, but a {{DFSInputStream}} does allow this behavior.)
# {{S3AInstrumentation}} receives a {{Configuration}} in its constructor but doesn't use it. Can it be removed?
# {{S3AInstrumentation#gauge}} appears to be unused.
# {{InputStreamStatistics#toString}} does not include {{readFullyOperations}}.
It looks like there are some CheckStyle and JavaDoc things to follow up on from that last pre-commit run. The test failure is unrelated.
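The {{setReadahead}} point can be sketched as follows. This is a minimal stand-in, assuming only the contract stated in the comment ({{null}} means "use the default"); {{ReadaheadSketch}} is a hypothetical class, not {{S3AInputStream}}, and the rejection of negative values is an assumption.

```java
/** Hypothetical stand-in for a stream that accepts null as "use default". */
final class ReadaheadSketch {
  private final long defaultReadahead;
  private long readahead;

  /** @param defaultReadahead the value that would come from configuration */
  ReadaheadSketch(long defaultReadahead) {
    if (defaultReadahead < 0) {
      throw new IllegalArgumentException(
          "Negative default readahead: " + defaultReadahead);
    }
    this.defaultReadahead = defaultReadahead;
    this.readahead = defaultReadahead;
  }

  /** null restores the configured default; negative values are rejected. */
  synchronized void setReadahead(Long value) {
    if (value == null) {
      readahead = defaultReadahead;
    } else if (value < 0) {
      throw new IllegalArgumentException(
          "Negative readahead value: " + value);
    } else {
      readahead = value;
    }
  }

  synchronized long getReadahead() {
    return readahead;
  }
}
```

Keeping the configured default in its own field is what allows a caller to set a custom value and later pass {{null}} to go back.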
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265911#comment-15265911 ]
Hadoop QA commented on HADOOP-13028:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 26s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 4m 44s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 48s | branch-2 passed |
| +1 | compile | 5m 12s | branch-2 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 5s | branch-2 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 8s | branch-2 passed |
| +1 | mvnsite | 1m 16s | branch-2 passed |
| +1 | mvneclipse | 1m 21s | branch-2 passed |
| +1 | findbugs | 2m 17s | branch-2 passed |
| +1 | javadoc | 1m 7s | branch-2 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 17s | branch-2 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 5m 7s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 5m 7s | the patch passed |
| +1 | compile | 6m 2s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 2s | the patch passed |
| -1 | checkstyle | 1m 2s | root: The patch generated 15 new + 49 unchanged - 40 fixed = 64 total (was 89) |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 31s | the patch passed |
| -1 | javadoc | 0m 12s | hadoop-aws in the patch failed with JDK v1.8.0_91. |
| -1 | javadoc | 0m 14s | hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | unit | 7m 46s | hadoop-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_91. |
| -1 | unit | 7m 28s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 0m 14s | hadoop-aws in the patch passed with JDK v1
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265815#comment-15265815 ]
Hadoop QA commented on HADOOP-13028:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | HADOOP-13028 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12801655/HADOOP-13028-008.patch |
| JIRA Issue | HADOOP-13028 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/9243/console |
| Powered by | Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265742#comment-15265742 ]
Hadoop QA commented on HADOOP-13028:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 9s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 2m 0s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 38s | trunk passed |
| +1 | compile | 5m 39s | trunk passed with JDK v1.8.0_92 |
| +1 | compile | 6m 35s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 6s | trunk passed |
| +1 | mvnsite | 1m 18s | trunk passed |
| +1 | mvneclipse | 0m 28s | trunk passed |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 1m 7s | trunk passed with JDK v1.8.0_92 |
| +1 | javadoc | 1m 20s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 5m 36s | the patch passed with JDK v1.8.0_92 |
| +1 | javac | 5m 36s | the patch passed |
| +1 | compile | 6m 36s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 36s | the patch passed |
| -1 | checkstyle | 1m 7s | root: The patch generated 32 new + 53 unchanged - 4 fixed = 85 total (was 57) |
| +1 | mvnsite | 1m 16s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| -1 | findbugs | 0m 46s | hadoop-tools/hadoop-aws generated 15 new + 0 unchanged - 0 fixed = 15 total (was 0) |
| -1 | javadoc | 0m 12s | hadoop-aws in the patch failed with JDK v1.8.0_92. |
| -1 | javadoc | 0m 15s | hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| -1 | unit | 16m 51s | hadoop-common in the patch failed with JDK v1.8.0_92. |
| +1 | unit | 0m 15s | hadoop-aws in the patch passed with JDK v1.8.0_92. |
| -1 | unit | 8m 24s | hadoop-common in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 0m 14s | hadoop-aws in the
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265701#comment-15265701 ]
Steve Loughran commented on HADOOP-13028:
Also: there's another contained patch, HADOOP-13058, which deals with the enabling of multipart uploads failing against read-only buckets. This was blocking test runs against a read-only AWS dataset; it is fixed by catching the fault and downgrading it to a warning. You can't upload to a read-only bucket anyway, after all.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265416#comment-15265416 ]

Steve Loughran commented on HADOOP-13028:

Chris, I'm going to submit a patch with the logging enhanced. It's not going to cover robustness of removeKeys, as that turns out to be complicated enough to merit its own JIRA (HADOOP-12844), its own patch, review *and real test*.

In particular, as one way to create the problem is clearly "delete one of the keys", we'll need the deletion code not only to skip over the failure of individual files, but also to not treat the absence of a file as an error. That means a test to create the failure condition on both single and multi-delete, see what comes back, and then add the code to handle it (if there isn't enough information in the exception error code, that means probing for each key's existence).

Then the method needs to handle the "other failures" situation, which could be done by rethrowing the exception (or one of the exceptions, for single-file-delete failures), or otherwise reporting the failures to the caller (e.g. returning a list of non-deleted files). And that caller code needs to decide what to do, which could vary between {{rename()}} and {{delete()}}.
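The "report the failures to the caller" option discussed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the S3A implementation: the class {{DeleteOutcomeSketch}}, the {{Result}} enum and the {{Outcome}} type are all hypothetical names, and the per-key results are assumed to have been extracted already from whatever the store returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of partial-failure reporting; not the S3A code. */
class DeleteOutcomeSketch {

  /** Per-key result of a delete attempt (illustrative names only). */
  enum Result { DELETED, NOT_FOUND, FAILED }

  /** What the caller gets back instead of one all-or-nothing exception. */
  static final class Outcome {
    final List<String> deleted = new ArrayList<>();
    final List<String> alreadyAbsent = new ArrayList<>();
    final List<String> notDeleted = new ArrayList<>();
  }

  /**
   * Classify per-key results. A key that was already absent is kept apart
   * from real failures, so its absence is not treated as an error.
   */
  static Outcome summarize(Map<String, Result> perKey) {
    Outcome outcome = new Outcome();
    for (Map.Entry<String, Result> entry : perKey.entrySet()) {
      switch (entry.getValue()) {
        case DELETED:
          outcome.deleted.add(entry.getKey());
          break;
        case NOT_FOUND:
          outcome.alreadyAbsent.add(entry.getKey());
          break;
        default:
          outcome.notDeleted.add(entry.getKey());
          break;
      }
    }
    return outcome;
  }

  public static void main(String[] args) {
    Map<String, Result> results = new LinkedHashMap<>();
    results.put("dir/a", Result.DELETED);
    results.put("dir/b", Result.NOT_FOUND);
    results.put("dir/c", Result.FAILED);
    Outcome outcome = summarize(results);
    // Stats would be incremented by the number of keys actually deleted.
    System.out.println("deleted=" + outcome.deleted
        + " absent=" + outcome.alreadyAbsent
        + " notDeleted=" + outcome.notDeleted);
  }
}
```

With a result like this, {{rename()}} and {{delete()}} could each decide separately what a non-empty {{notDeleted}} list means for them.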
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265383#comment-15265383 ]

Steve Loughran commented on HADOOP-13028:

Regarding the contained patches: the IOE handling patch, HADOOP-12844, is a direct precursor. I just optimised the implementation by moving the existing handler for socket exceptions to after the EOF handler, and expanded the check to all IOEs. You can look at the patch there and think "would that work?" We don't have any test checking this failure path (who fancies writing some fault injection mocking?), so a review matters there.

The forward seek buffering code is very different; this is the code to consider. It does a lot of thinking about how far to seek:
# if the forward length is in the {{available()}} range, that is, already received, *always read forwards*. That's irrespective of requested range.
# otherwise, min of (bytes-remaining, buffer size)
# with counters of times of forward/backward seeks, and how many bytes were skipped during forward seeks
# there are tests

So: review this code directly. I'll look at the logging and remove keys code. There's already an open JIRA on failures of deletes after a rename, which I was hoping to have addressed elsewhere.
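One possible reading of the forward-seek rules listed in the comment above is sketched here. It is an illustration only, not the code in the patch: the class name, method name, parameters and counter fields are all hypothetical, and the way the two rules are combined (skip forwards if the distance fits within the already-received bytes, or else within min(bytes remaining, buffer size)) is an assumption about what the list means.

```java
/** Hypothetical sketch of the forward-seek decision; not the patch's code. */
class ForwardSeekSketch {
  // Simple counters, assuming single-threaded use of the stream.
  long forwardSeeks;
  long backwardSeeks;
  long bytesSkippedOnSeek;

  /**
   * Decide whether to reach the target by reading and discarding bytes on
   * the open connection (true) or by closing and reopening it (false).
   *
   * @param pos       current stream position
   * @param target    requested position
   * @param available bytes already received and readable without blocking
   * @param remaining bytes left in the current request
   * @param readahead configured forward-read range (the "buffer size")
   */
  boolean shouldReadForwards(long pos, long target, long available,
      long remaining, long readahead) {
    long diff = target - pos;
    if (diff == 0) {
      return true;                 // nothing to do
    }
    if (diff < 0) {
      backwardSeeks++;
      return false;                // cannot go backwards on the connection
    }
    forwardSeeks++;
    // Rule 1: data that has already arrived is always cheaper to skip.
    boolean withinReceived = diff <= available;
    // Rule 2: otherwise bound the skip by min(bytes remaining, buffer size).
    boolean withinRange = diff <= Math.min(remaining, readahead);
    if (withinReceived || withinRange) {
      bytesSkippedOnSeek += diff;
      return true;
    }
    return false;
  }
}
```

The counters mirror the ones mentioned in the list: how often forward and backward seeks occur, and how many bytes were discarded while skipping forwards.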
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263762#comment-15263762 ]

Steve Loughran commented on HADOOP-13028:

I'm only looking at input, initially
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263297#comment-15263297 ]

Chris Nauroth commented on HADOOP-13028:

[~ste...@apache.org], also, any instrumentation planned on the output stream side, or do you want to keep scope focused on the input stream side here?
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263292#comment-15263292 ]

Chris Nauroth commented on HADOOP-13028:

Hello [~ste...@apache.org]. This looks very useful overall.

I'm a bit confused, because it seems different iterations of the patch have folded in fixes from other JIRAs. Can you please clarify for reviewers whether we should be reviewing other patches first?

Since the patch is touching some {{LOG.debug}} statements, would it be helpful to include {{src}} and {{dst}} in those log messages?

{{S3AFileSystem#removeKeys}} appears to have some subtle bugs. This is not entirely related to your patch. The multi-delete might fail with some objects successfully deleted but others remaining. However, the stats only increment if the whole multi-delete succeeded.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html#deleteObjects(com.amazonaws.services.s3.model.DeleteObjectsRequest)
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/MultiObjectDeleteException.html

Similarly, if multi-delete is disabled, then any individual delete in the loop might throw an exception and skip the stats increments.

I'll wait for clarification on the question on pre-requisite patches before I take this for a test run myself.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261354#comment-15261354 ]

Colin Patrick McCabe commented on HADOOP-13028:

Thanks, [~steve_l]. I withdraw my -1, provided we don't add any new public APIs in this patch. I'm out tomorrow and Friday but hopefully I'll have a chance to review it next week (if someone doesn't review it first).
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260128#comment-15260128 ]

Hadoop QA commented on HADOOP-13028:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 5m 6s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 24s | trunk passed |
| +1 | compile | 6m 2s | trunk passed with JDK v1.8.0_92 |
| +1 | compile | 6m 36s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 10s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | mvneclipse | 2m 40s | trunk passed |
| +1 | findbugs | 2m 10s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed with JDK v1.8.0_92 |
| +1 | javadoc | 1m 17s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 5m 53s | the patch passed with JDK v1.8.0_92 |
| +1 | javac | 5m 53s | the patch passed |
| +1 | compile | 6m 40s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 40s | the patch passed |
| -1 | checkstyle | 1m 5s | root: The patch generated 34 new + 55 unchanged - 2 fixed = 89 total (was 57) |
| +1 | mvnsite | 1m 14s | the patch passed |
| +1 | mvneclipse | 0m 24s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| -1 | findbugs | 0m 44s | hadoop-tools/hadoop-aws generated 15 new + 0 unchanged - 0 fixed = 15 total (was 0) |
| -1 | javadoc | 0m 12s | hadoop-aws in the patch failed with JDK v1.8.0_92. |
| -1 | javadoc | 0m 15s | hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | unit | 7m 39s | hadoop-common in the patch passed with JDK v1.8.0_92. |
| +1 | unit | 0m 12s | hadoop-aws in the patch passed with JDK v1.8.0_92. |
| +1 | unit | 7m 50s | hadoop-common in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 0m 14s |
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260052#comment-15260052 ]

Steve Loughran commented on HADOOP-13028:

That was {{fs.list("/")}}, before JIRA decided that ( + / + ) was a tick mark.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259943#comment-15259943 ]

Steve Loughran commented on HADOOP-13028:

Noted. In the meantime, if you have colleagues who work on S3-related deployments, perhaps they could have a look and review the patch overall, that is, irrespective of API: does it work?
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258713#comment-15258713 ]

Colin Patrick McCabe commented on HADOOP-13028:

It looks really good, [~steve_l]. Just to avoid misunderstandings, I'll drop a -1 here until we finish discussing what the interface should be... I look forward to giving this a review as soon as we figure that out.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258651#comment-15258651 ]

Steve Loughran commented on HADOOP-13028:

Note the actual fix is to force a {{fs.list(/)}} after creating the FS; the failure of the first operation is viewed as transient and downgraded to a warning. This is an interesting problem: we really only want to swallow transient network failures, not other issues. But any other issues will surface the next time someone tries to use the instance; by moving the checks out of the {{initialize()}} method we stop the FS setup itself breaking.
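The "probe after setup, downgrade the failure to a warning" idea in the comment above can be sketched as follows. This is only an illustration under assumptions: {{LenientProbeSketch}} and its {{Probe}} interface are hypothetical names, warnings are collected in a list rather than sent to a logger, and every {{IOException}} is treated as transient, which the comment itself notes is broader than ideal.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a lenient post-setup probe; not the S3A code. */
class LenientProbeSketch {

  /** A first operation against the store, e.g. listing the root path. */
  interface Probe {
    void run() throws IOException;
  }

  /** Stand-in for a logger: warnings recorded instead of thrown. */
  final List<String> warnings = new ArrayList<>();

  /**
   * Run the first operation after the filesystem has been set up.
   * A failure is recorded as a warning so that setup itself does not break;
   * a persistent problem would surface again on the next real operation.
   *
   * @return true if the probe succeeded
   */
  boolean probe(Probe firstOperation) {
    try {
      firstOperation.run();
      return true;
    } catch (IOException e) {
      warnings.add("Initial probe failed, continuing: " + e.getMessage());
      return false;
    }
  }
}
```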
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258265#comment-15258265 ]

Steve Loughran commented on HADOOP-13028:

This patch (the HADOOP-13059 robust init bit) breaks the proxy tests in {{TestS3AConfiguration}}; it looks like those tests failed because the bucket check triggered a failure on proxy invocation.
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257830#comment-15257830 ]

Steve Loughran commented on HADOOP-13028:

The checkstyle warnings are all about code doing ++ on volatiles. The InputStream API says "single thread only", and while we know HBase ignores that, we also know that HBase cannot ever work on S3, and also that these are just little counters, nothing critical. If someone does break the threading rules, well, the counters will end up inaccurate.
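For readers wondering why the tools complain about ++ on a volatile: the increment is a separate read and write, so two threads can lose an update. The sketch below contrasts the cheap single-threaded-by-contract counter with a thread-safe alternative. The class and field names are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of two counter styles; not the patch's statistics class. */
class StreamCounterSketch {

  // Visible to other threads, but ++ is a read followed by a write, so
  // concurrent increments can be lost. Acceptable only if the stream is
  // used by one thread at a time, as the InputStream contract expects.
  private volatile long openOperations;

  // Atomic read-modify-write: accurate even if callers break the
  // single-thread rule, at a small extra cost per increment.
  private final AtomicLong closeOperations = new AtomicLong();

  void streamOpened() {
    openOperations++;
  }

  void streamClosed() {
    closeOperations.incrementAndGet();
  }

  long getOpenOperations() {
    return openOperations;
  }

  long getCloseOperations() {
    return closeOperations.get();
  }
}
```

The trade-off described in the comment is the first style: if the threading rules are broken, nothing fails, the counts are just inaccurate.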
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257096#comment-15257096 ]

Hadoop QA commented on HADOOP-13028:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 1s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 46s | trunk passed |
| +1 | compile | 6m 6s | trunk passed with JDK v1.8.0_92 |
| +1 | compile | 6m 49s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 5s | trunk passed |
| +1 | mvnsite | 1m 15s | trunk passed |
| +1 | mvneclipse | 0m 27s | trunk passed |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 1m 7s | trunk passed with JDK v1.8.0_92 |
| +1 | javadoc | 1m 18s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 6m 2s | the patch passed with JDK v1.8.0_92 |
| +1 | javac | 6m 2s | the patch passed |
| +1 | compile | 6m 54s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 54s | the patch passed |
| -1 | checkstyle | 1m 3s | root: patch generated 44 new + 53 unchanged - 2 fixed = 97 total (was 55) |
| +1 | mvnsite | 1m 14s | the patch passed |
| +1 | mvneclipse | 0m 26s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | xml | 0m 0s | The patch has no ill-formed XML file. |
| -1 | findbugs | 0m 44s | hadoop-tools/hadoop-aws generated 12 new + 0 unchanged - 0 fixed = 12 total (was 0) |
| -1 | javadoc | 0m 13s | hadoop-aws in the patch failed with JDK v1.8.0_92. |
| -1 | javadoc | 3m 50s | hadoop-tools_hadoop-aws-jdk1.7.0_95 with JDK v1.7.0_95 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 1m 19s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 20m 56s | hadoop-common in the patch failed with JDK v1.8.0_92. |
| +1 | unit | 0m 14s | hadoop-aws in the patch passed with JDK v1.8.0_92. |
| -1 | unit | 8m 10s | hadoop-common in the patch failed with
[jira] [Commented] (HADOOP-13028) add low level counter metrics for S3A; use in read performance tests
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256884#comment-15256884 ]

Steve Loughran commented on HADOOP-13028:

Patch -004 includes HADOOP-13047; the forward read range is configurable. The default is 64K; we'll need tests to work out what is good in different deployments (in-EC2; remote). For my tests, 640K looks right.

There are a lot of tests for the seek behaviour: seeks with no read to verify lazy seek, then some seek+read sequences to see how things slow down on different readahead values.

BTW, the readahead can be set on an open stream via {{CanSetReadahead.setReadahead(Long)}}; this could enable some code to dynamically tune things if it really knew what it was doing. I'm using it in the tests to simplify their setup.