[jira] [Commented] (HADOOP-9613) [JDK8] Update jersey version to latest 1.x release
[ https://issues.apache.org/jira/browse/HADOOP-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255129#comment-15255129 ] Roman Shaposhnik commented on HADOOP-9613: -- Now that I'm building the Hadoop ecosystem for ARM and OpenJDK 7 is totally busted on that platform, I'd like to add one more "me too" to this JIRA. Is there any way I can help get this done (along with the other 3)? > [JDK8] Update jersey version to latest 1.x release > -- > > Key: HADOOP-9613 > URL: https://issues.apache.org/jira/browse/HADOOP-9613 > Project: Hadoop Common > Issue Type: Sub-task > Components: build >Affects Versions: 2.4.0, 3.0.0 >Reporter: Timothy St. Clair >Assignee: Tsuyoshi Ozawa > Labels: maven > Attachments: HADOOP-2.2.0-9613.patch, > HADOOP-9613.004.incompatible.patch, HADOOP-9613.005.incompatible.patch, > HADOOP-9613.006.incompatible.patch, HADOOP-9613.007.incompatible.patch, > HADOOP-9613.008.incompatible.patch, HADOOP-9613.009.incompatible.patch, > HADOOP-9613.010.incompatible.patch, HADOOP-9613.011.incompatible.patch, > HADOOP-9613.012.incompatible.patch, HADOOP-9613.013.incompatible.patch, > HADOOP-9613.1.patch, HADOOP-9613.2.patch, HADOOP-9613.3.patch, > HADOOP-9613.patch > > > Update pom.xml dependencies exposed when running a mvn-rpmbuild against > system dependencies on Fedora 18. > The existing version is 1.8, which is quite old. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13044) Amazon S3 library 10.10.60+ (JDK8u60+) depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255089#comment-15255089 ] Kai Sasaki commented on HADOOP-13044: - Yes, as you said. 10.10.6 depends on [httpclient:4.3.6|https://github.com/aws/aws-sdk-java/blob/cb4fa451e98099dfeac6f804eb72e215c04bd240/aws-java-sdk-core/pom.xml#L24-L28], but 4.2.5 is actually used since that is the version specified by the Hadoop POM. This can cause the same issue anyway, though it has not surfaced yet. > Amazon S3 library 10.10.60+ (JDK8u60+) depends on http components 4.3 > - > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 > Environment: JDK 8u60 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > When using the AWS SDK in the classpath of Hadoop, we faced an issue caused > by incompatibility between the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
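For downstream users who hit this conflict before the Hadoop POM moves, one common workaround is to pin httpclient in their own build's dependencyManagement. This is a hypothetical pom fragment, not part of the attached patch; 4.3.6 is the version cited in the comment above:

```xml
<!-- Sketch of a downstream pom.xml override: force the httpclient version
     the AWS SDK was built against, so the 4.3-only INSTANCE field exists
     at runtime. Verify with `mvn dependency:tree` afterwards. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.3.6</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```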
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255086#comment-15255086 ] Bibin A Chundatt commented on HADOOP-12563: --- [~mattpaduano] # Could you please add testcases for empty maps as well in the new patch. # Testcases don't seem to run on Windows; could you check that too. > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > HADOOP-12563.14.patch, dtutil-test-out, > example_dtutil_commands_and_output.txt, generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-13056) Print expected values when rejecting a server's determined principal
Harsh J created HADOOP-13056: Summary: Print expected values when rejecting a server's determined principal Key: HADOOP-13056 URL: https://issues.apache.org/jira/browse/HADOOP-13056 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.5.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial When a service principal, constructed by the client from the server address, does not match a provided pattern or the configured principal property, the error is very uninformative about the specific cause. Currently the only error printed, in both cases, is: {code} java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/host.internal@REALM {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
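A sketch of the more informative message the issue asks for: surface the expected principal and pattern next to the rejected one. Class and method names below are illustrative, not Hadoop's actual SaslRpcClient code:

```java
// PrincipalCheck.java — hypothetical helper showing the shape of a better
// rejection message: include what was expected and where it came from.
public class PrincipalCheck {
    static String rejectionMessage(String gotPrincipal, String confKey,
                                   String expectedPrincipal, String pattern) {
        StringBuilder msg = new StringBuilder(
            "Server has invalid Kerberos principal: " + gotPrincipal);
        // Say what would have been accepted, and which config key supplied it.
        msg.append("; expected principal ").append(expectedPrincipal)
           .append(" (configured via ").append(confKey).append(")");
        if (pattern != null) {
            msg.append(", or a match for pattern ").append(pattern);
        }
        return msg.toString();
    }

    public static void main(String[] args) {
        System.out.println(rejectionMessage(
            "hdfs/host.internal@REALM",
            "dfs.namenode.kerberos.principal",
            "hdfs/nn1.example.com@REALM",
            null));
    }
}
```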
[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters
[ https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255017#comment-15255017 ] Harsh J commented on HADOOP-12436: -- This change subtly fixes the issue described in HADOOP-13051 (test-case added there for regression's sake) > GlobPattern regex library has performance issues with wildcard characters > - > > Key: HADOOP-12436 > URL: https://issues.apache.org/jira/browse/HADOOP-12436 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.2.0, 2.7.1 >Reporter: Matthew Paduano >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, > HADOOP-12436.03.patch, HADOOP-12436.04.patch, HADOOP-12436.05.patch > > > java.util.regex classes have performance problems with certain wildcard > patterns. Namely, consecutive * characters in a file name (not properly > escaped as literals) will cause commands such as "hadoop fs -ls > file**name" to consume 100% CPU and probably never return in a reasonable > time (time scales with number of *'s). > Here is an example: > {noformat} > hadoop fs -touchz > /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist > hadoop fs -ls > /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist > {noformat} > causes: > {noformat} > PIDCOMMAND %CPU TIME > 14526 java 100.0 01:18.85 > {noformat} > Not every string of *'s causes this, but the above filename reproduces this > reliably. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
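The heart of the fix is that a run of consecutive `*` in a glob matches the same strings as a single `*`, so collapsing the run before building the `java.util.regex` pattern avoids the catastrophic backtracking. A sketch with illustrative names (not Hadoop's GlobPattern API; backslash-escaped `*` handling is omitted for brevity):

```java
// GlobCollapse.java — collapse runs of '*' in a glob so that "file**name"
// compiles to the same (cheap) regex as "file*name" instead of stacking
// many ".*" terms that java.util.regex backtracks over exponentially.
public class GlobCollapse {
    static String collapse(String glob) {
        StringBuilder out = new StringBuilder();
        char prev = 0;
        for (char c : glob.toCharArray()) {
            if (c == '*' && prev == '*') {
                continue; // redundant wildcard: skip it
            }
            out.append(c);
            prev = c;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("file********name")); // file*name
    }
}
```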
[jira] [Commented] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254933#comment-15254933 ] Hadoop QA commented on HADOOP-13045: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 0m 4s {color} | {color:blue} Shelldocs was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 9s {color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 33s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 34s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 12m 32s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800302/HADOOP-13045.00.patch | | JIRA Issue | HADOOP-13045 | | Optional Tests | asflicense mvnsite unit shellcheck shelldocs | | uname | Linux ead8053159ee 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 63e5412 | | shellcheck | v0.4.3 | | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/9155/testReport/ | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/9155/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Allen Wittenauer > Attachments: HADOOP-13045.00.patch > > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. 
> {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
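The ordering bug above, and the shape of a fix, can be sketched in a few lines of shell. Function and variable names are modeled on, but not copied from, hadoop-functions.sh:

```shell
#!/usr/bin/env sh
# Sketch of the .hadooprc ordering problem: the rc file runs before
# CLASSPATH is (re)initialized, so its additions are thrown away unless
# they are saved and re-applied afterwards.

hadoop_add_classpath() {
  # Append a path, avoiding duplicates.
  case ":${CLASSPATH}:" in
    *":$1:"*) ;;  # already present
    *) CLASSPATH="${CLASSPATH:+${CLASSPATH}:}$1" ;;
  esac
}

# 1. The user's .hadooprc runs first and adds a jar.
hadoop_add_classpath "/root/hadoop-tools-0.1-SNAPSHOT.jar"

# 2. Basic init then *resets* CLASSPATH. A fix must save the rc-supplied
#    entries before this reset...
user_classpath="${CLASSPATH}"
CLASSPATH="/usr/local/hadoop/share/hadoop/common/lib/*"

# 3. ...and re-apply them afterwards, so nothing is lost.
hadoop_add_classpath "${user_classpath}"

echo "${CLASSPATH}"
```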
[jira] [Commented] (HADOOP-13039) Add documentation for configuration property ipc.maximum.data.length for controlling maximum RPC message size.
[ https://issues.apache.org/jira/browse/HADOOP-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254924#comment-15254924 ] Arpit Agarwal commented on HADOOP-13039: +1 pending Jenkins, thanks [~liuml07]. > Add documentation for configuration property ipc.maximum.data.length for > controlling maximum RPC message size. > -- > > Key: HADOOP-13039 > URL: https://issues.apache.org/jira/browse/HADOOP-13039 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation >Reporter: Chris Nauroth >Assignee: Mingliang Liu > Attachments: HADOOP-13039.000.patch, HADOOP-13039.001.patch > > > The RPC server enforces a maximum length on incoming messages. Messages > larger than the maximum are rejected immediately as potentially malicious. > The maximum length can be tuned by setting configuration property > {{ipc.maximum.data.length}}, but this is not documented in core-site.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13039) Add documentation for configuration property ipc.maximum.data.length for controlling maximum RPC message size.
[ https://issues.apache.org/jira/browse/HADOOP-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-13039: --- Attachment: HADOOP-13039.001.patch Per-offline discussion with [~arpitagarwal], the v1 patch refined the documentation to make it clearer. > Add documentation for configuration property ipc.maximum.data.length for > controlling maximum RPC message size. > -- > > Key: HADOOP-13039 > URL: https://issues.apache.org/jira/browse/HADOOP-13039 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation >Reporter: Chris Nauroth >Assignee: Mingliang Liu > Attachments: HADOOP-13039.000.patch, HADOOP-13039.001.patch > > > The RPC server enforces a maximum length on incoming messages. Messages > larger than the maximum are rejected immediately as potentially malicious. > The maximum length can be tuned by setting configuration property > {{ipc.maximum.data.length}}, but this is not documented in core-site.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
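A sketch of the sort of core-site.xml entry being documented. 67108864 bytes (64 MB) is the default in the code; the description text here is paraphrased, not the patch's actual wording:

```xml
<property>
  <name>ipc.maximum.data.length</name>
  <value>67108864</value>
  <description>
    Maximum length in bytes of a single incoming RPC message accepted by
    the IPC server. Messages larger than this are rejected immediately.
  </description>
</property>
```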
[jira] [Commented] (HADOOP-13039) Add documentation for configuration property ipc.maximum.data.length for controlling maximum RPC message size.
[ https://issues.apache.org/jira/browse/HADOOP-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254904#comment-15254904 ] Arpit Agarwal commented on HADOOP-13039: Hi [~liuml07], I think we can remove this string "as potentially malicious". The patch looks good otherwise. Thanks! > Add documentation for configuration property ipc.maximum.data.length for > controlling maximum RPC message size. > -- > > Key: HADOOP-13039 > URL: https://issues.apache.org/jira/browse/HADOOP-13039 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation >Reporter: Chris Nauroth >Assignee: Mingliang Liu > Attachments: HADOOP-13039.000.patch > > > The RPC server enforces a maximum length on incoming messages. Messages > larger than the maximum are rejected immediately as potentially malicious. > The maximum length can be tuned by setting configuration property > {{ipc.maximum.data.length}}, but this is not documented in core-site.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13053) FS Shell should use File system API, not FileContext
[ https://issues.apache.org/jira/browse/HADOOP-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254785#comment-15254785 ] Hadoop QA commented on HADOOP-13053: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 43s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 33s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} 
javadoc {color} | {color:green} 1m 1s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 50s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 38s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s {color} | {color:red} hadoop-common-project/hadoop-common: patch generated 1 new + 50 unchanged - 2 fixed = 51 total (was 52) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 42s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 50s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 20s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800237/HADOOP-13053.001.patch | | JIRA Issue | HADOOP-13053 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4495dde70191 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool |
[jira] [Commented] (HADOOP-13033) Add missing Javadoc entries to Interns.java
[ https://issues.apache.org/jira/browse/HADOOP-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254764#comment-15254764 ] Hudson commented on HADOOP-13033: - FAILURE: Integrated in Hadoop-trunk-Commit #9658 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9658/]) HADOOP-13033. Add missing Javadoc entries to Interns.java. Contributed (aajisaka: rev c610031cabceebe9fe63106471476be862d6013c) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/Interns.java > Add missing Javadoc entries to Interns.java > -- > > Key: HADOOP-13033 > URL: https://issues.apache.org/jira/browse/HADOOP-13033 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: HADOOP-13033.01.patch > > > The Interns.java info(String name, String description) method misses the > descriptions of its parameters. After adding the missing entries, the file > will have no warnings at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13033) Add missing Javadoc entries to Interns.java
[ https://issues.apache.org/jira/browse/HADOOP-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HADOOP-13033: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed this to trunk, branch-2, and branch-2.8. Thanks [~boky01] for cleaning up the source code. > Add missing Javadoc entries to Interns.java > -- > > Key: HADOOP-13033 > URL: https://issues.apache.org/jira/browse/HADOOP-13033 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: HADOOP-13033.01.patch > > > The Interns.java info(String name, String description) method misses the > descriptions of its parameters. After adding the missing entries, the file > will have no warnings at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13033) Add missing Javadoc entries to Interns.java
[ https://issues.apache.org/jira/browse/HADOOP-13033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254722#comment-15254722 ] Akira AJISAKA commented on HADOOP-13033: +1, committing this. > Add missing Javadoc entries to Interns.java > -- > > Key: HADOOP-13033 > URL: https://issues.apache.org/jira/browse/HADOOP-13033 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: HADOOP-13033.01.patch > > > The Interns.java info(String name, String description) method misses the > descriptions of its parameters. After adding the missing entries, the file > will have no warnings at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12911) Upgrade Hadoop MiniKDC with Kerby
[ https://issues.apache.org/jira/browse/HADOOP-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254720#comment-15254720 ] Kai Zheng commented on HADOOP-12911: Thanks Jiajia for looking into the impacts. It sounds good, so we probably don't need to mark this as incompatible, as most downstream projects use MiniKDC in the simplest way in their tests. > Upgrade Hadoop MiniKDC with Kerby > - > > Key: HADOOP-12911 > URL: https://issues.apache.org/jira/browse/HADOOP-12911 > Project: Hadoop Common > Issue Type: Improvement > Components: test >Reporter: Jiajia Li >Assignee: Jiajia Li > Attachments: HADOOP-12911-v1.patch, HADOOP-12911-v2.patch, > HADOOP-12911-v3.patch, HADOOP-12911-v4.patch, HADOOP-12911-v5.patch, > HADOOP-12911-v6.patch > > > As discussed in the mailing list, we’d like to introduce Apache Kerby into > Hadoop. Initially it’s good to start with upgrading Hadoop MiniKDC with Kerby > offerings. Apache Kerby (https://github.com/apache/directory-kerby), as an > Apache Directory sub project, is a Java Kerberos binding. It provides a > SimpleKDC server that borrowed ideas from MiniKDC and implemented all the > facilities existing in MiniKDC. Currently MiniKDC depends on the old Kerberos > implementation in the Directory Server project, but that implementation is no > longer maintained. The Directory community plans to replace the > implementation using Kerby. MiniKDC can use Kerby SimpleKDC directly to avoid > depending on the full Directory project. Kerby also provides nice identity > backends such as the lightweight memory based one and the very simple json > one for easy development and test environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-13045: -- Hadoop Flags: Incompatible change Status: Patch Available (was: Open) > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Allen Wittenauer > Attachments: HADOOP-13045.00.patch > > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-13045: -- Attachment: HADOOP-13045.00.patch -00: * quickie patch We'll need to go back through the release notes, etc, I think. > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA > Attachments: HADOOP-13045.00.patch > > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer reassigned HADOOP-13045: - Assignee: Allen Wittenauer > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA >Assignee: Allen Wittenauer > Attachments: HADOOP-13045.00.patch > > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance
[ https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254646#comment-15254646 ] Kai Zheng commented on HADOOP-10768: This actually bypasses the inefficient {{SASL.wrap/unwrap}} operations by providing an extra Hadoop layer above them; it should be flexible enough for Hadoop. A further consideration is how to make the layer look good and also be available to the ecosystem, since other projects like HBase don't use Hadoop IPC. Any thoughts? > Optimize Hadoop RPC encryption performance > -- > > Key: HADOOP-10768 > URL: https://issues.apache.org/jira/browse/HADOOP-10768 > Project: Hadoop Common > Issue Type: Improvement > Components: performance, security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Dian Fu > Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, Optimize > Hadoop RPC encryption performance.pdf > > > Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to > "privacy". It utilizes the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for > secure authentication and data protection. Even though {{GSSAPI}} supports > using AES, it does so without AES-NI support by default, so the encryption is > slow and will become a bottleneck. > After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the > same optimization as in HDFS-6606: use AES-NI with more than *20x* speedup. > On the other hand, RPC messages are small, but RPC is frequent and there may > be lots of RPC calls in one connection, so we need to set up benchmarks to > see the real improvement and then make a trade-off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
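To make the cipher discussion concrete, here is a minimal AES/CTR round trip using the JDK's javax.crypto. The fixed all-zero key and IV are placeholders for whatever the proposed negotiation step would actually produce; this sketches the encrypt/decrypt mechanics, not the JIRA's protocol:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

// AesCtrRoundTrip.java — the kind of cipher the proposal layers above SASL:
// AES in CTR (stream) mode, which is what AES-NI accelerates.
public class AesCtrRoundTrip {
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return c.doFinal(data);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    static String roundTrip(String msg) {
        byte[] key = new byte[16]; // 128-bit demo key; negotiated in reality
        byte[] iv  = new byte[16]; // demo IV; must be unique per stream
        byte[] ct = crypt(Cipher.ENCRYPT_MODE, key, iv,
                          msg.getBytes(StandardCharsets.UTF_8));
        byte[] pt = crypt(Cipher.DECRYPT_MODE, key, iv, ct);
        return new String(pt, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("rpc payload")); // rpc payload
    }
}
```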
[jira] [Commented] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254635#comment-15254635 ] Hudson commented on HADOOP-13052: - FAILURE: Integrated in Hadoop-trunk-Commit #9657 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9657/]) HADOOP-13052. ChecksumFileSystem mishandles crc file permissions. (kihwal: rev 9dbdc8e12d009e76635b2d20ce940851725cb069) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestChecksumFileSystem.java > ChecksumFileSystem mishandles crc file permissions > -- > > Key: HADOOP-13052 > URL: https://issues.apache.org/jira/browse/HADOOP-13052 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 2.7.3 > > Attachments: HADOOP-13052.patch > > > CheckFileSystem does not override permission related calls to apply those > operations to the hidden crc files. Clients may be unable to read the crcs > if the file is created with strict permissions and then relaxed. > The checksum fs is designed to work with or w/o crcs present, so it silently > ignores FNF exceptions. The java file stream apis unfortunately may only > throw FNF, so permission denied becomes FNF resulting in this bug going > silently unnoticed. > (Problem discovered via public localizer. Files are downloaded as > user-readonly and then relaxed to all-read. The crc remains user-readonly) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
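The hidden-sibling naming that makes this bug easy to miss can be sketched as follows. This is a simplified helper for illustration; Hadoop's actual logic lives in ChecksumFileSystem (see getChecksumFile), and the patch applies permission operations to this sibling as well as the data file:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// CrcSibling.java — ChecksumFileSystem stores the checksum for a file "f"
// in a hidden sibling ".f.crc". Any permission change applied to the data
// file must also be applied to that sibling, or readers lose the crc.
public class CrcSibling {
    static Path checksumFile(Path file) {
        Path parent = file.getParent();
        String name = "." + file.getFileName() + ".crc";
        return parent == null ? Paths.get(name) : parent.resolve(name);
    }

    public static void main(String[] args) {
        System.out.println(checksumFile(Paths.get("/data/part-00000")));
        // /data/.part-00000.crc
    }
}
```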
[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance
[ https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254603#comment-15254603 ] Kai Zheng commented on HADOOP-10768: Thanks for the design doc and the clarification. It looks like good work, [~dian.fu]! Comments about the doc: * It would be good to clearly say: this builds application-layer data encryption *ABOVE* SASL (not mixed with, nor in the same layer as, SASL). Accordingly, you can simplify your flow picture very much by reducing it to only two steps: 1) SASL handshake; 2) Hadoop data encryption cipher negotiation. The illustrated 7 steps for SASL may be specific to GSSAPI; for other mechanisms it may be much simpler, and anyhow we don't need to show it here. * Why do we need {{SaslCryptoCodec}}? What does it do? Maybe after the separate encryption negotiation is complete, we can create CryptoOutputStream directly? * Since we're taking the same approach as data transfer encryption, both doing separate encryption cipher negotiation and data encryption after and above SASL, one for file data and the other for RPC data, maybe we can mostly reuse the existing work? Did we go this way in the implementation, or is there any difference? * How are the encryption key(s) negotiated or determined? Does it consider the established session key from SASL if available? It seems to produce a key pair; how are the two keys used? * Do we hard-code the AES cipher to AES/CTR mode? I guess other modes like AES/GCM could also be used. > Optimize Hadoop RPC encryption performance > -- > > Key: HADOOP-10768 > URL: https://issues.apache.org/jira/browse/HADOOP-10768 > Project: Hadoop Common > Issue Type: Improvement > Components: performance, security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Dian Fu > Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, Optimize > Hadoop RPC encryption performance.pdf > > > Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to > "privacy". It utilizes the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for > secure authentication and data protection. Even though {{GSSAPI}} supports using > AES, AES-NI is not used by default, so the encryption is slow and > will become a bottleneck. > After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the > same optimization as in HDFS-6606: use AES-NI for more than *20x* speedup. > On the other hand, RPC messages are small but frequent, and there may be > lots of RPC calls in one connection, so we need to set up a benchmark to measure the real > improvement and then make a trade-off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
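For the AES/CTR question raised above, a minimal JDK-only round-trip sketch is below. It is not the Hadoop `CryptoOutputStream`/`SaslCryptoCodec` implementation; the all-zero key and IV stand in for whatever the negotiated session key material would be and are purely illustrative:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

public class AesCtrSketch {
    // key and iv would come from the negotiated session; zeros here are
    // only for illustration, never use fixed key material in real code.
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16], iv = new byte[16];
        byte[] wire = crypt(Cipher.ENCRYPT_MODE, key, iv,
                "rpc payload".getBytes(StandardCharsets.UTF_8));
        byte[] plain = crypt(Cipher.DECRYPT_MODE, key, iv, wire);
        System.out.println(new String(plain, StandardCharsets.UTF_8)); // rpc payload
    }
}
```

CTR is a stream mode (no padding, seekable counter), which is why it suits framed RPC payloads; GCM would add authentication at extra cost, which is exactly the trade-off the comment asks about.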
[jira] [Updated] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HADOOP-13052: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.3 Status: Resolved (was: Patch Available) Committed to trunk through 2.7. Thanks for fixing this, Daryn. > ChecksumFileSystem mishandles crc file permissions > -- > > Key: HADOOP-13052 > URL: https://issues.apache.org/jira/browse/HADOOP-13052 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 2.7.3 > > Attachments: HADOOP-13052.patch > > > ChecksumFileSystem does not override permission-related calls to apply those > operations to the hidden crc files. Clients may be unable to read the crcs > if the file is created with strict permissions and then relaxed. > The checksum fs is designed to work with or without crcs present, so it silently > ignores FNF exceptions. The Java file stream APIs unfortunately may only > throw FNF, so permission denied becomes FNF, resulting in this bug going > silently unnoticed. > (Problem discovered via the public localizer. Files are downloaded as > user-readonly and then relaxed to all-read. The crc remains user-readonly.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254582#comment-15254582 ] Akira AJISAKA commented on HADOOP-13045: bq. But it makes a lot of sense to have something that users could use to trigger the API. It's a simple change. I'll write it up here in a sec. Agreed. Thanks a lot! > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254577#comment-15254577 ] Allen Wittenauer commented on HADOOP-13045: --- Yeah, there really isn't a place. hadooprc was built as a way to override hadoop-env.sh and set things up prior to the bootstrap. But it makes a lot of sense to have *something* that users could use to trigger the API. It's a simple change. I'll write it up here in a sec. > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254559#comment-15254559 ] Kihwal Lee commented on HADOOP-13052: - +1 lgtm > ChecksumFileSystem mishandles crc file permissions > -- > > Key: HADOOP-13052 > URL: https://issues.apache.org/jira/browse/HADOOP-13052 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HADOOP-13052.patch > > > ChecksumFileSystem does not override permission-related calls to apply those > operations to the hidden crc files. Clients may be unable to read the crcs > if the file is created with strict permissions and then relaxed. > The checksum fs is designed to work with or without crcs present, so it silently > ignores FNF exceptions. The Java file stream APIs unfortunately may only > throw FNF, so permission denied becomes FNF, resulting in this bug going > silently unnoticed. > (Problem discovered via the public localizer. Files are downloaded as > user-readonly and then relaxed to all-read. The crc remains user-readonly.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12982) Document missing S3A and S3 properties
[ https://issues.apache.org/jira/browse/HADOOP-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254561#comment-15254561 ] Wei-Chiu Chuang commented on HADOOP-12982: -- The two are similar, but document different properties. I posted a quick review there. > Document missing S3A and S3 properties > -- > > Key: HADOOP-12982 > URL: https://issues.apache.org/jira/browse/HADOOP-12982 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation, fs/s3, tools >Affects Versions: 2.8.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Attachments: HADOOP-12982.001.patch, HADOOP-12982.002.patch, > HADOOP-12982.003.patch > > > * S3: > ** {{fs.s3.buffer.dir}}, {{fs.s3.maxRetries}}, {{fs.s3.sleepTimeSeconds}}, > {{fs.s3.block.size}} are not in the documentation > ** Note that {{fs.s3.buffer.dir}}, {{fs.s3.maxRetries}}, > {{fs.s3.sleepTimeSeconds}} are also used by S3N. > * S3A: > ** {{fs.s3a.server-side-encryption-algorithm}} and {{fs.s3a.block.size}} are > missing in core-default.xml and the documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
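Documenting the missing S3A keys amounts to adding entries like the following to core-default.xml (or a user's core-site.xml). The values shown are illustrative only, not the shipped defaults:

```xml
<property>
  <name>fs.s3a.block.size</name>
  <value>33554432</value>
  <!-- illustrative: 32 MB block size reported to callers of getFileStatus -->
</property>
<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>AES256</value>
  <!-- illustrative: request S3 server-side encryption for objects written -->
</property>
```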
[jira] [Commented] (HADOOP-12671) Inconsistent s3a configuration values and incorrect comments
[ https://issues.apache.org/jira/browse/HADOOP-12671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254558#comment-15254558 ] Wei-Chiu Chuang commented on HADOOP-12671: -- Thanks for the patch [~tianyin]. Could you also update the Hadoop-AWS docs (Hadoop-AWS module: Integration with Amazon Web Services)? These values are wrong in the doc. > Inconsistent s3a configuration values and incorrect comments > > > Key: HADOOP-12671 > URL: https://issues.apache.org/jira/browse/HADOOP-12671 > Project: Hadoop Common > Issue Type: Bug > Components: conf, documentation, fs/s3 >Affects Versions: 2.7.1, 2.6.2 >Reporter: Tianyin Xu >Assignee: Tianyin Xu > Attachments: HADOOP-12671.000.patch > > > The following values in [core-default.xml | > https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml] > are wrong. > {{fs.s3a.multipart.purge.age}} > {{fs.s3a.connection.timeout}} > {{fs.s3a.connection.establish.timeout}} > \\ > \\ > *1. {{fs.s3a.multipart.purge.age}}* > (in both {{2.6.2}} and {{2.7.1}}) > In [core-default.xml | > https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml], > the value is {{86400}} ({{24}} hours), while in the code it is {{14400}} > ({{4}} hours). > \\ > \\ > *2. {{fs.s3a.connection.timeout}}* > (appears only in {{2.6.2}}) > In [core-default.xml (2.6.2) | > https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml], > the value is {{5000}}, while in the code it is {{5}}. > {code} > // seconds until we give up on a connection to s3 > public static final String SOCKET_TIMEOUT = "fs.s3a.connection.timeout"; > public static final int DEFAULT_SOCKET_TIMEOUT = 5; > {code} > \\ > *3. {{fs.s3a.connection.establish.timeout}}* > (appears only in {{2.7.1}}) > In [core-default.xml (2.7.1)| > https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml], > the value is {{5000}}, while in the code it is {{5}}. 
> {code} > // seconds until we give up trying to establish a connection to s3 > public static final String ESTABLISH_TIMEOUT = > "fs.s3a.connection.establish.timeout"; > public static final int DEFAULT_ESTABLISH_TIMEOUT = 5; > {code} > \\ > btw, the code comments are wrong! The two parameters are in the unit of > *milliseconds* instead of *seconds*... > {code} > - // seconds until we give up on a connection to s3 > + // milliseconds until we give up on a connection to s3 > ... > - // seconds until we give up trying to establish a connection to s3 > + // milliseconds until we give up trying to establish a connection to s3 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
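The seconds-vs-milliseconds confusion above is the classic failure mode of unit-less timeout constants. One way to make such bugs structurally unlikely is to encode the unit in the constant name; a small sketch (the names and values here are illustrative, not Hadoop's actual constants or shipped defaults):

```java
import java.util.concurrent.TimeUnit;

public class S3aTimeoutSketch {
    // Milliseconds until we give up on a connection to S3.
    // The _MS suffix makes the unit unambiguous at every use site,
    // so the comment and the docs cannot silently disagree with the code.
    static final int DEFAULT_SOCKET_TIMEOUT_MS = 5_000;
    static final int DEFAULT_ESTABLISH_TIMEOUT_MS = 5_000;

    public static void main(String[] args) {
        // converting makes the intended magnitude obvious in reviews
        System.out.println(TimeUnit.MILLISECONDS.toSeconds(DEFAULT_SOCKET_TIMEOUT_MS)); // 5
    }
}
```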
[jira] [Commented] (HADOOP-12738) Create unit test to automatically compare Common related classes and core-default.xml
[ https://issues.apache.org/jira/browse/HADOOP-12738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254543#comment-15254543 ] Ray Chiang commented on HADOOP-12738: - Thanks [~iwasakims]! > Create unit test to automatically compare Common related classes and > core-default.xml > - > > Key: HADOOP-12738 > URL: https://issues.apache.org/jira/browse/HADOOP-12738 > Project: Hadoop Common > Issue Type: Test >Affects Versions: 2.7.1 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Attachments: HADOOP-12738.001.patch, HADOOP-12738.002.patch > > > Create a unit test that will automatically compare the fields in the various > Common related classes and core-default.xml. It should throw an error if a > property is missing in either the class or the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13045) hadoop_add_classpath is not working in .hadooprc
[ https://issues.apache.org/jira/browse/HADOOP-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254541#comment-15254541 ] Akira AJISAKA commented on HADOOP-13045: Thanks Allen for the comments. bq. In any case, HADOOP_USER_CLASSPATH should be used to do what you're trying to do. I got it. bq. What if we turned the current .hadooprc support into .hadoopenv and add a new .hadooprc hook that gets called after the initialization is done such that functions work? I thought that .hadooprc was the place to use the Unix shell API, but it's not. Where can/should we use the API? If there are places to use the API, I'll use it there. I'd like to avoid a new .something, which may cause misunderstanding. If there are no places to use those functions, I'm okay with introducing a new .something. > hadoop_add_classpath is not working in .hadooprc > > > Key: HADOOP-13045 > URL: https://issues.apache.org/jira/browse/HADOOP-13045 > Project: Hadoop Common > Issue Type: Bug >Reporter: Akira AJISAKA > > {{hadoop_basic_function}} resets {{CLASSPATH}} after {{.hadooprc}} is called. > {noformat} > $ hadoop --debug version > (snip) > DEBUG: Applying the user's .hadooprc > DEBUG: Initial CLASSPATH=/root/hadoop-tools-0.1-SNAPSHOT.jar > DEBUG: Initialize CLASSPATH > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/build/native > DEBUG: Rejected colonpath(JAVA_LIBRARY_PATH): /usr/local/hadoop/lib/native > DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/* > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13053) FS Shell should use File system API, not FileContext
[ https://issues.apache.org/jira/browse/HADOOP-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HADOOP-13053: - Status: Patch Available (was: Open) > FS Shell should use File system API, not FileContext > > > Key: HADOOP-13053 > URL: https://issues.apache.org/jira/browse/HADOOP-13053 > Project: Hadoop Common > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HADOOP-13053.001.patch > > > FS Shell is FileSystem-based, but it is using the FileContext API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Paduano updated HADOOP-13054: - Resolution: Duplicate Status: Resolved (was: Patch Available) > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > Attachments: HADOOP-13054.01.patch > > > HADOOP-12563 broke some unit tests > (see the comments on that ticket). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254523#comment-15254523 ] Matthew Paduano commented on HADOOP-13054: -- HADOOP-12563 was reverted. A new patch was submitted there including this edit. This ticket is obsolete. > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > Attachments: HADOOP-13054.01.patch > > > HADOOP-12563 broke some unit tests > (see the comments on that ticket). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254521#comment-15254521 ] Kai Zheng commented on HADOOP-13010: The change looks good, following the discussion. Some comments: 1. ErasureCoderOptions may be prepared as part of the HH coder's initialization work? You don't have to pass {{false, false}} as they're the defaults. {code} private RawErasureEncoder checkCreateXorRawEncoder() { if (xorRawEncoder == null) { + ErasureCoderOptions erasureCoderOptions = new ErasureCoderOptions( + getNumDataUnits(), getNumParityUnits(), false, false); xorRawEncoder = CodecUtil.createXORRawEncoder(getConf(), - getNumDataUnits(), getNumParityUnits()); - xorRawEncoder.setCoderOption(CoderOption.ALLOW_CHANGE_INPUTS, false); + erasureCoderOptions); } return xorRawEncoder; } {code} 2. A point that was missed and needs to be addressed: {{CoderUtil:convertInputBuffer}}. Ref. the comment from Colin above; it was suggested to rename it to cloneAsDirectByteBuffer. 3. Another missed point that needs to be addressed: {{makeValidIndexes}} needs to be renamed to {{getNullIndexes}} and made consistent with the others; ref. the relevant discussions above. 4. {{ErasureCoderOptions conf}} had better be {{ErasureCoderOptions coderOptions}}; please check all places throughout the large change. Sounds good to do similar refactoring for the block-level coders and do it separately. Thanks Rui for the major help and work! > Refactor raw erasure coders > --- > > Key: HADOOP-13010 > URL: https://issues.apache.org/jira/browse/HADOOP-13010 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Fix For: 3.0.0 > > Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch, > HADOOP-13010-v3.patch, HADOOP-13010-v4.patch > > > This will refactor the raw erasure coders according to some comments received so > far. > * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to > rely on class inheritance to reuse the code; instead it can be moved to some > utility. > * As suggested by [~jingzhao] somewhere quite some time ago, better to have a > state holder to keep some checking results for later reuse during an > encode/decode call. > This would not get rid of some inheritance levels, as doing so isn't clear yet > for the moment and also incurs a big impact. I do wish the end result of this > refactoring will make all the levels clearer and easier to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
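For readers unfamiliar with the XOR raw coder being configured in the snippet above: XOR is the simplest erasure code, where the single parity unit is the byte-wise XOR of all data units, and any one lost unit is recovered by XOR-ing the survivors with the parity. A self-contained sketch of that math (not Hadoop's `RawErasureEncoder` API):

```java
import java.util.Arrays;

public class XorCoderSketch {
    // Parity is the byte-wise XOR of all units passed in. The same routine
    // also performs recovery: XOR of the surviving units and the parity
    // reproduces the single missing unit.
    static byte[] xorEncode(byte[][] units) {
        byte[] out = new byte[units[0].length];
        for (byte[] unit : units)
            for (int i = 0; i < unit.length; i++)
                out[i] ^= unit[i];
        return out;
    }

    public static void main(String[] args) {
        byte[][] data = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        byte[] parity = xorEncode(data);
        // "lose" data[1] and recover it from the other units plus parity
        byte[] recovered = xorEncode(new byte[][]{data[0], data[2], parity});
        System.out.println(Arrays.equals(recovered, data[1])); // true
    }
}
```

This single-parity property is why XOR coders only tolerate one erasure; the Reed-Solomon and HH coders discussed in the ticket generalize to multiple parity units.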
[jira] [Commented] (HADOOP-12916) Allow RPC scheduler/callqueue backoff using response times
[ https://issues.apache.org/jira/browse/HADOOP-12916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254495#comment-15254495 ] Kihwal Lee commented on HADOOP-12916: - For the namenode RPC server, doesn't the RPC response time include the namesystem lock wait time? It may not work correctly for the namenode then. > Allow RPC scheduler/callqueue backoff using response times > -- > > Key: HADOOP-12916 > URL: https://issues.apache.org/jira/browse/HADOOP-12916 > Project: Hadoop Common > Issue Type: Improvement > Components: ipc >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Fix For: 2.8.0 > > Attachments: HADOOP-12916.00.patch, HADOOP-12916.01.patch, > HADOOP-12916.02.patch, HADOOP-12916.03.patch, HADOOP-12916.04.patch, > HADOOP-12916.05.patch, HADOOP-12916.06.patch, HADOOP-12916.07.patch, > HADOOP-12916.08.patch > > > Currently the back off policy from HADOOP-10597 is hard-coded to be based on whether the > call queue is full. This ticket is open to allow flexible back off policies, > such as a moving average of response times of RPC calls of different priorities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
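The "moving average of response times" policy in the ticket can be sketched with an exponentially decayed average plus a threshold. This is a simplified illustration, not the actual scheduler from the patches; the class, parameter values, and threshold semantics are all hypothetical:

```java
public class ResponseTimeBackoffSketch {
    private final double alpha;       // weight of the newest sample
    private final double thresholdMs; // back off once the average exceeds this
    private double avgMs;             // exponentially decayed average

    ResponseTimeBackoffSketch(double alpha, double thresholdMs) {
        this.alpha = alpha;
        this.thresholdMs = thresholdMs;
    }

    // called after each completed RPC with its measured response time
    void record(double responseMs) {
        avgMs = alpha * responseMs + (1 - alpha) * avgMs;
    }

    // the server would reject (back off) new low-priority calls when true
    boolean shouldBackOff() {
        return avgMs > thresholdMs;
    }

    public static void main(String[] args) {
        ResponseTimeBackoffSketch s = new ResponseTimeBackoffSketch(0.5, 100);
        s.record(20); s.record(30);
        System.out.println(s.shouldBackOff()); // false: average well under 100 ms
        s.record(400);
        System.out.println(s.shouldBackOff()); // true: one slow call dominates
    }
}
```

Kihwal's concern above maps directly onto `record()`: if the measured `responseMs` includes namesystem lock wait, slow lock holders inflate the average and trigger backoff even when the queue itself is healthy.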
[jira] [Updated] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Paduano updated HADOOP-12563: - Attachment: HADOOP-12563.14.patch using writeDelimitedTo/readDelimitedFrom in the proto IO: {code} --- a/HADOOP-12563.13.patch +++ b/HADOOP-12563.14.patch @@ -322,7 +322,7 @@ index e6b8722..662eb3e 100644 + setSecret(ByteString.copyFrom(e.getValue())); + storage.addSecrets(kv.build()); +} -+storage.build().writeTo((DataOutputStream)out); ++storage.build().writeDelimitedTo((DataOutputStream)out); + } + /** @@ -331,7 +331,7 @@ index e6b8722..662eb3e 100644 + * @param in - stream ready to read a serialized proto buffer message + */ + public void readProtos(DataInput in) throws IOException { -+CredentialsProto storage = CredentialsProto.parseFrom((DataInputStream)in); ++CredentialsProto storage = CredentialsProto.parseDelimitedFrom((DataInputStream)in); +for (CredentialsKVProto kv : storage.getTokensList()) { + addToken(new Text(kv.getAliasBytes().toByteArray()), + (Token) new Token(kv.getToken())); {code} > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > HADOOP-12563.14.patch, dtutil-test-out, > example_dtutil_commands_and_output.txt, generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. 
Additionally, the token files that are > created use Java serialization, which is hard or impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
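Why does switching from `writeTo`/`parseFrom` to the delimited variants fix the broken tests? `parseFrom` on a stream consumes everything to EOF, while `writeDelimitedTo` prefixes the message with its length so a reader on a shared stream stops exactly at the message boundary. A JDK-only sketch of the same framing idea (an illustration of the concept, not protobuf itself, which uses a varint rather than a fixed-width length):

```java
import java.io.*;

public class DelimitedFramingSketch {
    // Length prefix, then payload: the reader knows where the message ends,
    // so any bytes written after it on the same stream survive.
    static void writeDelimited(DataOutput out, byte[] msg) throws IOException {
        out.writeInt(msg.length); // protobuf uses a varint; a fixed int keeps this simple
        out.write(msg);
    }

    static byte[] readDelimited(DataInput in) throws IOException {
        byte[] msg = new byte[in.readInt()];
        in.readFully(msg);
        return msg;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeDelimited(out, "credentials".getBytes("UTF-8"));
        out.writeUTF("trailing data"); // an undelimited parse would have swallowed this

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(new String(readDelimited(in), "UTF-8")); // credentials
        System.out.println(in.readUTF());                           // trailing data
    }
}
```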
[jira] [Updated] (HADOOP-13010) Refactor raw erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HADOOP-13010: --- Status: Patch Available (was: Open) > Refactor raw erasure coders > --- > > Key: HADOOP-13010 > URL: https://issues.apache.org/jira/browse/HADOOP-13010 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Fix For: 3.0.0 > > Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch, > HADOOP-13010-v3.patch, HADOOP-13010-v4.patch > > > This will refactor raw erasure coders according to some comments received so > far. > * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to > rely class inheritance to reuse the codes, instead they can be moved to some > utility. > * Suggested by [~jingzhao] somewhere quite some time ago, better to have a > state holder to keep some checking results for later reuse during an > encode/decode call. > This would not get rid of some inheritance levels as doing so isn't clear yet > for the moment and also incurs big impact. I do wish the end result by this > refactoring will make all the levels more clear and easier to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13010) Refactor raw erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HADOOP-13010: --- Status: Open (was: Patch Available) > Refactor raw erasure coders > --- > > Key: HADOOP-13010 > URL: https://issues.apache.org/jira/browse/HADOOP-13010 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Fix For: 3.0.0 > > Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch, > HADOOP-13010-v3.patch, HADOOP-13010-v4.patch > > > This will refactor raw erasure coders according to some comments received so > far. > * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to > rely class inheritance to reuse the codes, instead they can be moved to some > utility. > * Suggested by [~jingzhao] somewhere quite some time ago, better to have a > state holder to keep some checking results for later reuse during an > encode/decode call. > This would not get rid of some inheritance levels as doing so isn't clear yet > for the moment and also incurs big impact. I do wish the end result by this > refactoring will make all the levels more clear and easier to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13021) Hadoop swift driver unit test should use unique directory for each run
[ https://issues.apache.org/jira/browse/HADOOP-13021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254431#comment-15254431 ] Chen He commented on HADOOP-13021: -- Thank you for the reply, [~ste...@apache.org]. I agree with you. However, there could be corner cases such as JVM crashes or unit tests terminated by an outage. Even if we set a different value for each machine, for example, machine A has its own bucket, a previous outage may leave behind directories or files, and the next unit test run is then inclined to report errors. I propose we put a timestamp in those hard-coded values. Combined with your suggestion, we can guarantee that every time the unit tests run, on every machine, they use different hard-coded values. Then we may be a little safer than with the current solution. > Hadoop swift driver unit test should use unique directory for each run > -- > > Key: HADOOP-13021 > URL: https://issues.apache.org/jira/browse/HADOOP-13021 > Project: Hadoop Common > Issue Type: Bug > Components: fs/swift >Affects Versions: 2.7.2 >Reporter: Chen He >Assignee: Chen He > Labels: unit-test > > Since all the "unit tests" in the swift package are actually functionality tests, they > require the server's information in the core-site.xml file. However, multiple > unit test runs on different machines using the same core-site.xml file will > result in some unit test failures. For example: > In TestSwiftFileSystemBasicOps.java > public void testMkDir() throws Throwable { > Path path = new Path("/test/MkDir"); > fs.mkdirs(path); > //success then -so try a recursive operation > fs.delete(path, true); > } > It is possible that machines A and B are running "mvn clean install" using the > same core-site.xml file. However, machine A runs testMkDir() first and deletes > the dir, while machine B has only just tried to run fs.delete(path, true). It will report > a failure. This is just an example. There are many similar cases in the unit > test sets. I would propose we use a unique dir for each unit test run instead > of using "Path path = new Path("/test/MkDir")" for all concurrent runs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
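The timestamp-plus-per-run-token scheme proposed above can be sketched as follows. The helper name and path layout are hypothetical, not from the actual swift test code; the point is that a timestamp alone distinguishes runs in time while a random component distinguishes concurrent runs:

```java
import java.util.UUID;

public class UniqueTestDirSketch {
    // Hypothetical helper: a per-run test root combining a timestamp
    // (defeats leftovers from crashed earlier runs) and a UUID
    // (defeats two machines starting in the same millisecond).
    static String uniqueTestDir(String base) {
        return base + "/test-" + System.currentTimeMillis()
                + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        String a = uniqueTestDir("/test");
        String b = uniqueTestDir("/test");
        System.out.println(a.equals(b)); // false: each run gets its own dir
    }
}
```

A teardown sweep that deletes any `/test/test-*` directory older than some age would then clean up leftovers from crashed runs without racing live ones.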
[jira] [Commented] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254420#comment-15254420 ] Steve Loughran commented on HADOOP-13028: - Output of test run against a test CSV dataset against Amazon US, highlighting how in the test infrastructure (my laptop) the time to set up a read at a new offset is ~1s, so forward seeking through reading bytes should be much less expensive. {code} = TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek and ReadFully' = 2016-04-22 19:19:20,295 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of stat = 206,270,000 ns 2016-04-22 19:19:20,486 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of open = 190,536,000 ns 2016-04-22 19:19:20,487 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 19:19:20,926 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 0] = 438,378,000 ns 2016-04-22 19:19:20,926 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=1 contentLength=20320279} 2016-04-22 19:19:20,927 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics {1 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops}, metrics {{Context=S3AFileSystem} {FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=1} {streamCloseOperations=0} {streamClosed=0} {streamAborted=0} {streamSeekOperations=0} {readExceptions=0} {forwardSeekOperations=0} {backwardSeekOperations=0} {bytesSkippedOnSeek=0} {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0} {directories_created=0} {directories_deleted=0} {ignored_errors=0} }} 2016-04-22 19:19:20,927 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 19:19:20,927 INFO 
s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 1] = 14,000 ns 2016-04-22 19:19:20,928 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256 contentLength=20320279} 2016-04-22 19:19:20,928 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics {1 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops}, metrics {{Context=S3AFileSystem} {FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=1} {streamCloseOperations=0} {streamClosed=0} {streamAborted=0} {streamSeekOperations=0} {readExceptions=0} {forwardSeekOperations=0} {backwardSeekOperations=0} {bytesSkippedOnSeek=0} {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0} {directories_created=0} {directories_deleted=0} {ignored_errors=0} }} 2016-04-22 19:19:20,928 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 19:19:20,929 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 256] = 8,000 ns 2016-04-22 19:19:20,930 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256 contentLength=20320279} 2016-04-22 19:19:20,930 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics {1 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops}, metrics {{Context=S3AFileSystem} {FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=1} {streamCloseOperations=0} {streamClosed=0} {streamAborted=0} {streamSeekOperations=0} {readExceptions=0} {forwardSeekOperations=0} 
{backwardSeekOperations=0} {bytesSkippedOnSeek=0} {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0} {directories_created=0} {directories_deleted=0} {ignored_errors=0} }} 2016-04-22 19:19:20,930 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 19:19:20,930 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(EOF-2) [pos = 256] = 11,000 ns 2016-04-22 19:19:20,930 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=20320277 contentLength=20320279} 2016-04-22 19:19:20,931 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics {1 bytes read,
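The timings above show the asymmetry being measured: open costs roughly 200ms while a seek() that only updates nextReadPos costs microseconds, so the real cost decision is whether a forward seek should drain bytes from the already-open stream or close and reopen at the new offset. A minimal sketch of that decision, with class and field names that are illustrative assumptions rather than the actual S3AInputStream code:

```java
// Sketch (not the actual S3AInputStream): decide whether a forward seek
// should read-and-discard bytes from the open stream instead of paying
// the ~1s cost of closing and reopening at the new offset.
public class SeekPolicy {
    private final long forwardSeekLimit; // max bytes worth draining

    public SeekPolicy(long forwardSeekLimit) {
        this.forwardSeekLimit = forwardSeekLimit;
    }

    /** True if seeking from pos to target should drain bytes rather
     *  than reopen: forward, and within the configured limit. */
    public boolean shouldDrain(long pos, long target) {
        long distance = target - pos;
        return distance > 0 && distance <= forwardSeekLimit;
    }
}
```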
[jira] [Updated] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13028: Attachment: HADOOP-13028-001.patch Patch 001 # lifted and tuned {{AzureFileSystemInstrumentation}} # removed the gauges of current bandwidth and things that S3 doesn't offer # added counters for the stream-level operations # added counters for copy operations and size of files # went through S3A FS and input stream invoking the instrumentation operations as required # cleaned up the debug statements to be fully "SLF4J" in the process. > add counter and timer metrics for S3A HTTP & low-level operations > - > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Attachments: HADOOP-13028-001.patch > > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
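The counters described in the patch notes (streamOpened, files_copied, and so on) follow the usual metrics-registry pattern. A JDK-only stand-in showing that pattern; the real code uses Hadoop's MetricsRegistry, and the class and method names here are illustrative assumptions:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of per-filesystem operation counters.
// Not Hadoop's MetricsRegistry; it only shows the counting pattern
// the patch applies to stream and copy operations.
public class StreamCounters {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Increment the named counter, creating it on first use. */
    public void increment(String name) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }

    /** Current value of the named counter; 0 if never incremented. */
    public long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0 : c.get();
    }
}
```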
[jira] [Commented] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254401#comment-15254401 ] Steve Loughran commented on HADOOP-13028: - Colin, about to push up my patch # Nobody had told me of HDFS-10175, never mind # I'm using the classic MetricsRegistry, with all the instrumentation lifted from Azure; made the text/keys more generic, so the counters could be used for other object stores # added a metrics-to-string builder, so the S3AFileSystem.toString() operation can just do a complete dump of the stats. This is handy as it lets me print out the statistics of a run even with code built against older Hadoop versions. # Note that in the object stores, it's not so much "per FS method" we're counting, but "per object store API method". E.g. we're counting the number of copy operations in a rename, the number of bytes copied remotely, the deletes that take place there, etc. Because this code is the usual metrics stuff, it slots in quite nicely to what is already there. It does add one class to Hadoop common, MetricStringBuilder, which I've put there for its generic usability. 
{code} S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600, enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics {3843 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops}, metrics {{Context=S3AFileSystem} {FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=7} {streamCloseOperations=6} {streamClosed=1} {streamAborted=5} {streamSeekOperations=5} {readExceptions=0} {forwardSeekOperations=3} {backwardSeekOperations=2} {bytesSkippedOnSeek=767} {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0} {directories_created=0} {directories_deleted=0} {ignored_errors=0} }} {code} > add counter and timer metrics for S3A HTTP & low-level operations > - > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
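The "{name=value} {name=value}" shape of the metrics section in the dump above can be produced by a simple builder over the counter map. A hedged sketch; the real class is the MetricStringBuilder mentioned in the comment, whose actual API may differ:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a metrics-to-string builder that formats counters in the
// "{name=value} {name=value}" style seen in S3AFileSystem.toString().
// Stand-in for Hadoop's MetricStringBuilder; names are illustrative.
public class MetricDump {
    public static String dump(Map<String, Long> metrics) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : metrics.entrySet()) {
            sb.append('{').append(e.getKey()).append('=')
              .append(e.getValue()).append("} ");
        }
        return sb.toString().trim();
    }
}
```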
[jira] [Commented] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254358#comment-15254358 ] Colin Patrick McCabe commented on HADOOP-13028: --- Hi [~steve_l], This is a really interesting idea. I think this ties in with some of the discussions we've been having on HDFS-10175 with adding a way to fetch arbitrary statistics from FileSystem (and FileContext) instances. Basically, HDFS-10175 provides a way for MR to enumerate all the statistics and their values. It also provides interfaces for finding just one statistic, of course. This would also enable the use of those statistics in unit tests, since the stats could be per-FS rather than global per type. > add counter and timer metrics for S3A HTTP & low-level operations > - > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254349#comment-15254349 ] Anu Engineer commented on HADOOP-12291: --- [~ekundin] Thank you very much for providing this patch and taking care of most Jenkins issues in patch 2. I have some minor comments on Patch 2. # Do we need -1 at all? In most cases it will not work, and it really depends on the size of the directory we are operating against. Since we know that it is not going to work, or will be too slow, in most cases, why support it? My worry is that this will be used by some customer and will create very slow clusters. Can we please reduce this to a positive depth only? # What would be the impact of DIRECTORY_SEARCH_TIMEOUT with a positive depth? Does it bail after the timeout elapses, or does it measure the timeout independently for each recursive query? If so, could you please define the right semantics here? # In {{LdapGroupsMapping.java:line 312}}: We add the groups to a list for all queries, but this is only needed if goUpHierarchy != 0. Would you please add an if check? This is just to make sure that this code change does not change the memory usage if this feature is not enabled. # In {{LdapGroupsMapping#goUpGroupHierarchy}} nitpick: can we please remove the reference to the JIRA number, "for HADOOP-12291"? When we commit this patch, the commit will refer to it, so it may not be needed in the comments. # nitpick: do you want to rewrite {code} int nextLevel = 0; if (goUpHierarchy == -1) { nextLevel = -1; } else { nextLevel = goUpHierarchy - 1; } {code} into {code} int nextLevel = (goUpHierarchy == -1) ? -1 : goUpHierarchy - 1; {code} Also, can you please define -1 as a constant like INFINITE_RECURSE = -1, so that the code is easier to read? Or better, just remove this INFINITE_RECURSE capability completely from the code. # nitpick: would you like to pull this out as a function? 
{code} while (groupResults.hasMoreElements()) { SearchResult groupResult = groupResults.nextElement(); Attribute groupName = groupResult.getAttributes().get(groupNameAttr); groups.add(groupName.get().toString()); groupDNs.add(groupResult.getNameInNamespace()); } {code} # Do you think we should check for goUpHierarchy == 0 before doing an LDAP query, since queries are generally expensive? I may be mistaken, but I think you can optimize away one query call if you check for the value a little earlier. # nitpick: Please feel free to ignore this, but we seem to be mixing StringBuilder.append and String concatenation. If we are using StringBuilder, could we possibly use appends all along instead of creating an unnecessary string? I know that this is the style used in this file and you are just following it, but I thought I would flag it for your consideration. {code} filter.append("(&" + groupSearchFilter + "(|"); {code} # In TestLdapGroupsMapping, can you please use {{conf.setInt(LdapGroupsMapping.GROUP_HIERARCHY_LEVELS_KEY, 1);}} instead of {{conf.set(LdapGroupsMapping.GROUP_HIERARCHY_LEVELS_KEY, "1");}} # In the next patch, would you please take care of this last checkstyle warning: ./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/LdapGroupsMapping.java:368: }:5: '}' should be on the same line. > Add support for nested groups in LdapGroupsMapping > -- > > Key: HADOOP-12291 > URL: https://issues.apache.org/jira/browse/HADOOP-12291 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Gautam Gopalakrishnan >Assignee: Esther Kundin > Labels: features, patch > Fix For: 2.8.0 > > Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch > > > When using {{LdapGroupsMapping}} with Hadoop, nested groups are not > supported. So for example if user {{jdoe}} is part of group A which is a > member of group B, the group mapping currently returns only group A. 
> Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and > SSSD (or similar tools) but would be good to have this feature as part of > {{LdapGroupsMapping}} directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
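The review discussion above centers on bounding how far up the group hierarchy the lookup recurses (the goUpHierarchy / GROUP_HIERARCHY_LEVELS setting). A self-contained sketch of that bounded upward traversal; the real code issues LDAP queries at each level, so the map of parent groups here is a hypothetical stand-in:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of bounded nested-group resolution: starting from a user's
// direct groups, walk up to 'levels' steps of parent groups. In the
// real LdapGroupsMapping each step is an LDAP query; here a map of
// group -> parent groups stands in for the directory.
public class NestedGroups {
    private final Map<String, List<String>> parentGroups;

    public NestedGroups(Map<String, List<String>> parentGroups) {
        this.parentGroups = parentGroups;
    }

    /** Resolve groups, following parent links at most 'levels' times. */
    public Set<String> resolve(Collection<String> direct, int levels) {
        Set<String> result = new LinkedHashSet<>(direct);
        Collection<String> frontier = direct;
        for (int i = 0; i < levels && !frontier.isEmpty(); i++) {
            List<String> next = new ArrayList<>();
            for (String g : frontier) {
                for (String p : parentGroups.getOrDefault(g,
                        Collections.emptyList())) {
                    if (result.add(p)) { // skip groups already seen
                        next.add(p);
                    }
                }
            }
            frontier = next; // only newly found groups need another query
        }
        return result;
    }
}
```

With levels = 0 this reproduces the old flat behavior, which is why the review pushes back on supporting -1 (unbounded) at all.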
[jira] [Created] (HADOOP-13055) Implement linkMergeSlash for ViewFs
Zhe Zhang created HADOOP-13055: -- Summary: Implement linkMergeSlash for ViewFs Key: HADOOP-13055 URL: https://issues.apache.org/jira/browse/HADOOP-13055 Project: Hadoop Common Issue Type: New Feature Components: fs, viewfs Reporter: Zhe Zhang Assignee: Zhe Zhang In a multi-cluster environment it is sometimes useful to operate on the root / slash directory of an HDFS cluster. E.g., list all top level directories. Quoting the comment in {{ViewFs}}: {code} * A special case of the merge mount is where mount table's root is merged * with the root (slash) of another file system: * * fs.viewfs.mounttable.default.linkMergeSlash=hdfs://nn99/ * * In this cases the root of the mount table is merged with the root of *hdfs://nn99/ {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
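Based on the quoted ViewFs comment, the mount-table entry would look like this in the client configuration. This is a hypothetical sketch (the feature is what this JIRA proposes to implement); only the property name and the nn99 authority come from the quote:

```xml
<property>
  <name>fs.viewfs.mounttable.default.linkMergeSlash</name>
  <value>hdfs://nn99/</value>
</property>
```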
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254289#comment-15254289 ] Hudson commented on HADOOP-12563: - FAILURE: Integrated in Hadoop-trunk-Commit #9655 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9655/]) Revert "HADOOP-12563. Updated utility (dtutil) to create/modify token (raviprak: rev d6402fadedade4289949ba9f70f7a0bfb9bca140) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Credentials.java * hadoop-common-project/hadoop-common/src/site/markdown/CommandsManual.md * hadoop-common-project/hadoop-common/src/main/proto/Security.proto * hadoop-common-project/hadoop-common/src/test/resources/META-INF/services/org.apache.hadoop.security.token.DtFetcher * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/DtUtilShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/DtFetcher.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/tools/CommandShell.java * hadoop-common-project/hadoop-common/src/main/bin/hadoop * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DelegationTokenFetcher.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/token/TestDtUtilShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/DtFileOperations.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsDtFetcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SWebHdfsDtFetcher.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/tools/TestCommandShell.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/token/TestDtFetcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/WebHdfsDtFetcher.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.security.token.DtFetcher * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecrate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254248#comment-15254248 ] Hadoop QA commented on HADOOP-13054: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} HADOOP-13054 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800254/HADOOP-13054.01.patch | | JIRA Issue | HADOOP-13054 | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/9152/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > Attachments: HADOOP-13054.01.patch > > > HADOOP-12563 broke some unittests > (see that ticket, in comments). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254244#comment-15254244 ] Matthew Paduano commented on HADOOP-12563: -- please see HADOOP-13054 where I attached a patch to fix this issue. > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecrate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reopened HADOOP-12563: --- > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecrate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Paduano updated HADOOP-13054: - Attachment: HADOOP-13054.01.patch patch to fix protobuf IO problem in Credentials class > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > Attachments: HADOOP-13054.01.patch > > > HADOOP-12563 broke some unittests > (see that ticket, in comments). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254242#comment-15254242 ] Ravi Prakash commented on HADOOP-12563: --- Reverting in the meantime. Matt please create a new patch. Thanks for the heads up Brahma and Bibin > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecrate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Paduano updated HADOOP-13054: - Status: Patch Available (was: Open) > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > > HADOOP-12563 broke some unittests > (see that ticket, in comments). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254239#comment-15254239 ] Matthew Paduano commented on HADOOP-13054: -- Bug reports on failing tests in comments of HADOOP-12563 > Use proto delimited IO to fix tests broken by HADOOP-12563 > -- > > Key: HADOOP-13054 > URL: https://issues.apache.org/jira/browse/HADOOP-13054 > Project: Hadoop Common > Issue Type: Bug >Reporter: Matthew Paduano >Assignee: Matthew Paduano > > HADOOP-12563 broke some unittests > (see that ticket, in comments). Switching the proto buffer read/write > methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-13054) Use proto delimited IO to fix tests broken by HADOOP-12563
Matthew Paduano created HADOOP-13054: Summary: Use proto delimited IO to fix tests broken by HADOOP-12563 Key: HADOOP-13054 URL: https://issues.apache.org/jira/browse/HADOOP-13054 Project: Hadoop Common Issue Type: Bug Reporter: Matthew Paduano Assignee: Matthew Paduano HADOOP-12563 broke some unittests (see that ticket, in comments). Switching the proto buffer read/write methods to "writeDelimitedTo" and "readDelimitedFrom" seems to fix things up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
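The reason writeDelimitedTo/parseDelimitedFrom fixes this: when several serialized messages share one stream, the reader needs to know where each one ends, so each record is written with a length prefix (protobuf uses a varint). A JDK-only sketch of the same idea, using a fixed-width int prefix instead of protobuf's varint:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of length-delimited record IO: the idea behind protobuf's
// writeDelimitedTo / parseDelimitedFrom. Each record is preceded by
// its length so a reader can recover record boundaries from a stream
// that holds several records (here with a 4-byte prefix; protobuf
// uses a varint).
public class DelimitedIo {
    public static void writeRecord(DataOutputStream out, byte[] record)
            throws IOException {
        out.writeInt(record.length); // length prefix
        out.write(record);
    }

    public static byte[] readRecord(DataInputStream in) throws IOException {
        int len = in.readInt();      // read exactly one record's worth
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }
}
```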
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254224#comment-15254224 ] Matthew Paduano commented on HADOOP-12563: -- Thanks for that suggestion about delimited IO. That seems to fix the tests, and the original tests still pass. I will work with Ravi to get this patched up. > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serializations which are hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esther Kundin updated HADOOP-12291: --- Status: In Progress (was: Patch Available) Submitted patch version 002 > Add support for nested groups in LdapGroupsMapping > -- > > Key: HADOOP-12291 > URL: https://issues.apache.org/jira/browse/HADOOP-12291 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Gautam Gopalakrishnan >Assignee: Esther Kundin > Labels: features, patch > Fix For: 2.8.0 > > Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch > > > When using {{LdapGroupsMapping}} with Hadoop, nested groups are not > supported. So for example if user {{jdoe}} is part of group A which is a > member of group B, the group mapping currently returns only group A. > Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and > SSSD (or similar tools) but would be good to have this feature as part of > {{LdapGroupsMapping}} directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esther Kundin updated HADOOP-12291: --- Attachment: HADOOP-12291.002.patch > Add support for nested groups in LdapGroupsMapping > -- > > Key: HADOOP-12291 > URL: https://issues.apache.org/jira/browse/HADOOP-12291 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Gautam Gopalakrishnan >Assignee: Esther Kundin > Labels: features, patch > Fix For: 2.8.0 > > Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch > > > When using {{LdapGroupsMapping}} with Hadoop, nested groups are not > supported. So for example if user {{jdoe}} is part of group A which is a > member of group B, the group mapping currently returns only group A. > Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and > SSSD (or similar tools) but would be good to have this feature as part of > {{LdapGroupsMapping}} directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esther Kundin updated HADOOP-12291: --- Fix Version/s: 2.8.0 Status: Patch Available (was: Open) > Add support for nested groups in LdapGroupsMapping > -- > > Key: HADOOP-12291 > URL: https://issues.apache.org/jira/browse/HADOOP-12291 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Gautam Gopalakrishnan >Assignee: Esther Kundin > Labels: features, patch > Fix For: 2.8.0 > > Attachments: HADOOP-12291.001.patch, HADOOP-12291.002.patch > > > When using {{LdapGroupsMapping}} with Hadoop, nested groups are not > supported. So for example if user {{jdoe}} is part of group A which is a > member of group B, the group mapping currently returns only group A. > Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and > SSSD (or similar tools) but would be good to have this feature as part of > {{LdapGroupsMapping}} directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13047) S3a Forward seek in stream length to be configurable
[ https://issues.apache.org/jira/browse/HADOOP-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254168#comment-15254168 ] Steve Loughran commented on HADOOP-13047: - Regarding the patch, I don't think we need to go to anything trying to be adaptive to bandwidth, at least not initially. Having something you can preconfigure should be enough at first. Why? For short-lived streams, you aren't going to have any statistics, yet you may know across applications and instances of the app whether you are near/far from S3, so you can choose some values and see what works. > S3a Forward seek in stream length to be configurable > > > Key: HADOOP-13047 > URL: https://issues.apache.org/jira/browse/HADOOP-13047 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Steve Loughran > Attachments: HADOOP-13047.WIP.patch > > > Even with lazy seek, tests can show that sometimes a short-distance forward > seek is triggering a close + reopen, because the threshold for the seek is > simply available bytes in the inner stream. > A configurable threshold would allow data to be read and discarded before > that seek. This should be beneficial over long-haul networks as the time to > set up the TCP channel is high, and TCP-slow-start means that the ramp up of > bandwidth is slow. In such deployments, it will be better to read forward than > re-open, though the exact "best" number will vary with client and endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
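The "read and discarded before that seek" mechanism the issue describes amounts to advancing the open stream by reading into a scratch buffer up to the configured threshold. A minimal sketch, assuming nothing about the actual S3A implementation beyond that behavior:

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch of the "read forward and discard" alternative to reopening:
// advance the stream by reading into a scratch buffer. Whether this
// beats close + reopen depends on the configurable threshold this
// JIRA proposes.
public class ForwardSeek {
    /** Skip up to toSkip bytes by reading; returns bytes actually skipped. */
    public static long drain(InputStream in, long toSkip) throws IOException {
        byte[] scratch = new byte[8192];
        long remaining = toSkip;
        while (remaining > 0) {
            int read = in.read(scratch, 0,
                    (int) Math.min(scratch.length, remaining));
            if (read < 0) {
                break; // hit EOF before reaching the target offset
            }
            remaining -= read;
        }
        return toSkip - remaining;
    }
}
```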
[jira] [Commented] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254142#comment-15254142 ] Hadoop QA commented on HADOOP-13052: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 39s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 30s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 32s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 32s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} hadoop-common-project/hadoop-common: patch generated 0 new + 35 unchanged - 3 fixed = 35 total (was 38) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 17s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 35s {color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 18s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.net.TestDNS | | JDK v1.7.0_95 Failed junit tests | hadoop.net.TestDNS | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800225/HADOOP-13052.patch | | JIRA Issue | HADOOP-13052 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 784572e7c058 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool |
[jira] [Updated] (HADOOP-13053) FS Shell should use File system API, not FileContext
[ https://issues.apache.org/jira/browse/HADOOP-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated HADOOP-13053: - Attachment: HADOOP-13053.001.patch [~daryn], please review. > FS Shell should use File system API, not FileContext > > > Key: HADOOP-13053 > URL: https://issues.apache.org/jira/browse/HADOOP-13053 > Project: Hadoop Common > Issue Type: Bug >Reporter: Eric Badger > Attachments: HADOOP-13053.001.patch > > > FS Shell is File System based, but it is using the FileContext API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-13053) FS Shell should use File system API, not FileContext
[ https://issues.apache.org/jira/browse/HADOOP-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned HADOOP-13053: Assignee: Eric Badger > FS Shell should use File system API, not FileContext > > > Key: HADOOP-13053 > URL: https://issues.apache.org/jira/browse/HADOOP-13053 > Project: Hadoop Common > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: HADOOP-13053.001.patch > > > FS Shell is File System based, but it is using the FileContext API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-13053) FS Shell should use File system API, not FileContext
Eric Badger created HADOOP-13053: Summary: FS Shell should use File system API, not FileContext Key: HADOOP-13053 URL: https://issues.apache.org/jira/browse/HADOOP-13053 Project: Hadoop Common Issue Type: Bug Reporter: Eric Badger FS Shell is File System based, but it is using the FileContext API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13051) Test for special characters in path being respected during globPaths
[ https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254125#comment-15254125 ] Hadoop QA commented on HADOOP-13051: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 32s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 58s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 167m 11s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.server.datanode.TestDataNodeLifeline | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | | hadoop.hdfs.server.namenode.TestFileTruncate | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | | hadoop.hdfs.server.namenode.TestEditLog | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800211/HDFS-13051.000.patch | | JIRA Issue | HADOOP-13051 | | Optional Tests | asflicense compile javac javadoc mvninstall
[jira] [Updated] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HADOOP-13052: - Status: Patch Available (was: Open) > ChecksumFileSystem mishandles crc file permissions > -- > > Key: HADOOP-13052 > URL: https://issues.apache.org/jira/browse/HADOOP-13052 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HADOOP-13052.patch > > > ChecksumFileSystem does not override permission-related calls to apply those > operations to the hidden crc files. Clients may be unable to read the crcs > if the file is created with strict permissions and then relaxed. > The checksum fs is designed to work with or w/o crcs present, so it silently > ignores FNF exceptions. The Java file stream APIs unfortunately may only > throw FNF, so permission denied becomes FNF, resulting in this bug going > silently unnoticed. > (Problem discovered via public localizer. Files are downloaded as > user-readonly and then relaxed to all-read. The crc remains user-readonly.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
[ https://issues.apache.org/jira/browse/HADOOP-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HADOOP-13052: - Attachment: HADOOP-13052.patch > ChecksumFileSystem mishandles crc file permissions > -- > > Key: HADOOP-13052 > URL: https://issues.apache.org/jira/browse/HADOOP-13052 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.7.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Attachments: HADOOP-13052.patch > > > ChecksumFileSystem does not override permission-related calls to apply those > operations to the hidden crc files. Clients may be unable to read the crcs > if the file is created with strict permissions and then relaxed. > The checksum fs is designed to work with or w/o crcs present, so it silently > ignores FNF exceptions. The Java file stream APIs unfortunately may only > throw FNF, so permission denied becomes FNF, resulting in this bug going > silently unnoticed. > (Problem discovered via public localizer. Files are downloaded as > user-readonly and then relaxed to all-read. The crc remains user-readonly.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-13052) ChecksumFileSystem mishandles crc file permissions
Daryn Sharp created HADOOP-13052: Summary: ChecksumFileSystem mishandles crc file permissions Key: HADOOP-13052 URL: https://issues.apache.org/jira/browse/HADOOP-13052 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.7.0 Reporter: Daryn Sharp Assignee: Daryn Sharp ChecksumFileSystem does not override permission-related calls to apply those operations to the hidden crc files. Clients may be unable to read the crcs if the file is created with strict permissions and then relaxed. The checksum fs is designed to work with or w/o crcs present, so it silently ignores FNF exceptions. The Java file stream APIs unfortunately may only throw FNF, so permission denied becomes FNF, resulting in this bug going silently unnoticed. (Problem discovered via public localizer. Files are downloaded as user-readonly and then relaxed to all-read. The crc remains user-readonly.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
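[Editorial note] The fix the description implies — propagating permission changes to the hidden crc file while tolerating its absence — can be sketched against a toy in-memory store. The map-based model and method names below are illustrative assumptions; the real patch works against ChecksumFileSystem and Hadoop's FsPermission.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the fix: setPermission must touch both the data file and its
// hidden ".name.crc" sibling, and quietly tolerate a missing crc (the
// checksum fs is designed to work with or without crcs present).
public class CrcPermissionModel {
    final Map<String, Integer> perms = new HashMap<>();

    static String crcOf(String file) {
        int slash = file.lastIndexOf('/');
        return file.substring(0, slash + 1) + "." + file.substring(slash + 1) + ".crc";
    }

    void create(String file, int mode, boolean withCrc) {
        perms.put(file, mode);
        if (withCrc) perms.put(crcOf(file), mode);
    }

    void setPermission(String file, int mode) {
        perms.put(file, mode);
        // Apply the same change to the crc, silently ignoring "file not found".
        perms.computeIfPresent(crcOf(file), (k, v) -> mode);
    }
}
```

This mirrors the localizer scenario from the description: a file downloaded user-readonly (0400) and relaxed to all-read (0444) now has its crc relaxed too, instead of the crc staying user-readonly.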
[jira] [Commented] (HADOOP-13047) S3a Forward seek in stream length to be configurable
[ https://issues.apache.org/jira/browse/HADOOP-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254046#comment-15254046 ] Steve Loughran commented on HADOOP-13047: - actually, your metrics setup is better than mine... let me see how I can lift them > S3a Forward seek in stream length to be configurable > > > Key: HADOOP-13047 > URL: https://issues.apache.org/jira/browse/HADOOP-13047 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Steve Loughran > Attachments: HADOOP-13047.WIP.patch > > > Even with lazy seek, tests can show that sometimes a short-distance forward > seek is triggering a close + reopen, because the threshold for the seek is > simply available bytes in the inner stream. > A configurable threshold would allow data to be read and discarded before > that seek. This should be beneficial over long-haul networks as the time to > set up the TCP channel is high, and TCP-slow-start means that the ramp up of > bandwidth is slow. In such deployments, it will be better to read forward than > re-open, though the exact "best" number will vary with client and endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254020#comment-15254020 ] Matthew Paduano commented on HADOOP-12563: -- I can look at this today/soon. > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serialization, which is hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-13028 started by Steve Loughran. --- > add counter and timer metrics for S3A HTTP & low-level operations > - > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
[ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HADOOP-13028: --- Assignee: Steve Loughran > add counter and timer metrics for S3A HTTP & low-level operations > - > > Key: HADOOP-13028 > URL: https://issues.apache.org/jira/browse/HADOOP-13028 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, metrics >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > against S3 (and other object stores), opening connections can be expensive, > closing connections may be expensive (a sign of a regression). > S3A FS and individual input streams should have counters of the # of > open/close/failure+reconnect operations, timers of how long things take. This > can be used downstream to measure efficiency of the code (how often > connections are being made), connection reliability, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
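[Editorial note] One shape such per-stream instrumentation can take is a bundle of atomic counters. The field names below echo the statistics visible later in this thread's test output ({{streamOpened}}, {{streamAborted}}, {{bytesSkippedOnSeek}}, …), but the class itself is a hypothetical sketch, not the committed patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-stream counters for open/close/reconnect events.
// AtomicLong keeps the counters safe if the stream is probed from another
// thread while I/O is in progress.
public class StreamStatistics {
    final AtomicLong streamOpened = new AtomicLong();
    final AtomicLong streamCloseOperations = new AtomicLong();
    final AtomicLong streamAborted = new AtomicLong();
    final AtomicLong forwardSeekOperations = new AtomicLong();
    final AtomicLong backwardSeekOperations = new AtomicLong();
    final AtomicLong bytesSkippedOnSeek = new AtomicLong();

    void opened() { streamOpened.incrementAndGet(); }

    void closed(boolean aborted) {
        streamCloseOperations.incrementAndGet();
        if (aborted) streamAborted.incrementAndGet();
    }

    void seekForward(long skipped) {
        forwardSeekOperations.incrementAndGet();
        bytesSkippedOnSeek.addAndGet(skipped);
    }

    @Override
    public String toString() {
        return "statistics{streamOpened=" + streamOpened
             + ", streamCloseOperations=" + streamCloseOperations
             + ", streamAborted=" + streamAborted
             + ", forwardSeekOperations=" + forwardSeekOperations
             + ", backwardSeekOperations=" + backwardSeekOperations
             + ", bytesSkippedOnSeek=" + bytesSkippedOnSeek + "}";
    }
}
```

Exposing toString() on the stream is what makes the downstream analysis possible: a test can open, seek, and read, then assert on how many reopens actually happened.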
[jira] [Commented] (HADOOP-12751) While using kerberos Hadoop incorrectly assumes names with '@' to be non-simple
[ https://issues.apache.org/jira/browse/HADOOP-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253974#comment-15253974 ] Bolke de Bruin commented on HADOOP-12751: - Thanks. Normally I do have a VM for this, just not now, and I (wrongly) thought the tests would be a bit easier on me. The code is in production with us. > While using kerberos Hadoop incorrectly assumes names with '@' to be > non-simple > --- > > Key: HADOOP-12751 > URL: https://issues.apache.org/jira/browse/HADOOP-12751 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.7.2 > Environment: kerberos >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin >Priority: Critical > Labels: kerberos > Attachments: 0001-HADOOP-12751-leave-user-validation-to-os.patch, > 0001-Remove-check-for-user-name-characters-and.patch, > 0002-HADOOP-12751-leave-user-validation-to-os.patch, > 0003-HADOOP-12751-leave-user-validation-to-os.patch, > 0004-HADOOP-12751-leave-user-validation-to-os.patch, > 0005-HADOOP-12751-leave-user-validation-to-os.patch, > 0006-HADOOP-12751-leave-user-validation-to-os.patch, > 0007-HADOOP-12751-leave-user-validation-to-os.patch, > 0007-HADOOP-12751-leave-user-validation-to-os.patch, > 0008-HADOOP-12751-leave-user-validation-to-os.patch > > > In the scenario of a trust between two directories, e.g. FreeIPA (ipa.local) > and Active Directory (ad.local), users can be made available on the OS level > by something like sssd. The trusted users will be of the form 'user@ad.local' > while other users will not contain the domain. Executing 'id -Gn > user@ad.local' will successfully return the groups the user belongs to if > configured correctly. > However, it is assumed by Hadoop that user names containing '@' cannot be > correct. This code is in KerberosName.java and seems to be a validator of whether the > 'auth_to_local' rules are applied correctly. 
> In my opinion this should be removed or changed to a different kind of check > or maybe logged as a warning while still proceeding, as the current behavior > limits integration possibilities with other standard tools. > Workarounds are difficult to apply (by having a rewrite by system tools to, for > example, user_ad_local) due to downstream consequences. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
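[Editorial note] The check being objected to amounts to rejecting any short name that still contains '@' or '/' after the auth_to_local rules run; the proposed direction is to accept the name and let OS lookups such as 'id -Gn user@ad.local' decide. Both methods below are simplified stand-ins for KerberosName's actual rule engine, shown only to contrast the two behaviors.

```java
// Simplified contrast of the two behaviors. Neither method is the literal
// KerberosName code; they are stand-ins for the validation at issue.
public class SimpleNameCheck {
    // Current behavior (simplified): a short name containing '@' or '/'
    // is treated as "non-simple" and rejected.
    static boolean isSimple(String shortName) {
        return !shortName.contains("@") && !shortName.contains("/");
    }

    // Proposed behavior: pass the name through unchanged and leave
    // existence/validity checks to the operating system (e.g. sssd).
    static String leaveToOs(String shortName) {
        return shortName;
    }
}
```

Under the current rule, the trusted user 'user@ad.local' from the description fails validation even though the OS resolves it fine; under the relaxed rule it simply flows through.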
[jira] [Updated] (HADOOP-12291) Add support for nested groups in LdapGroupsMapping
[ https://issues.apache.org/jira/browse/HADOOP-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esther Kundin updated HADOOP-12291: --- Status: Open (was: Patch Available) Working on a fix. > Add support for nested groups in LdapGroupsMapping > -- > > Key: HADOOP-12291 > URL: https://issues.apache.org/jira/browse/HADOOP-12291 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Gautam Gopalakrishnan >Assignee: Esther Kundin > Labels: features, patch > Attachments: HADOOP-12291.001.patch > > > When using {{LdapGroupsMapping}} with Hadoop, nested groups are not > supported. So for example if user {{jdoe}} is part of group A which is a > member of group B, the group mapping currently returns only group A. > Currently this facility is available with {{ShellBasedUnixGroupsMapping}} and > SSSD (or similar tools) but would be good to have this feature as part of > {{LdapGroupsMapping}} directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13047) S3a Forward seek in stream length to be configurable
[ https://issues.apache.org/jira/browse/HADOOP-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253961#comment-15253961 ] Steve Loughran commented on HADOOP-13047: - Rajesh, here's the output of my instrumented test run, showing the stats collected. These statistics will be accessible at the stream level, so can be used to test whether the extended readahead has actually worked. I also add the bytes skipped to the count of bytes read, so you can implicitly work out if a stream skipped, though it's a lot trickier than just getting the stats on {{bytesSkippedOnSeek}} {code} = TEST OUTPUT FOR o.a.s.cloud.s3.S3aIOSuite: 'SeekReadFully: Cost of seek and ReadFully' = 2016-04-22 14:46:49,509 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of stat = 220,789,000 ns 2016-04-22 14:46:49,704 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of open = 195,051,000 ns 2016-04-22 14:46:49,705 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:50,136 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 0] = 430,721,000 ns 2016-04-22 14:46:50,137 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=1 contentLength=20314850 statistics{streamAborted=0, streamOpened=1, streamCloseOperations=0, backwardSeekOperations=0, streamSeekOperations=0, streamClosed=0, readExceptions=0, forwardSeekOperations=0, bytesSkippedOnSeek=0}} 2016-04-22 14:46:50,137 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:50,137 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 1] = 22,000 ns 2016-04-22 14:46:50,138 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256 contentLength=20314850 statistics{streamAborted=0, streamOpened=1, streamCloseOperations=0, backwardSeekOperations=0, streamSeekOperations=0, streamClosed=0, readExceptions=0, 
forwardSeekOperations=0, bytesSkippedOnSeek=0}} 2016-04-22 14:46:50,138 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:50,138 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(256) [pos = 256] = 17,000 ns 2016-04-22 14:46:50,139 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=256 contentLength=20314850 statistics{streamAborted=0, streamOpened=1, streamCloseOperations=0, backwardSeekOperations=0, streamSeekOperations=0, streamClosed=0, readExceptions=0, forwardSeekOperations=0, bytesSkippedOnSeek=0}} 2016-04-22 14:46:50,140 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:50,141 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of seek(EOF-2) [pos = 256] = 25,000 ns 2016-04-22 14:46:50,141 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=1 nextReadPos=20314848 contentLength=20314850 statistics{streamAborted=0, streamOpened=1, streamCloseOperations=0, backwardSeekOperations=0, streamSeekOperations=0, streamClosed=0, readExceptions=0, forwardSeekOperations=0, bytesSkippedOnSeek=0}} 2016-04-22 14:46:50,142 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:51,069 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of read() [pos = 20314848] = 927,211,000 ns 2016-04-22 14:46:51,070 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=20314849 nextReadPos=20314849 contentLength=20314850 statistics{streamAborted=1, streamOpened=2, streamCloseOperations=1, backwardSeekOperations=0, streamSeekOperations=1, streamClosed=0, readExceptions=0, forwardSeekOperations=1, bytesSkippedOnSeek=0}} 2016-04-22 14:46:51,070 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:51,460 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1, byte[1]) [pos = 20314849] = 389,682,000 ns 2016-04-22 14:46:51,461 INFO 
s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=2 nextReadPos=20314849 contentLength=20314850 statistics{streamAborted=1, streamOpened=3, streamCloseOperations=2, backwardSeekOperations=1, streamSeekOperations=2, streamClosed=1, readExceptions=0, forwardSeekOperations=1, bytesSkippedOnSeek=0}} 2016-04-22 14:46:51,461 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - 2016-04-22 14:46:52,583 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - Duration of readFully(1, byte[256]) [pos = 20314849] = 1,121,839,000 ns 2016-04-22 14:46:52,583 INFO s3.S3aIOSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz pos=257 nextReadPos=20314849 contentLength=20314850 statistics{streamAborted=2, streamOpened=4, streamCloseOperations=3, backwardSeekOperations=2, streamSeekOperations=3, streamClosed=1,
[jira] [Commented] (HADOOP-13047) S3a Forward seek in stream length to be configurable
[ https://issues.apache.org/jira/browse/HADOOP-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253934#comment-15253934 ] Steve Loughran commented on HADOOP-13047: - thanks. I'm doing the low-level metrics patch right now; even though the two patches address different things, they won't merge cleanly. Just a warning... If you look at SwiftNativeInputStream, it reads ahead an arbitrary number of bytes; we could do something like that > S3a Forward seek in stream length to be configurable > > > Key: HADOOP-13047 > URL: https://issues.apache.org/jira/browse/HADOOP-13047 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Steve Loughran > Attachments: HADOOP-13047.WIP.patch > > > Even with lazy seek, tests can show that sometimes a short-distance forward > seek is triggering a close + reopen, because the threshold for the seek is > simply available bytes in the inner stream. > A configurable threshold would allow data to be read and discarded before > that seek. This should be beneficial over long-haul networks as the time to > set up the TCP channel is high, and TCP-slow-start means that the ramp up of > bandwidth is slow. In such deployments, it will be better to read forward than > re-open, though the exact "best" number will vary with client and endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12767) update apache httpclient version to the latest 4.5 for security
[ https://issues.apache.org/jira/browse/HADOOP-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253905#comment-15253905 ] Masatake Iwasaki commented on HADOOP-12767: --- [~artem.aliev] and [~jojochuang], can you address the checkstyle warnings and deprecation warnings? {noformat} [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[66,37] org.apache.http.client.params.ClientPNames in org.apache.http.client.params has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[67,37] org.apache.http.client.params.CookiePolicy in org.apache.http.client.params has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[69,35] org.apache.http.conn.params.ConnRoutePNames in org.apache.http.conn.params has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[71,35] org.apache.http.impl.client.DefaultHttpClient in org.apache.http.impl.client has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[184,5] org.apache.http.impl.client.DefaultHttpClient in org.apache.http.impl.client has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[184,36] org.apache.http.impl.client.DefaultHttpClient in org.apache.http.impl.client has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[189,30] org.apache.http.client.params.ClientPNames in org.apache.http.client.params has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[187,23] org.apache.http.client.params.ClientPNames in org.apache.http.client.params has been deprecated [WARNING] .../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[188,13] org.apache.http.client.params.CookiePolicy in org.apache.http.client.params has been deprecated [WARNING] 
.../org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java:[198,23] org.apache.http.conn.params.ConnRoutePNames in org.apache.http.conn.params has been deprecated {noformat} > update apache httpclient version to the latest 4.5 for security > --- > > Key: HADOOP-12767 > URL: https://issues.apache.org/jira/browse/HADOOP-12767 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.7.2 >Reporter: Artem Aliev >Assignee: Artem Aliev > Attachments: HADOOP-12767-branch-2.004.patch, HADOOP-12767.001.patch, > HADOOP-12767.002.patch, HADOOP-12767.003.patch, HADOOP-12767.004.patch > > > Various SSL security fixes are needed. See: CVE-2012-6153, CVE-2011-4461, > CVE-2014-3577, CVE-2015-5262. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
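All of these warnings come from httpclient 4.2-era configuration APIs that 4.3 replaces with builders. A migration sketch for this style of usage (shown rather than compiled here, since it assumes httpclient 4.3+ on the classpath; the factory class and method names are illustrative, not from WebAppProxyServlet):

```java
// Before (deprecated since httpclient 4.3): params-based configuration.
//   DefaultHttpClient client = new DefaultHttpClient();
//   client.getParams().setParameter(ClientPNames.COOKIE_POLICY,
//       CookiePolicy.BROWSER_COMPATIBILITY);
//   client.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);

// After: immutable RequestConfig plus HttpClientBuilder.
import org.apache.http.HttpHost;
import org.apache.http.client.config.CookieSpecs;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class ProxyClientFactory {
    public static CloseableHttpClient newClient(HttpHost proxy) {
        RequestConfig config = RequestConfig.custom()
            .setCookieSpec(CookieSpecs.BROWSER_COMPATIBILITY) // replaces CookiePolicy param
            .setProxy(proxy)                                  // replaces ConnRoutePNames param
            .build();
        return HttpClients.custom()                           // replaces DefaultHttpClient
            .setDefaultRequestConfig(config)
            .build();
    }
}
```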
[jira] [Updated] (HADOOP-13047) S3a Forward seek in stream length to be configurable
[ https://issues.apache.org/jira/browse/HADOOP-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HADOOP-13047: -- Attachment: HADOOP-13047.WIP.patch Attaching a high-level WIP patch. Based on statistics gathered on the amount of data read so far and the time taken to connect, it should be possible to determine whether to establish a new connection or to read from the existing stream (as in the case you pointed out earlier). The WIP patch tries to address this scenario. It might not be possible to use something like Hadoop's ReadAheadPool directly, as that is based on FileDescriptor. > S3a Forward seek in stream length to be configurable > > > Key: HADOOP-13047 > URL: https://issues.apache.org/jira/browse/HADOOP-13047 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Steve Loughran > Attachments: HADOOP-13047.WIP.patch > > > Even with lazy seek, tests can show that sometimes a short-distance forward > seek is triggering a close + reopen, because the threshold for the seek is > simply available bytes in the inner stream. > A configurable threshold would allow data to be read and discarded before > that seek. This should be beneficial over long-haul networks as the time to > set up the TCP channel is high, and TCP slow start means that the ramp-up of > bandwidth is slow. In such deployments, it will be better to read forward than > re-open, though the exact "best" number will vary with client and endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
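The decision being tuned can be sketched as a pure policy function (the name, signature, and threshold handling are illustrative, not taken from the WIP patch; the real threshold would come from the proposed configuration option):

```java
// Illustrative policy only, not the S3A code: decide whether a forward seek
// should be satisfied by reading and discarding bytes on the already-open
// stream, or by closing it and reopening at the new offset.
public class ForwardSeekPolicy {
    /**
     * @param seekDistance bytes from the current position to the seek target
     * @param threshold    maximum bytes worth reading and discarding
     * @return true to read forward on the existing stream, false to reopen
     */
    public static boolean readForward(long seekDistance, long threshold) {
        // Backwards seeks (negative distance) always need a reopen.
        return seekDistance >= 0 && seekDistance <= threshold;
    }

    public static void main(String[] args) {
        // Short hop: draining beats TCP setup plus slow start on long-haul links.
        assert readForward(16 * 1024, 64 * 1024);
        // Long hop: cheaper to reopen at the target offset.
        assert !readForward(10L * 1024 * 1024, 64 * 1024);
        System.out.println("ok");
    }
}
```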
[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths
[ https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-13051: - Attachment: HDFS-13051.000.patch Patch that fails in {{branch-2}} but passes in trunk after the mentioned fix. > Test for special characters in path being respected during globPaths > > > Key: HADOOP-13051 > URL: https://issues.apache.org/jira/browse/HADOOP-13051 > Project: Hadoop Common > Issue Type: Test > Components: fs >Affects Versions: 3.0.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Attachments: HDFS-13051.000.patch > > > On {{branch-2}}, the below is the (incorrect) behaviour today, where paths > with special characters get dropped during globStatus calls: > {code} > bin/hdfs dfs -mkdir /foo > bin/hdfs dfs -touchz /foo/foo1 > bin/hdfs dfs -touchz $'/foo/foo1\r' > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > {code} > Whereas trunk has the right behaviour, subtly fixed via the pattern library > change of HADOOP-12436: > {code} > bin/hdfs dfs -mkdir /foo > bin/hdfs dfs -touchz /foo/foo1 > bin/hdfs dfs -touchz $'/foo/foo1\r' > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > {code} > (I've placed a ^M explicitly to indicate presence of the intentional hidden > character) > We should still add a simple test-case to cover this situation for future > regressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths
[ https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HADOOP-13051: - Status: Patch Available (was: Open) > Test for special characters in path being respected during globPaths > > > Key: HADOOP-13051 > URL: https://issues.apache.org/jira/browse/HADOOP-13051 > Project: Hadoop Common > Issue Type: Test > Components: fs >Affects Versions: 3.0.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Attachments: HDFS-13051.000.patch > > > On {{branch-2}}, the below is the (incorrect) behaviour today, where paths > with special characters get dropped during globStatus calls: > {code} > bin/hdfs dfs -mkdir /foo > bin/hdfs dfs -touchz /foo/foo1 > bin/hdfs dfs -touchz $'/foo/foo1\r' > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > {code} > Whereas trunk has the right behaviour, subtly fixed via the pattern library > change of HADOOP-12436: > {code} > bin/hdfs dfs -mkdir /foo > bin/hdfs dfs -touchz /foo/foo1 > bin/hdfs dfs -touchz $'/foo/foo1\r' > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > bin/hdfs dfs -ls '/foo/*' > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 > -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M > {code} > (I've placed a ^M explicitly to indicate presence of the intentional hidden > character) > We should still add a simple test-case to cover this situation for future > regressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-13051) Test for special characters in path being respected during globPaths
Harsh J created HADOOP-13051: Summary: Test for special characters in path being respected during globPaths Key: HADOOP-13051 URL: https://issues.apache.org/jira/browse/HADOOP-13051 Project: Hadoop Common Issue Type: Test Components: fs Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor On {{branch-2}}, the below is the (incorrect) behaviour today, where paths with special characters get dropped during globStatus calls: {code} bin/hdfs dfs -mkdir /foo bin/hdfs dfs -touchz /foo/foo1 bin/hdfs dfs -touchz $'/foo/foo1\r' bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 {code} Whereas trunk has the right behaviour, subtly fixed via the pattern library change of HADOOP-12436: {code} bin/hdfs dfs -mkdir /foo bin/hdfs dfs -touchz /foo/foo1 bin/hdfs dfs -touchz $'/foo/foo1\r' bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M {code} (I've placed a ^M explicitly to indicate presence of the intentional hidden character) We should still add a simple test-case to cover this situation for future regressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
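The behaviour the test would guard against can be reproduced outside Hadoop with plain java.nio (illustration only; a real test for this JIRA would go through Hadoop's FileSystem.globStatus rather than DirectoryStream):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Standalone sketch: a file whose name ends in a carriage return must still
// be matched by "*"; the bug was that such names were silently dropped.
public class GlobSpecialChars {
    /** List file names in dir matching the glob pattern, e.g. "*". */
    public static List<String> matches(Path dir, String glob) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Temp dir holding "foo1" and "foo1\r" (note the trailing carriage return). */
    public static Path fixture() {
        try {
            Path dir = Files.createTempDirectory("glob");
            Files.createFile(dir.resolve("foo1"));
            Files.createFile(dir.resolve("foo1\r"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> names = matches(fixture(), "*");
        assert names.contains("foo1");
        assert names.contains("foo1\r"); // the CR-suffixed name must not be dropped
        System.out.println("ok");
    }
}
```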
[jira] [Commented] (HADOOP-12844) Recover when S3A fails on IOException in read()
[ https://issues.apache.org/jira/browse/HADOOP-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253785#comment-15253785 ] Steve Loughran commented on HADOOP-12844: - I've taken this over. It's actually simpler than this patch: all that's needed is for the existing clause catching some socket exceptions to be expanded to catch any IOE, and then log and retry. This is common code which can be shared by both read operations. What is important is to catch and respond to EOF exceptions before the generic IOE clause. > Recover when S3A fails on IOException in read() > --- > > Key: HADOOP-12844 > URL: https://issues.apache.org/jira/browse/HADOOP-12844 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.7.1, 2.7.2 >Reporter: Pieter Reuse >Assignee: Pieter Reuse > Attachments: HADOOP-12844.001.patch > > > This simple patch catches IOExceptions in S3AInputStream.read(byte[] buf, int > off, int len) and reopens the connection at the same location as it was > before the exception. > This is similar to the functionality introduced in S3N in > [HADOOP-6254|https://issues.apache.org/jira/browse/HADOOP-6254], for exactly > the same reason. > Patch developed in cooperation with [~emres]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
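The structure described in the comment can be sketched as follows (illustrative, not the committed S3A code; ByteSource and readOrFail are hypothetical names introduced for the demo):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class RetryingRead {
    /** Stand-in for the wrapped S3 object stream. */
    interface ByteSource {
        int read() throws IOException;
    }

    /**
     * One reopen-and-retry on a generic IOException; EOFException is caught
     * first and mapped to -1, since retrying at end-of-stream would loop.
     */
    static int readWithRetry(ByteSource in, Runnable reopen) throws IOException {
        try {
            return in.read();
        } catch (EOFException e) {
            return -1;            // end of object: not retryable
        } catch (IOException e) {
            reopen.run();         // log + reopen at the same offset
            return in.read();     // single retry; a second failure propagates
        }
    }

    /** Convenience wrapper so demos and tests avoid checked exceptions. */
    static int readOrFail(ByteSource in, Runnable reopen) {
        try {
            return readWithRetry(in, reopen);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ByteSource flaky = () -> {
            if (calls[0]++ == 0) {
                throw new IOException("connection reset");
            }
            return 42;
        };
        assert readOrFail(flaky, () -> { }) == 42;  // recovered on retry
        assert readOrFail(() -> { throw new EOFException(); }, () -> { }) == -1;
        System.out.println("ok");
    }
}
```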
[jira] [Updated] (HADOOP-12844) Recover when S3A fails on IOException in read()
[ https://issues.apache.org/jira/browse/HADOOP-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-12844: Summary: Recover when S3A fails on IOException in read() (was: Recover when S3A fails on IOException) > Recover when S3A fails on IOException in read() > --- > > Key: HADOOP-12844 > URL: https://issues.apache.org/jira/browse/HADOOP-12844 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.7.1, 2.7.2 >Reporter: Pieter Reuse >Assignee: Pieter Reuse > Attachments: HADOOP-12844.001.patch > > > This simple patch catches IOExceptions in S3AInputStream.read(byte[] buf, int > off, int len) and reopens the connection on the same location as it was > before the exception. > This is similar to the functionality introduced in S3N in > [HADOOP-6254|https://issues.apache.org/jira/browse/HADOOP-6254], for exactly > the same reason. > Patch developed in cooperation with [~emres]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13044) Amazon S3 library 10.10.60+ (JDK8u60+) depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13044: Summary: Amazon S3 library 10.10.60+ (JDK8u60+) depends on http components 4.3 (was: Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3) > Amazon S3 library 10.10.60+ (JDK8u60+) depends on http components 4.3 > - > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 > Environment: JDK 8u60 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > In case of using the AWS SDK in the classpath of Hadoop, we faced an issue caused > by incompatibility of the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HADOOP-13009) add option for lazy open() on s3a
[ https://issues.apache.org/jira/browse/HADOOP-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-13009. - Resolution: Invalid Fix Version/s: 2.8.0 S3A doesn't open the input stream in open(); it never has. There's a getFileStatus(), which is needed to determine content length and fail if the file is missing... this is one HTTP connection, which is then shut down. > add option for lazy open() on s3a > - > > Key: HADOOP-13009 > URL: https://issues.apache.org/jira/browse/HADOOP-13009 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.8.0 >Reporter: Steve Loughran > Fix For: 2.8.0 > > > After lazy seek, I want to add a (very much non-default) lazy-open option. > If you look at a trace of what goes on with object store access, there's > usually a GET at offset 0 (the {{open()}} call), followed by a {{seek()}}. > If there were a lazy-open option, then {{open()}} would set up the instance > for reading, but not actually talk to the object store; it'd be the first > seek or read which would hit the service. You'd eliminate one HTTP operation > from a read sequence, for a faster startup time, especially long-haul. > That's a big break in the normal assumption: if a file isn't there, > {{open()}} fails, so it'd only work with apps which did open+read, open+seek, > or open + positioned-readable actions back to back. By making it an option, > people can experiment to see what happens, though full testing would need to > do some fault injection on the first seek/read to see how the code handled late > failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
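The proposed (and here rejected) behaviour can be sketched as a wrapper that defers the connect to first use; this class is purely hypothetical, written only to show the trade-off between the saved round trip and the late failure:

```java
// Hypothetical lazy-open stream: construction records state but issues no
// remote call; the first read or seek performs the (simulated) connect.
public class LazyOpenStream {
    private final String path;
    private boolean opened;
    private int opensIssued; // counts simulated HTTP connects, for illustration

    LazyOpenStream(String path) {
        this.path = path;    // open() equivalent: no remote call here
    }

    private void ensureOpen() {
        if (!opened) {
            opened = true;
            opensIssued++;   // the real stream would issue its GET here,
        }                    // and a missing file would only fail now
    }

    int read() {
        ensureOpen();
        return 0;            // stub payload
    }

    void seek(long pos) {
        ensureOpen();        // first seek triggers the connect too
    }

    int opensIssued() {
        return opensIssued;
    }

    public static void main(String[] args) {
        LazyOpenStream s = new LazyOpenStream("s3a://bucket/key");
        assert s.opensIssued() == 0; // constructing talked to nothing
        s.seek(100);
        s.read();
        assert s.opensIssued() == 1; // exactly one connect, at first use
        System.out.println("ok");
    }
}
```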
[jira] [Commented] (HADOOP-12891) S3AFileSystem should configure Multipart Copy threshold and chunk size
[ https://issues.apache.org/jira/browse/HADOOP-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253720#comment-15253720 ] Hudson commented on HADOOP-12891: - FAILURE: Integrated in Hadoop-trunk-Commit #9653 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9653/]) HADOOP-12891. S3AFileSystem should configure Multipart Copy threshold (stevel: rev 19f0f9608e31203523943f008ac701b6f3d7973c) * hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md * hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java * hadoop-common-project/hadoop-common/src/main/resources/core-default.xml > S3AFileSystem should configure Multipart Copy threshold and chunk size > -- > > Key: HADOOP-12891 > URL: https://issues.apache.org/jira/browse/HADOOP-12891 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 2.7.2 >Reporter: Andrew Olson >Assignee: Andrew Olson > Fix For: 2.8.0 > > Attachments: HADOOP-12891-001.patch, HADOOP-12891-002.patch > > > In the AWS S3 Java SDK the defaults for Multipart Copy threshold and chunk > size are very high [1], > {noformat} > /** Default size threshold for Amazon S3 object after which multi-part > copy is initiated. */ > private static final long DEFAULT_MULTIPART_COPY_THRESHOLD = 5 * GB; > /** Default minimum size of each part for multi-part copy. */ > private static final long DEFAULT_MINIMUM_COPY_PART_SIZE = 100 * MB; > {noformat} > In internal testing we have found that a lower but still reasonable threshold > and chunk size can be extremely beneficial. In our case we set both the > threshold and size to 25 MB with good results. > Amazon enforces a minimum of 5 MB [2]. > For the S3A filesystem, file renames are actually implemented via a remote > copy request, which is already quite slow compared to a rename on HDFS. 
This > very high threshold for utilizing the multipart functionality can make the > performance considerably worse, particularly for files in the 100MB to 5GB > range which is fairly typical for mapreduce job outputs. > Two apparent options are: > 1) Use the same configuration ({{fs.s3a.multipart.threshold}}, > {{fs.s3a.multipart.size}}) for both. This seems preferable as the > accompanying documentation [3] for these configuration properties actually > already says that they are applicable for either "uploads or copies". We just > need to add in the missing > {{TransferManagerConfiguration#setMultipartCopyThreshold}} [4] and > {{TransferManagerConfiguration#setMultipartCopyPartSize}} [5] calls at [6] > like: > {noformat} > /* Handle copies in the same way as uploads. */ > transferConfiguration.setMultipartCopyPartSize(partSize); > transferConfiguration.setMultipartCopyThreshold(multiPartThreshold); > {noformat} > 2) Add two new configuration properties so that the copy threshold and part > size can be independently configured, maybe change the defaults to be lower > than Amazon's, set into {{TransferManagerConfiguration}} in the same way. > In any case, at a minimum, if neither of the above options is an acceptable > change, the config documentation should be adjusted to match the code, noting > that {{fs.s3a.multipart.threshold}} and {{fs.s3a.multipart.size}} are > applicable to uploads of new objects only and not copies (i.e. renaming > objects). 
> [1] > https://github.com/aws/aws-sdk-java/blob/1.10.58/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.java#L36-L40 > [2] http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html > [3] > https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A > [4] > http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyThreshold(long) > [5] > http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyPartSize(long) > [6] > https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L286 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
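Under option 1 above, the existing S3A properties would govern copies as well as uploads, so a deployment could lower both, e.g. towards the 25 MB that worked well in the reporter's testing (values illustrative, not recommendations):

```xml
<!-- core-site.xml sketch. fs.s3a.multipart.threshold and
     fs.s3a.multipart.size are existing S3A properties; under option 1
     they would also apply to the copy behind a rename. -->
<property>
  <name>fs.s3a.multipart.threshold</name>
  <value>26214400</value> <!-- 25 MB -->
</property>
<property>
  <name>fs.s3a.multipart.size</name>
  <value>26214400</value> <!-- 25 MB; Amazon enforces a 5 MB minimum part size -->
</property>
```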
[jira] [Updated] (HADOOP-12751) While using kerberos Hadoop incorrectly assumes names with '@' to be non-simple
[ https://issues.apache.org/jira/browse/HADOOP-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-12751: Status: Patch Available (was: Open) Resubmitting the patch. Bolke: if you want isolation on local dev & test, create a Linux (or even Windows) VM. I have both, including a VM with Kerberos enabled, that being the first step to testing my code in a secure environment. > While using kerberos Hadoop incorrectly assumes names with '@' to be > non-simple > --- > > Key: HADOOP-12751 > URL: https://issues.apache.org/jira/browse/HADOOP-12751 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.7.2 > Environment: kerberos >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin >Priority: Critical > Labels: kerberos > Attachments: 0001-HADOOP-12751-leave-user-validation-to-os.patch, > 0001-Remove-check-for-user-name-characters-and.patch, > 0002-HADOOP-12751-leave-user-validation-to-os.patch, > 0003-HADOOP-12751-leave-user-validation-to-os.patch, > 0004-HADOOP-12751-leave-user-validation-to-os.patch, > 0005-HADOOP-12751-leave-user-validation-to-os.patch, > 0006-HADOOP-12751-leave-user-validation-to-os.patch, > 0007-HADOOP-12751-leave-user-validation-to-os.patch, > 0007-HADOOP-12751-leave-user-validation-to-os.patch, > 0008-HADOOP-12751-leave-user-validation-to-os.patch > > > In the scenario of a trust between two directories, e.g. FreeIPA (ipa.local) > and Active Directory (ad.local), users can be made available on the OS level > by something like sssd. The trusted users will be of the form 'user@ad.local' > while other users will not contain the domain. Executing 'id -Gn > user@ad.local' will successfully return the groups the user belongs to if > configured correctly. > However, it is assumed by Hadoop that user names containing '@' cannot be > correct. This code is in KerberosName.java and seems to be a validator checking > whether the 'auth_to_local' rules were applied correctly. 
> In my opinion this should be removed, or changed to a different kind of check, > or maybe logged as a warning while still proceeding, as the current behavior > limits integration possibilities with other standard tools. > Workarounds are difficult to apply (e.g. by having system tools rewrite the > name to user_ad_local) due to downstream consequences. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
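The disputed validation can be paraphrased as follows (illustrative, not the exact KerberosName code) to show why a trusted-domain user is rejected:

```java
public class SimpleNameCheck {
    /**
     * Paraphrase of the disputed check (not the actual Hadoop code): a
     * translated short name still containing '@' or '/' is treated as
     * non-simple, i.e. the auth_to_local rules failed to flatten it.
     */
    static boolean isSimple(String shortName) {
        return !shortName.contains("@") && !shortName.contains("/");
    }

    public static void main(String[] args) {
        assert isSimple("harsh");
        // A trusted-domain OS user such as 'user@ad.local' resolves fine via
        // 'id -Gn', yet a check of this shape rejects it:
        assert !isSimple("user@ad.local");
        System.out.println("ok");
    }
}
```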
[jira] [Updated] (HADOOP-12891) S3AFileSystem should configure Multipart Copy threshold and chunk size
[ https://issues.apache.org/jira/browse/HADOOP-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-12891: Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) +1 committed —thanks for your contribution here Andrew. Please check out and build the 2.8 branch and make sure all works well; we've been doing some other changes too —finding problems early would be invaluable to us > S3AFileSystem should configure Multipart Copy threshold and chunk size > -- > > Key: HADOOP-12891 > URL: https://issues.apache.org/jira/browse/HADOOP-12891 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 2.7.2 >Reporter: Andrew Olson >Assignee: Andrew Olson > Fix For: 2.8.0 > > Attachments: HADOOP-12891-001.patch, HADOOP-12891-002.patch > > > In the AWS S3 Java SDK the defaults for Multipart Copy threshold and chunk > size are very high [1], > {noformat} > /** Default size threshold for Amazon S3 object after which multi-part > copy is initiated. */ > private static final long DEFAULT_MULTIPART_COPY_THRESHOLD = 5 * GB; > /** Default minimum size of each part for multi-part copy. */ > private static final long DEFAULT_MINIMUM_COPY_PART_SIZE = 100 * MB; > {noformat} > In internal testing we have found that a lower but still reasonable threshold > and chunk size can be extremely beneficial. In our case we set both the > threshold and size to 25 MB with good results. > Amazon enforces a minimum of 5 MB [2]. > For the S3A filesystem, file renames are actually implemented via a remote > copy request, which is already quite slow compared to a rename on HDFS. This > very high threshold for utilizing the multipart functionality can make the > performance considerably worse, particularly for files in the 100MB to 5GB > range which is fairly typical for mapreduce job outputs. 
> Two apparent options are: > 1) Use the same configuration ({{fs.s3a.multipart.threshold}}, > {{fs.s3a.multipart.size}}) for both. This seems preferable as the > accompanying documentation [3] for these configuration properties actually > already says that they are applicable for either "uploads or copies". We just > need to add in the missing > {{TransferManagerConfiguration#setMultipartCopyThreshold}} [4] and > {{TransferManagerConfiguration#setMultipartCopyPartSize}} [5] calls at [6] > like: > {noformat} > /* Handle copies in the same way as uploads. */ > transferConfiguration.setMultipartCopyPartSize(partSize); > transferConfiguration.setMultipartCopyThreshold(multiPartThreshold); > {noformat} > 2) Add two new configuration properties so that the copy threshold and part > size can be independently configured, maybe change the defaults to be lower > than Amazon's, set into {{TransferManagerConfiguration}} in the same way. > In any case, at a minimum, if neither of the above options is an acceptable > change, the config documentation should be adjusted to match the code, noting > that {{fs.s3a.multipart.threshold}} and {{fs.s3a.multipart.size}} are > applicable to uploads of new objects only and not copies (i.e. renaming > objects). 
> [1] > https://github.com/aws/aws-sdk-java/blob/1.10.58/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.java#L36-L40 > [2] http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html > [3] > https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#S3A > [4] > http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyThreshold(long) > [5] > http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManagerConfiguration.html#setMultipartCopyPartSize(long) > [6] > https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L286 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13044) Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253634#comment-15253634 ] Steve Loughran commented on HADOOP-13044: - Looks like 1.10.6 needs 4.3.6 too. I was thinking "maybe we could just bump up Joda-Time", but it's possible that the httpclient problem already exists... it just hasn't surfaced yet.
{code}
[INFO] +- com.amazonaws:aws-java-sdk-s3:jar:1.10.6:compile
[INFO] |  +- com.amazonaws:aws-java-sdk-kms:jar:1.10.6:compile
[INFO] |  |  \- (com.amazonaws:aws-java-sdk-core:jar:1.10.6:compile - omitted for duplicate)
[INFO] |  \- com.amazonaws:aws-java-sdk-core:jar:1.10.6:compile
[INFO] |     +- (commons-logging:commons-logging:jar:1.1.3:compile - version managed from 1.1.1; omitted for duplicate)
[INFO] |     +- (org.apache.httpcomponents:httpclient:jar:4.2.5:compile - version managed from 4.3.6; omitted for duplicate)
[INFO] |     +- (com.fasterxml.jackson.core:jackson-databind:jar:2.2.3:compile - version managed from 2.5.3; omitted for duplicate)
[INFO] |     \- joda-time:joda-time:jar:2.8.1:compile
{code}
> Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3 > - > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 > Environment: JDK 8u60 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > In case of using the AWS SDK in the classpath of Hadoop, we faced an issue caused > by incompatibility of the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
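Since Hadoop's parent POM manages httpclient down to 4.2.5, one sketch of a fix (not a committed change) is to raise the managed version to the one aws-java-sdk-core declares, 4.3.6; HADOOP-12767 goes further and proposes 4.5:

```xml
<!-- hadoop-project/pom.xml, dependencyManagement sketch: raise the managed
     httpclient version so the SDK's 4.3.x API (e.g.
     DefaultConnectionKeepAliveStrategy.INSTANCE) is actually on the classpath. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.3.6</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```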
[jira] [Created] (HADOOP-13050) Upgrade to AWS SDK 10.10+ for Java 8u60+
Steve Loughran created HADOOP-13050: --- Summary: Upgrade to AWS SDK 10.10+ for Java 8u60+ Key: HADOOP-13050 URL: https://issues.apache.org/jira/browse/HADOOP-13050 Project: Hadoop Common Issue Type: Sub-task Components: build, fs/s3 Affects Versions: 2.7.2 Reporter: Steve Loughran HADOOP-13044 highlights that the AWS SDK shipping in Hadoop 2.7+ doesn't work on OpenJDK >= 8u60, because a change in the JDK broke the version of Joda-Time that AWS uses. Fix: update the SDK. Though that implies updating httpcomponents: HADOOP-12767. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253624#comment-15253624 ] Bibin A Chundatt commented on HADOOP-12563: --- For cases when {{secretKeysMap}} and {{tokenMap}} are empty, the tests are failing. Will {{writeDelimitedTo}} during write and {{parseDelimitedFrom}} during read help? > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serialization, which is hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
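The reason delimited writes help can be shown without protobuf (a sketch using a plain length prefix; protobuf's writeDelimitedTo/parseDelimitedFrom do the same with a varint length): a serialized message with no secret keys or tokens can be zero bytes long, and a reader relying on "read until EOF" cannot tell an empty record from a truncated file, while a length prefix makes the empty case unambiguous.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class DelimitedRecords {
    static void writeRecord(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length); // length prefix; protobuf uses a varint
        out.write(payload);
    }

    static byte[] readRecord(DataInputStream in) throws IOException {
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return payload;
    }

    /** Write all records, then read them back, to show the framing round-trips. */
    static byte[][] roundTrip(byte[][] records) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            for (byte[] r : records) {
                writeRecord(out, r);
            }
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
            byte[][] back = new byte[records.length][];
            for (int i = 0; i < records.length; i++) {
                back[i] = readRecord(in);
            }
            return back;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] back = roundTrip(new byte[][] {new byte[0], "secret".getBytes()});
        assert back[0].length == 0;   // an empty map is a zero-length record
        assert new String(back[1]).equals("secret");
        System.out.println("ok");
    }
}
```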
[jira] [Commented] (HADOOP-12563) Updated utility to create/modify token files
[ https://issues.apache.org/jira/browse/HADOOP-12563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253607#comment-15253607 ] Steve Loughran commented on HADOOP-12563: - OK, what to do? Ravi, Matt: can you look at this today? Otherwise it should be rolled back to get Jenkins happy, and then resubmitted next week. > Updated utility to create/modify token files > > > Key: HADOOP-12563 > URL: https://issues.apache.org/jira/browse/HADOOP-12563 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Allen Wittenauer >Assignee: Matthew Paduano > Fix For: 3.0.0 > > Attachments: HADOOP-12563.01.patch, HADOOP-12563.02.patch, > HADOOP-12563.03.patch, HADOOP-12563.04.patch, HADOOP-12563.05.patch, > HADOOP-12563.06.patch, HADOOP-12563.07.patch, HADOOP-12563.07.patch, > HADOOP-12563.08.patch, HADOOP-12563.09.patch, HADOOP-12563.10.patch, > HADOOP-12563.11.patch, HADOOP-12563.12.patch, HADOOP-12563.13.patch, > dtutil-test-out, example_dtutil_commands_and_output.txt, > generalized_token_case.pdf > > > hdfs fetchdt is missing some critical features and is geared almost > exclusively towards HDFS operations. Additionally, the token files that are > created use Java serialization, which is hard/impossible to deal with in > other languages. It should be replaced with a better utility in common that > can read/write protobuf-based token files, has enough flexibility to be used > with other services, and offers key functionality such as append and rename. > The old version file format should still be supported for backward > compatibility, but will be effectively deprecated. > A follow-on JIRA will deprecate fetchdt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13018) Make Kdiag check whether hadoop.token.files points to existent and valid files
[ https://issues.apache.org/jira/browse/HADOOP-13018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253598#comment-15253598 ] Steve Loughran commented on HADOOP-13018: - # you can use {{verifyFileIsValid}} to validate the existence/file-ness of the token file; it will return true if the code should carry on with the validation # if there are problems with the file, should the validation just log or exit? The {{verify()}} method makes that decision for conditions; there isn't an equivalent for exception catch & rethrow... this might be the time to add that. I know that the UGI code should now fail meaningfully if there are problems with the token file... if we are confident that this holds, then maybe your log-but-continue policy is the right one. Testing: how about setting things up with an invalid path to the token file and seeing how KDiag reacts? > Make Kdiag check whether hadoop.token.files points to existent and valid files > -- > > Key: HADOOP-13018 > URL: https://issues.apache.org/jira/browse/HADOOP-13018 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HADOOP-13018.01.patch > > > Steve proposed that KDiag should fail fast to help debug the case where > hadoop.token.files points to a file that is not found. This JIRA is to effect that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
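A sketch of the check being discussed (the method names here are hypothetical; KDiag's {{verifyFileIsValid}} is the real entry point mentioned in the comment), showing both policies on the table, fail fast versus log-but-continue:

```java
import java.io.File;
import java.io.FileNotFoundException;

public class TokenFileCheck {
    /** @return null if the path looks like a usable token file, else a message. */
    static String problemWith(String path) {
        File f = new File(path);
        if (!f.exists()) {
            return "token file not found: " + path;
        }
        if (!f.isFile()) {
            return "token path is not a file: " + path;
        }
        if (f.length() == 0) {
            return "token file is empty: " + path;
        }
        return null;
    }

    /** Apply one of the two policies discussed: fail fast, or log and continue. */
    static void check(String path, boolean failFast) throws FileNotFoundException {
        String problem = problemWith(path);
        if (problem == null) {
            return;
        }
        if (failFast) {
            throw new FileNotFoundException(problem);
        }
        System.err.println("WARNING: " + problem); // log-but-continue
    }

    public static void main(String[] args) throws FileNotFoundException {
        assert problemWith("/no/such/token/file") != null;
        check("/no/such/token/file", false); // warns and carries on
        System.out.println("ok");
    }
}
```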
[jira] [Updated] (HADOOP-13044) Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13044: Summary: Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3 (was: Amazon S3 library depends on http components 4.3) > Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3 > - > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > When the AWS SDK is on Hadoop's classpath, we hit an issue caused > by an incompatibility between the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-13044) Amazon S3 library depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253586#comment-15253586 ] Steve Loughran commented on HADOOP-13044: - let's keep this separate; changing the title makes it clear why an upgrade is needed. I assume it's a 1.10 problem; if it were 10.6 I'd have expected a failure to surface already. As for the Java 8 thing, thanks for pointing it out. It looks like JDK 8u60 broke Joda-Time, and transitively the AWS SDK. > Amazon S3 library depends on http components 4.3 > > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > When the AWS SDK is on Hadoop's classpath, we hit an issue caused > by an incompatibility between the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-13044) Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3
[ https://issues.apache.org/jira/browse/HADOOP-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13044: Environment: JDK 8u60 > Amazon S3 library 10.10 (JDK8u60+) depends on http components 4.3 > - > > Key: HADOOP-13044 > URL: https://issues.apache.org/jira/browse/HADOOP-13044 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/s3 >Affects Versions: 2.8.0 > Environment: JDK 8u60 >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: HADOOP-13044.01.patch > > > When the AWS SDK is on Hadoop's classpath, we hit an issue caused > by an incompatibility between the AWS SDK and httpcomponents. > {code} > java.lang.NoSuchFieldError: INSTANCE > at > com.amazonaws.http.conn.SdkConnectionKeepAliveStrategy.getKeepAliveDuration(SdkConnectionKeepAliveStrategy.java:48) > at > org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:535) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) > at > org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) > {code} > The latest AWS SDK depends on 4.3.x, which has > [DefaultConnectionKeepAliveStrategy.INSTANCE|http://hc.apache.org/httpcomponents-client-4.3.x/httpclient/apidocs/org/apache/http/impl/client/DefaultConnectionKeepAliveStrategy.html#INSTANCE]. > This field was introduced in 4.3. > This will allow us to avoid {{CLASSPATH}} conflicts around httpclient > versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
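The {{NoSuchFieldError}} above is the classic symptom of httpclient 4.2.x winning dependency resolution while the AWS SDK was compiled against 4.3.x. One common fix is to align the httpcomponents versions via Maven {{dependencyManagement}}; the fragment below is an illustrative sketch only, and the exact version numbers are assumptions rather than values taken from the Hadoop POM.

```xml
<!-- Illustrative only: pin httpclient/httpcore to the 4.3 line so the
     AWS SDK finds DefaultConnectionKeepAliveStrategy.INSTANCE at runtime.
     Versions here are assumed for the example, not from Hadoop's POM. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.3.6</version>
    </dependency>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpcore</artifactId>
      <version>4.3.3</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Running {{mvn dependency:tree}} before and after is a quick way to confirm which httpclient version actually lands on the classpath.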
[jira] [Commented] (HADOOP-12738) Create unit test to automatically compare Common related classes and core-default.xml
[ https://issues.apache.org/jira/browse/HADOOP-12738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15253525#comment-15253525 ] Masatake Iwasaki commented on HADOOP-12738: --- Thanks for working on this, [~rchiang]. I will review the patch tonight (in JST). > Create unit test to automatically compare Common related classes and > core-default.xml > - > > Key: HADOOP-12738 > URL: https://issues.apache.org/jira/browse/HADOOP-12738 > Project: Hadoop Common > Issue Type: Test >Affects Versions: 2.7.1 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Attachments: HADOOP-12738.001.patch, HADOOP-12738.002.patch > > > Create a unit test that will automatically compare the fields in the various > Common related classes and core-default.xml. It should throw an error if a > property is missing in either the class or the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
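The comparison idea behind this test (gather property keys from a configuration-keys class via reflection, gather {{<name>}} entries from a core-default.xml-style document, and report keys present on only one side) can be sketched as below. This is a minimal stand-alone illustration, not the HADOOP-12738 patch itself; {{DemoKeys}} and its property names are made up for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Sketch of comparing a configuration-keys class against a
// core-default.xml style document. DemoKeys is a stand-in class,
// not one of Hadoop's real *ConfigKeys classes.
public class ConfigXmlCompare {

    // Stand-in for a class holding property-name constants.
    public static class DemoKeys {
        public static final String IO_BUFFER = "io.demo.buffer.size";
        public static final String FS_TRASH = "fs.demo.trash.interval";
    }

    // Collect the values of all static final String fields.
    static Set<String> keysFromClass(Class<?> c) throws Exception {
        Set<String> keys = new HashSet<>();
        for (Field f : c.getDeclaredFields()) {
            int m = f.getModifiers();
            if (Modifier.isStatic(m) && Modifier.isFinal(m)
                    && f.getType() == String.class) {
                keys.add((String) f.get(null));
            }
        }
        return keys;
    }

    // Collect the text of every <name> element in the XML document.
    static Set<String> keysFromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new java.io.ByteArrayInputStream(
                        xml.getBytes(StandardCharsets.UTF_8)));
        NodeList names = doc.getElementsByTagName("name");
        Set<String> keys = new HashSet<>();
        for (int i = 0; i < names.getLength(); i++) {
            keys.add(names.item(i).getTextContent().trim());
        }
        return keys;
    }

    // Keys declared in the class but missing from the XML document.
    public static Set<String> missingFromXml(Class<?> c, String xml) {
        try {
            Set<String> missing = keysFromClass(c);
            missing.removeAll(keysFromXml(xml));
            return missing;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The real test would assert that both "missing from XML" and the symmetric "missing from class" sets are empty, failing the build when a property is added on one side only.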
[jira] [Updated] (HADOOP-13049) Fix the TestFailures After HADOOP-12563
[ https://issues.apache.org/jira/browse/HADOOP-13049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HADOOP-13049: --- Fix Version/s: (was: 3.0.0) > Fix the TestFailures After HADOOP-12563 > --- > > Key: HADOOP-13049 > URL: https://issues.apache.org/jira/browse/HADOOP-13049 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > > The following tests fail after this change: > TestRMContainerAllocator.testRMContainerAllocatorResendsRequestsOnRMRestart:2535 > » IllegalState > TestContainerManagerRecovery.testApplicationRecovery:189->startContainer:511 > » IllegalState > TestContainerManagerRecovery.testContainerCleanupOnShutdown:412->startContainer:511 > » IllegalState > TestContainerManagerRecovery.testContainerResizeRecovery:351->startContainer:511 > » IllegalState > See https://builds.apache.org/job/Hadoop-Yarn-trunk/2051/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)