[jira] [Updated] (HADOOP-14977) Xenial dockerfile needs ant in main build for findbugs
[ https://issues.apache.org/jira/browse/HADOOP-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated HADOOP-14977:
--------------------------------------
    Description:
findbugs doesn't work without ant installed, for whatever reason:
{code}
[warning] /usr/bin/setBugDatabaseInfo: Unable to locate ant in /usr/share/java
[warning] /usr/bin/convertXmlToText: Unable to locate ant in /usr/share/java
[warning] /usr/bin/filterBugs: Unable to locate ant in /usr/share/java
{code}

  was:
findbugs doesn't work without ant installed, for whatever reason:
{code}
[warning] /usr/bin/setBugDatabaseInfo: Unable to locate ant in /usr/share/java
[warning] /usr/bin/convertXmlToText: Unable to locate ant in /usr/share/java
{code}

> Xenial dockerfile needs ant in main build for findbugs
> ------------------------------------------------------
>
>          Key: HADOOP-14977
>          URL: https://issues.apache.org/jira/browse/HADOOP-14977
>      Project: Hadoop Common
>   Issue Type: Bug
>   Components: build
>     Reporter: Allen Wittenauer
>
> findbugs doesn't work without ant installed, for whatever reason:
> {code}
> [warning] /usr/bin/setBugDatabaseInfo: Unable to locate ant in /usr/share/java
> [warning] /usr/bin/convertXmlToText: Unable to locate ant in /usr/share/java
> [warning] /usr/bin/filterBugs: Unable to locate ant in /usr/share/java
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
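The warnings above suggest a one-line package addition. A sketch of what that could look like in the Xenial Dockerfile (hypothetical; the actual RUN layer layout in Hadoop's Dockerfile may differ):

```dockerfile
# Hypothetical addition: Ubuntu's ant package places its jars under
# /usr/share/java, which is where the findbugs wrapper scripts
# (setBugDatabaseInfo, convertXmlToText, filterBugs) look for them.
RUN apt-get -q update \
    && apt-get -q install -y --no-install-recommends ant \
    && apt-get clean
```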
[jira] [Updated] (HADOOP-14977) Xenial dockerfile needs ant in main build for findbugs
[ https://issues.apache.org/jira/browse/HADOOP-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated HADOOP-14977:
--------------------------------------
    Summary: Xenial dockerfile needs ant in main build for findbugs  (was: Xenial dockerfile needs ant)
[jira] [Created] (HADOOP-14977) Xenial dockerfile needs ant
Allen Wittenauer created HADOOP-14977:
--------------------------------------
         Summary: Xenial dockerfile needs ant
             Key: HADOOP-14977
             URL: https://issues.apache.org/jira/browse/HADOOP-14977
         Project: Hadoop Common
      Issue Type: Bug
      Components: build
        Reporter: Allen Wittenauer

findbugs doesn't work without ant installed, for whatever reason:
{code}
[warning] /usr/bin/setBugDatabaseInfo: Unable to locate ant in /usr/share/java
[warning] /usr/bin/convertXmlToText: Unable to locate ant in /usr/share/java
{code}
[jira] [Commented] (HADOOP-14840) Tool to estimate resource requirements of an application pipeline based on prior executions
[ https://issues.apache.org/jira/browse/HADOOP-14840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217910#comment-16217910 ]

Sergiy Matusevych commented on HADOOP-14840:
--------------------------------------------

Looks good. I have a few comments about the style (using more idiomatic Java, etc.), but I will fix the code and send out the patch later, after this update gets merged into the main trunk.

> Tool to estimate resource requirements of an application pipeline based on prior executions
> -------------------------------------------------------------------------------------------
>
>          Key: HADOOP-14840
>          URL: https://issues.apache.org/jira/browse/HADOOP-14840
>      Project: Hadoop Common
>   Issue Type: New Feature
>   Components: tools
>     Reporter: Subru Krishnan
>     Assignee: Rui Li
>     Priority: Blocker
>  Attachments: HADOOP-14840-v1.patch, HADOOP-14840-v2.patch, HADOOP-14840-v3.patch, HADOOP-14840-v4.patch, ResourceEstimator-design-v1.pdf
>
> We have been working on providing SLAs for job execution on Hadoop. At a high level this involves two parts: deriving the resource requirements of a job and guaranteeing the estimated resources at runtime. The {{YARN ReservationSystem}} (YARN-1051/YARN-2572/YARN-5326) enables the latter, and in this JIRA we propose to add a tool to Hadoop to predict the resource requirements of a job based on past executions of the job. A deep dive into the system (aka *Morpheus*) can be found in our OSDI'16 paper [here|https://www.usenix.org/conference/osdi16/technical-sessions/presentation/jyothi].
[jira] [Commented] (HADOOP-9657) NetUtils.wrapException to have special handling for 0.0.0.0 addresses and :0 ports
[ https://issues.apache.org/jira/browse/HADOOP-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217900#comment-16217900 ]

Hudson commented on HADOOP-9657:
--------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13131 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13131/])
HADOOP-9657. NetUtils.wrapException to have special handling for 0.0.0.0 (varunsaxena: rev 67e7673750e731f5ecfa84e82b84b7fc7ee0b233)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetUtils.java

> NetUtils.wrapException to have special handling for 0.0.0.0 addresses and :0 ports
> ----------------------------------------------------------------------------------
>
>          Key: HADOOP-9657
>          URL: https://issues.apache.org/jira/browse/HADOOP-9657
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: net
> Affects Versions: 2.7.0
>     Reporter: Steve Loughran
>     Assignee: Varun Saxena
>     Priority: Minor
>      Fix For: 2.9.0, 3.0.0, 3.1.0
>
>  Attachments: HADOOP-9657.01.patch, HADOOP-9657.02.patch
>
> when an exception is wrapped, it may look like {{0.0.0.0:0 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused}}
> We should recognise all zero ip addresses and 0 ports and flag them as "your configuration of the endpoint is wrong", as it is clearly the case
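The recognition rule described in the issue can be illustrated with a short sketch: treat an all-zero address or port 0 as a local configuration error rather than a remote failure. This is a hypothetical shell illustration only; the real change lives in NetUtils.java, and both function names below are invented.

```shell
# Hypothetical sketch of the HADOOP-9657 rule: an endpoint of 0.0.0.0 or
# port 0 means the address was never configured, so the wrapped error
# should point at the local configuration, not the remote host.
is_unset_endpoint() {
  local host=$1 port=$2
  [ "${host}" = "0.0.0.0" ] || [ "${port}" = "0" ]
}

wrap_connect_error() {
  local host=$1 port=$2
  if is_unset_endpoint "${host}" "${port}"; then
    echo "${host}:${port} failed: your configuration of the endpoint is wrong"
  else
    echo "${host}:${port} failed on connection exception"
  fi
}

wrap_connect_error 0.0.0.0 0   # flagged as a configuration problem
```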
[jira] [Commented] (HADOOP-14971) Merge S3A committers into trunk
[ https://issues.apache.org/jira/browse/HADOOP-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217896#comment-16217896 ]

ASF GitHub Bot commented on HADOOP-14971:
-----------------------------------------

Github user ajfabbri commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/282#discussion_r146723076

    --- Diff: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml ---
    @@ -1344,34 +1338,34 @@
    -  fs.s3a.retry.limit
    -  4
    -    Number of times to retry any repeatable S3 client request on failure,
    -    excluding throttling requests.
    +  fs.s3a.attempts.maximum
    +  20
    +  How many times we should retry commands on transient errors,
    +    excluding throttling errors.
    --- End diff --

    Interesting. One of my concerns about all the retry logic being added here is that it is an invasive change, and I'm feeling like there might be unintended consequences somewhere. I've been thinking that making it more configurable would mitigate the risk. I'd lean towards making more types/classes of retry configuration instead of fewer. For example here, I'd like to have SDK retries configured separately. I mentioned before also the idea of having another retry policy for riskier parts (e.g. delete). Thoughts?

> Merge S3A committers into trunk
> -------------------------------
>
>          Key: HADOOP-14971
>          URL: https://issues.apache.org/jira/browse/HADOOP-14971
>      Project: Hadoop Common
>   Issue Type: Sub-task
>   Components: fs/s3
> Affects Versions: 3.0.0
>     Reporter: Steve Loughran
>     Assignee: Steve Loughran
>
> Merge the HADOOP-13786 committer into trunk. This branch is being set up as a github PR for review there & to keep it out of the mailboxes of the watchers on the main JIRA
[jira] [Updated] (HADOOP-9657) NetUtils.wrapException to have special handling for 0.0.0.0 addresses and :0 ports
[ https://issues.apache.org/jira/browse/HADOOP-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated HADOOP-9657:
---------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.1.0
                   3.0.0
                   2.9.0
           Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-3.0. Thanks [~ste...@apache.org] for the reviews.
[jira] [Commented] (HADOOP-9657) NetUtils.wrapException to have special handling for 0.0.0.0 addresses and :0 ports
[ https://issues.apache.org/jira/browse/HADOOP-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217733#comment-16217733 ]

Varun Saxena commented on HADOOP-9657:
--------------------------------------

Sure. Will commit it shortly.
[jira] [Comment Edited] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217705#comment-16217705 ]

Allen Wittenauer edited comment on HADOOP-14976 at 10/24/17 9:10 PM:
---------------------------------------------------------------------

bq. since the calling script always knows what is necessary?

I'd need to be convinced this is true. A lot of the work done in the shell script rewrite and follow-on work was to make the "front end" scripts as dumb as possible in order to centralize the program logic. This gave huge benefits in the form of script consistency, testing, and more. Besides, EXECNAME is used for *very* specific things, e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported... and that's even before 3rd-party add-ons that might expect HADOOP_SHELL_EXECNAME to work as expected.

If distributions really are renaming the scripts (which is extremely problematic for lots of reasons), there isn't much of a reason they couldn't just tuck them away in a non-PATH directory and use the same names, or even just rewrite the scripts directly. (See above about removing as much logic as possible.) I've had in my head a "vendor" version of hadoop-user-function.sh, but I'm not sure if even that would help here. It really depends upon why the bin scripts are getting renamed, and whether the problem being solved is actually more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME though.

was (Author: aw):
bq. since the calling script always knows what is necessary?

I'd need to be convinced this is true. A lot of the work done in the shell script rewrite and follow-on work was to make the "front end" scripts as dumb as possible in order to centralize the program logic. This gave huge benefits in the form of script consistency, testing, and more. Besides, CLASSNAME and EXECNAME are used for *very* different things and aren't guaranteed to match, e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported... and that's even before 3rd-party add-ons that might expect HADOOP_SHELL_EXECNAME to work as expected.

If distributions really are renaming the scripts (which is extremely problematic for lots of reasons), there isn't much of a reason they couldn't just tuck them away in a non-PATH directory and use the same names, or even just rewrite the scripts directly. (See above about removing as much logic as possible.) I've had in my head a "vendor" version of hadoop-user-function.sh, but I'm not sure if even that would help here. It really depends upon why the bin scripts are getting renamed, and whether the problem being solved is actually more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME though.

> Allow overriding HADOOP_SHELL_EXECNAME
> --------------------------------------
>
>          Key: HADOOP-14976
>          URL: https://issues.apache.org/jira/browse/HADOOP-14976
>      Project: Hadoop Common
>   Issue Type: Improvement
>     Reporter: Arpit Agarwal
>
> Some Hadoop shell scripts infer their own name using this bit of shell magic:
> {code}
> 18 MYNAME="${BASH_SOURCE-$0}"
> 19 HADOOP_SHELL_EXECNAME="${MYNAME##*/}"
> {code}
> e.g. see the [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18] script.
> The inferred shell script name is later passed to _hadoop-functions.sh_, which uses it to construct the names of some environment variables. E.g. when invoking _hdfs datanode_, the options variable name is inferred as follows:
> {code}
> # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS
> {code}
> This works well if the calling script name is the standard {{hdfs}} or {{yarn}}. If a distribution renames the script to something like foo.bar, then the variable names will be inferred as {{FOO.BAR_DATANODE_OPTS}}. This is not a valid bash variable name.
[jira] [Commented] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217705#comment-16217705 ]

Allen Wittenauer commented on HADOOP-14976:
-------------------------------------------

bq. since the calling script always knows what is necessary?

I'd need to be convinced this is true. A lot of the work done in the shell script rewrite and follow-on work was to make the "front end" scripts as dumb as possible in order to centralize the program logic. This gave huge benefits in the form of script consistency, testing, and more. Besides, CLASSNAME and EXECNAME are used for *very* different things and aren't guaranteed to match, e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported... and that's even before 3rd-party add-ons that might expect HADOOP_SHELL_EXECNAME to work as expected.

If distributions really are renaming the scripts (which is extremely problematic for lots of reasons), there isn't much of a reason they couldn't just tuck them away in a non-PATH directory and use the same names, or even just rewrite the scripts directly. (See above about removing as much logic as possible.) I've had in my head a "vendor" version of hadoop-user-function.sh, but I'm not sure if even that would help here. It really depends upon why the bin scripts are getting renamed, and whether the problem being solved is actually more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME though.
[jira] [Commented] (HADOOP-14952) Catalina use of hadoop-client throws ClassNotFoundException for jersey
[ https://issues.apache.org/jira/browse/HADOOP-14952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217671#comment-16217671 ]

Sean Busbey commented on HADOOP-14952:
--------------------------------------

bump? [~eximius], could you add the requested info?

> Catalina use of hadoop-client throws ClassNotFoundException for jersey
> ----------------------------------------------------------------------
>
>          Key: HADOOP-14952
>          URL: https://issues.apache.org/jira/browse/HADOOP-14952
>      Project: Hadoop Common
>   Issue Type: Bug
> Affects Versions: 3.0.0-beta1
>     Reporter: Kamil
>
> I was using org.apache.hadoop:hadoop-client in version 2.7.4 and it worked fine, but recently had problems with CGLIB (it was conflicting with Spring).
> I decided to try version 3.0.0-beta1, but the server didn't start, failing with this exception:
> {code}
> 16-Oct-2017 10:27:12.918 SEVERE [localhost-startStop-1] org.apache.catalina.core.ContainerBase.addChildInternal ContainerBase.addChild: start:
> org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[]]
>         at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:158)
>         at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:724)
>         at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:700)
>         at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
>         at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1107)
>         at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1841)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: com/sun/jersey/api/core/DefaultResourceConfig
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:348)
>         at org.apache.catalina.startup.WebappServiceLoader.loadServices(WebappServiceLoader.java:188)
>         at org.apache.catalina.startup.WebappServiceLoader.load(WebappServiceLoader.java:159)
>         at org.apache.catalina.startup.ContextConfig.processServletContainerInitializers(ContextConfig.java:1611)
>         at org.apache.catalina.startup.ContextConfig.webConfig(ContextConfig.java:1131)
>         at org.apache.catalina.startup.ContextConfig.configureStart(ContextConfig.java:771)
>         at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:298)
>         at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:94)
>         at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5092)
>         at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:152)
>         ... 10 more
> Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.core.DefaultResourceConfig
>         at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1299)
>         at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1133)
>         ... 21 more
> {code}
> after adding com.sun.jersey:jersey-server:1.9.1 to my dependencies the server started, but I think it should already be included in your dependencies
[jira] [Commented] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217647#comment-16217647 ]

Arpit Agarwal commented on HADOOP-14976:
----------------------------------------

bq. Worse, it pretty much breaks the Hadoop user experience.

Agreed, it's not a great user experience to change the names of standard environment variables.

If the variable names are fixed, is there a benefit to inferring HADOOP_SHELL_EXECNAME instead of passing a fixed string to {{hadoop_subcommand_opts}}, since the calling script always knows what is necessary? E.g. do you see any downside to updating the following line
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L269
to
{code}
hadoop_java_exec hdfs "${HADOOP_CLASSNAME}" "${HADOOP_SUBCMD_ARGS[@]}"
{code}
[jira] [Commented] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217612#comment-16217612 ]

Allen Wittenauer commented on HADOOP-14976:
-------------------------------------------

bq. This is not a valid bash variable name.

Worse, it pretty much breaks the Hadoop user experience.

bq. There may be better alternatives.

Maybe the bigger question is why a distribution would rename the binaries. With Hadoop 3.x being significantly more flexible as to the shell environment configuration (e.g., function overrides), those distributions might need to reconsider their strategies.
[jira] [Commented] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217531#comment-16217531 ]

Arpit Agarwal commented on HADOOP-14976:
----------------------------------------

Allowing calling scripts to override the HADOOP_SHELL_EXECNAME detection is just one possible solution. There may be better alternatives.
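The HDFS_DATANODE_OPTS naming scheme at the heart of this issue can be demonstrated with a small sketch. This is not the actual hadoop-functions.sh code; `build_opts_var` is an invented helper that only mimics the described behavior:

```shell
# Sketch of how an options variable name is derived from the script name
# and subcommand (helper name is hypothetical, not from hadoop-functions.sh).
build_opts_var() {
  local execname subcommand
  execname=$(echo "$1" | tr '[:lower:]' '[:upper:]')
  subcommand=$(echo "$2" | tr '[:lower:]' '[:upper:]')
  echo "${execname}_${subcommand}_OPTS"
}

build_opts_var hdfs datanode     # prints HDFS_DATANODE_OPTS
build_opts_var foo.bar datanode  # prints FOO.BAR_DATANODE_OPTS; the '.'
                                 # makes it an invalid bash variable name
```

Bash identifiers may contain only letters, digits, and underscores, so any renamed script whose filename contains other characters breaks this scheme.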
[jira] [Updated] (HADOOP-14870) backport HADOOP-14553 parallel tests to branch-2
[ https://issues.apache.org/jira/browse/HADOOP-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Subru Krishnan updated HADOOP-14870:
------------------------------------
    Target Version/s: 3.1.0  (was: 2.9.0)

> backport HADOOP-14553 parallel tests to branch-2
> ------------------------------------------------
>
>          Key: HADOOP-14870
>          URL: https://issues.apache.org/jira/browse/HADOOP-14870
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: fs/azure, test
> Affects Versions: 2.9.0
>     Reporter: Steve Loughran
>     Assignee: Steve Loughran
>  Attachments: HADOOP-14870-branch-2-001.patch, HADOOP-14870-branch-2-002.patch
>
> Backport the HADOOP-14553 parallel test running from trunk to branch-2.
> There's some complexity related to the FS Contract base test being JUnit4 in branch-2, so it's not a simple cherry-pick.
[jira] [Commented] (HADOOP-14870) backport HADOOP-14553 parallel tests to branch-2
[ https://issues.apache.org/jira/browse/HADOOP-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217522#comment-16217522 ]

Subru Krishnan commented on HADOOP-14870:
-----------------------------------------

Thanks [~ste...@apache.org] for the clarification. I am moving it out of 2.9 based on your timeline.
[jira] [Assigned] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh reassigned HADOOP-14976:
------------------------------------------
    Assignee:     (was: Mukul Kumar Singh)
[jira] [Assigned] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HADOOP-14976: -- Assignee: Mukul Kumar Singh > Allow overriding HADOOP_SHELL_EXECNAME > -- > > Key: HADOOP-14976 > URL: https://issues.apache.org/jira/browse/HADOOP-14976 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Arpit Agarwal >Assignee: Mukul Kumar Singh > > Some Hadoop shell scripts infer their own name using this bit of shell magic: > {code} > 18 MYNAME="${BASH_SOURCE-$0}" > 19 HADOOP_SHELL_EXECNAME="${MYNAME##*/}" > {code} > e.g. see the > [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18] > script. > The inferred shell script name is later passed to _hadoop-functions.sh_ which > uses it to construct the names of some environment variables. E.g. when > invoking _hdfs datanode_, the options variable name is inferred as follows: > {code} > # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS > {code} > This works well if the calling script name is the standard {{hdfs}} or {{yarn}}. > If a distribution renames the script to something like foo.bar, then the > variable name will be inferred as {{FOO.BAR_DATANODE_OPTS}}. This is not a > valid bash variable name. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Moved] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
[ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal moved HDFS-12706 to HADOOP-14976: --- Key: HADOOP-14976 (was: HDFS-12706) Project: Hadoop Common (was: Hadoop HDFS) > Allow overriding HADOOP_SHELL_EXECNAME > -- > > Key: HADOOP-14976 > URL: https://issues.apache.org/jira/browse/HADOOP-14976 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Arpit Agarwal > > Some Hadoop shell scripts infer their own name using this bit of shell magic: > {code} > 18 MYNAME="${BASH_SOURCE-$0}" > 19 HADOOP_SHELL_EXECNAME="${MYNAME##*/}" > {code} > e.g. see the > [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18] > script. > The inferred shell script name is later passed to _hadoop-functions.sh_ which > uses it to construct the names of some environment variables. E.g. when > invoking _hdfs datanode_, the options variable name is inferred as follows: > {code} > # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS > {code} > This works well if the calling script name is the standard {{hdfs}} or {{yarn}}. > If a distribution renames the script to something like foo.bar, then the > variable name will be inferred as {{FOO.BAR_DATANODE_OPTS}}. This is not a > valid bash variable name. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
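The inference described in the issue can be sketched outside bash as well. This is an illustrative model of the `${MYNAME##*/}` plus `<EXECNAME>_<SUBCOMMAND>_OPTS` construction, not Hadoop's actual code; the function names are made up for the example.

```python
import os
import re

def opts_var_name(script_path, subcommand):
    # Mirrors HADOOP_SHELL_EXECNAME="${MYNAME##*/}": strip directories,
    # then build <EXECNAME>_<SUBCOMMAND>_OPTS, uppercased.
    execname = os.path.basename(script_path)
    return "{}_{}_OPTS".format(execname, subcommand).upper()

def is_valid_bash_name(name):
    # Bash identifiers: letters, digits and underscores, no leading digit.
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name) is not None

print(opts_var_name("/usr/bin/hdfs", "datanode"))           # HDFS_DATANODE_OPTS
# A renamed script such as foo.bar produces FOO.BAR_DATANODE_OPTS,
# which the dot makes invalid as a bash variable name.
print(is_valid_bash_name(opts_var_name("/usr/local/bin/foo.bar", "datanode")))  # False
```

Allowing `HADOOP_SHELL_EXECNAME` to be preset by the caller, as the issue title suggests, would let a distribution keep the standard variable names regardless of what it renames the script to.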
[jira] [Commented] (HADOOP-14957) ReconfigurationTaskStatus is exposing guava Optional in its public api
[ https://issues.apache.org/jira/browse/HADOOP-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217250#comment-16217250 ] Xiao Chen commented on HADOOP-14957: Thanks Steve for the review and the thoughts! Sorry I wasn't clear in my comment. I meant to express that the phrase 'Management Tools' is pretty clear, so IMO future developers seeing this annotation should be able to make the mental link to Ambari or equivalent tools. :) Checkstyle is not related to the patch. Failed tests are mostly timeouts, and all passed locally. I would like to commit this by Wednesday if there are no objections. > ReconfigurationTaskStatus is exposing guava Optional in its public api > -- > > Key: HADOOP-14957 > URL: https://issues.apache.org/jira/browse/HADOOP-14957 > Project: Hadoop Common > Issue Type: Sub-task > Components: common >Affects Versions: 3.0.0-beta1 >Reporter: Haibo Chen >Assignee: Xiao Chen > Attachments: HADOOP-14957.01.patch, HADOOP-14957.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x
[ https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217206#comment-16217206 ] Hadoop QA commented on HADOOP-14178: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 89 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 31s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 30m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 31s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 40m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 22s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 22s{color} | {color:red} root generated 602 new + 1225 unchanged - 23 fixed = 1827 total (was 1248) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 36s{color} | {color:orange} root: The patch generated 11 new + 2486 unchanged - 19 fixed = 2497 total (was 2505) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 8m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 46s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project hadoop-client-modules/hadoop-client-minicluster . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 29m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 43s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} |
[jira] [Commented] (HADOOP-14919) BZip2 drops records when reading data in splits
[ https://issues.apache.org/jira/browse/HADOOP-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217177#comment-16217177 ] Jason Lowe commented on HADOOP-14919: - [~lewuathe] [~ajisakaa] this is a bug related to the changes in HADOOP-13270. Would you have time to take a look? > BZip2 drops records when reading data in splits > --- > > Key: HADOOP-14919 > URL: https://issues.apache.org/jira/browse/HADOOP-14919 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1 >Reporter: Aki Tanaka >Assignee: Jason Lowe >Priority: Critical > Attachments: 25.bz2, HADOOP-14919-test.patch, > HADOOP-14919.001.patch > > > BZip2 can drop records when reading data in splits. This problem was already > discussed in HADOOP-11445 and HADOOP-13270, but we still have a > problem in a corner case, causing lost data blocks. > > I attached a unit test for this issue. You can reproduce the problem if you > run the unit test. > > First, this issue happens when the position of a newly created stream is equal to the > start of a split. Hadoop has some test cases for this (the blockEndingInCR.txt.bz2 > file for TestLineRecordReader#testBzip2SplitStartAtBlockMarker, etc). > However, the issue I am reporting does not happen when we run these tests, > because it happens only when the byte block at the start of a split includes > both a block marker and compressed data. 
> > BZip2 block marker - 0x314159265359 > (001100010100000101011001001001100101001101011001) > > blockEndingInCR.txt.bz2 (Start of Split - 136504): > {code:java} > $ xxd -l 6 -g 1 -b -seek 136498 > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-classes/blockEndingInCR.txt.bz2 > 0021532: 00110001 01000001 01011001 00100110 01010011 01011001 1AY&SY > {code} > > Test bz2 File (Start of Split - 203426) > {code:java} > $ xxd -l 7 -g 1 -b -seek 203419 25.bz2 > 0031a9b: 11100110 00101000 00101011 00100100 11001010 01101011 .(+$.k > 0031aa1: 00101111 / > {code} > > Let's say a job splits this test bz2 file into two splits at the start of > split (position 203426). > The former split does not read records which start at position 203426 because > BZip2 says the position of these dropped records is 203427. The latter split > does not read the records because BZip2CompressionInputStream reads the block > from position 320955. > Due to this behavior, records between 203427 and 320955 are lost. > Also, if we revert the changes in HADOOP-13270, we will not see this issue. > We will see the HADOOP-13270 issue though. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
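The bit dumps above matter because the block marker need not be byte-aligned inside a compressed stream, so split handling has to reason at the bit level. The following is an illustrative sketch of that idea only, not Hadoop's actual BZip2 decoder: it scans a byte buffer for the 48-bit marker at any bit offset.

```python
# Illustrative sketch, not Hadoop's BZip2 code: find the 48-bit bzip2
# block marker 0x314159265359 at an arbitrary bit offset in a buffer.
BLOCK_MARKER = 0x314159265359
MARKER_BITS = 48

def find_block_marker(data):
    """Return the bit offset of the first block marker in data, or None."""
    bits = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << MARKER_BITS) - 1
    # Slide a 48-bit window across the buffer, most significant bits first.
    for offset in range(total_bits - MARKER_BITS + 1):
        shift = total_bits - MARKER_BITS - offset
        if (bits >> shift) & mask == BLOCK_MARKER:
            return offset
    return None

# Byte-aligned marker sits at bit offset 0 ...
print(find_block_marker(BLOCK_MARKER.to_bytes(6, "big")))        # 0
# ... but shifted by four bits it is found mid-byte, at bit offset 4.
print(find_block_marker((BLOCK_MARKER << 4).to_bytes(7, "big"))) # 4
```

This is why "start of split" and "start of block" can disagree by a few bits or bytes, which is exactly the off-by-one window the description shows between positions 203426 and 203427.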
[jira] [Updated] (HADOOP-14973) [s3a] Log StorageStatistics
[ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-14973: --- Issue Type: Sub-task (was: Bug) Parent: HADOOP-14831 > [s3a] Log StorageStatistics > --- > > Key: HADOOP-14973 > URL: https://issues.apache.org/jira/browse/HADOOP-14973 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.0.0-beta1, 2.8.1 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > S3A is currently storing much more detailed metrics via StorageStatistics > than are logged in a MapReduce job. Eventually, it would be nice to get > Spark, MapReduce and other workloads to retrieve and store these metrics, but > it may be some time before they all do that. I'd like to consider having S3A > publish the metrics itself in some form. This is tricky, as S3A has no daemon > but lives inside various other processes. > Perhaps writing to a log file at some configurable interval and on close() > would be the best we could do. Other ideas would be welcome. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14973) [s3a] Log StorageStatistics
[ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-14973: --- Summary: [s3a] Log StorageStatistics (was: Log StorageStatistics) > [s3a] Log StorageStatistics > --- > > Key: HADOOP-14973 > URL: https://issues.apache.org/jira/browse/HADOOP-14973 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.0.0-beta1, 2.8.1 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > S3A is currently storing much more detailed metrics via StorageStatistics > than are logged in a MapReduce job. Eventually, it would be nice to get > Spark, MapReduce and other workloads to retrieve and store these metrics, but > it may be some time before they all do that. I'd like to consider having S3A > publish the metrics itself in some form. This is tricky, as S3A has no daemon > but lives inside various other processes. > Perhaps writing to a log file at some configurable interval and on close() > would be the best we could do. Other ideas would be welcome. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14973) Log StorageStatistics
[ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-14973: --- Affects Version/s: 3.0.0-beta1 2.8.1 > Log StorageStatistics > - > > Key: HADOOP-14973 > URL: https://issues.apache.org/jira/browse/HADOOP-14973 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.0.0-beta1, 2.8.1 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > S3A is currently storing much more detailed metrics via StorageStatistics > than are logged in a MapReduce job. Eventually, it would be nice to get > Spark, MapReduce and other workloads to retrieve and store these metrics, but > it may be some time before they all do that. I'd like to consider having S3A > publish the metrics itself in some form. This is tricky, as S3A has no daemon > but lives inside various other processes. > Perhaps writing to a log file at some configurable interval and on close() > would be the best we could do. Other ideas would be welcome. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
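The "write to a log file at some configurable interval and on close()" idea in the description can be sketched as follows. This is a hedged illustration of the proposal only; the class name, callback shape, and 60-second default are assumptions for the example, not S3A's actual API.

```python
import time

class IntervalStatsLogger:
    """Sketch: emit a statistics snapshot periodically and always on close()."""

    def __init__(self, emit, interval_s=60.0):
        self.emit = emit                  # callback receiving a stats snapshot
        self.interval_s = interval_s
        self.stats = {}
        self._last_emit = time.monotonic()

    def record(self, key, delta=1):
        self.stats[key] = self.stats.get(key, 0) + delta
        now = time.monotonic()
        if now - self._last_emit >= self.interval_s:
            self.emit(dict(self.stats))   # emit a snapshot, not the live dict
            self._last_emit = now

    def close(self):
        self.emit(dict(self.stats))       # final flush on close()

snapshots = []
logger = IntervalStatsLogger(snapshots.append, interval_s=0.0)
logger.record("object_put_requests")
logger.close()
print(snapshots)
```

Flushing on close() matters for the "no daemon" problem the description raises: since S3A lives inside other processes, close() is the one hook it reliably gets before the host process exits.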
[jira] [Commented] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217076#comment-16217076 ] Yu-Tang Lin commented on HADOOP-14950: -- [~jojochuang], what do you think about the whitespace and asf-license check failures? I think we can ignore them because the file is the index of the HAR. > har file system throws ArrayIndexOutOfBoundsException > - > > Key: HADOOP-14950 > URL: https://issues.apache.org/jira/browse/HADOOP-14950 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: CDH 5.9.2 >Reporter: Wei-Chiu Chuang >Assignee: Yu-Tang Lin > Labels: newbie > Attachments: HADOOP-14950-branch-3.0.001.patch, > HADOOP-14950-branch-3.0.002.patch, HADOOP-14950-branch-3.0.003.patch > > > When listing a har file system file, it throws an AIOOBE like the following: > {noformat} > $ hdfs dfs -ls har:///abc.har > -ls: Fatal internal error > java.lang.ArrayIndexOutOfBoundsException: 1 > at org.apache.hadoop.fs.HarFileSystem$HarStatus.<init>(HarFileSystem.java:597) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.parseMetaData(HarFileSystem.java:1201) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.access$000(HarFileSystem.java:1098) > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:166) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2711) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > {noformat} > Checking the code, it looks like the _index file in the har is malformed. It > expects two strings separated by a space in each line, and this AIOOBE is > possible if the second string does not exist. > Filing this jira to improve the error handling of such a case. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
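A defensive parse of the two-field line format would turn the AIOOBE into a clear error message. The sketch below is an illustration of that error-handling idea only; the line layout is inferred from the description ("two strings separated by a space"), not the exact HAR _index format, and the sample line is made up.

```python
# Hedged sketch: each _index line is expected to carry two space-separated
# strings; raise a descriptive error instead of indexing a missing field.
def parse_index_line(line, lineno):
    parts = line.rstrip("\n").split(" ", 1)
    if len(parts) < 2 or not parts[1]:
        raise ValueError(
            "malformed _index line {}: expected '<name> <metadata>', got {!r}"
            .format(lineno, line.rstrip("\n")))
    return parts[0], parts[1]

print(parse_index_line("/dir1 some-metadata", 1))  # ('/dir1', 'some-metadata')
```

The equivalent guard in HarFileSystem would be a length check on the split result before reading element 1, which is presumably what the attached patches do.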
[jira] [Created] (HADOOP-14975) S3AInputStream/OutputStream statistics aren't getting into StorageStatistics
Steve Loughran created HADOOP-14975: --- Summary: S3AInputStream/OutputStream statistics aren't getting into StorageStatistics Key: HADOOP-14975 URL: https://issues.apache.org/jira/browse/HADOOP-14975 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 2.9.0 Reporter: Steve Loughran Priority: Minor When the input and output stream stats are merged into the S3AInstrumentation, the FS statistics aren't updated to match, so FS statistics don't track things like aggregate throttle count, TCP aborts, bytes discarded etc. They are metrics, but not storage stats. They should be, which requires S3AInstrumentation to take the StorageStats in its constructor and then update. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x
[ https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216843#comment-16216843 ] Steve Loughran commented on HADOOP-14178: - don't worry about KDiag; HADOOP-14030 fixes that (parallel test run & no unique keytab paths problem) > Move Mockito up to version 2.x > -- > > Key: HADOOP-14178 > URL: https://issues.apache.org/jira/browse/HADOOP-14178 > Project: Hadoop Common > Issue Type: Sub-task > Components: test >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Akira Ajisaka > Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, > HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, > HADOOP-14178.005.patch > > > I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 > since the switch to maven in 2011. > Mockito is now at version 2.1, [with lots of Java 8 > support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. > That's not just defining actions as closures, but also supporting Optional > types, mocking methods in interfaces, etc. > It's only used for testing, and, *provided there aren't regressions*, the cost of > upgrade is low. The good news: test tools usually come with good test > coverage. The bad: mockito does go deep into java bytecodes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-11981) Add storage policy APIs to filesystem docs
[ https://issues.apache.org/jira/browse/HADOOP-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216841#comment-16216841 ] Steve Loughran commented on HADOOP-11981: - this is also something that filesystems should expose as a queryable feature through a StreamCapabilities implementation, rather than relying on UnsupportedOperationException being thrown as the sole probe > Add storage policy APIs to filesystem docs > -- > > Key: HADOOP-11981 > URL: https://issues.apache.org/jira/browse/HADOOP-11981 > Project: Hadoop Common > Issue Type: Sub-task > Components: documentation >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HADOOP-11981.incomplete.01.patch > > > HDFS-8345 exposed the storage policy APIs via the FileSystem. > The FileSystem docs should be updated accordingly. > https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14030) PreCommit TestKDiag failure
[ https://issues.apache.org/jira/browse/HADOOP-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216838#comment-16216838 ] Steve Loughran commented on HADOOP-14030: - Well, TestKDiag didn't fail +1 well done for tracking this down > PreCommit TestKDiag failure > --- > > Key: HADOOP-14030 > URL: https://issues.apache.org/jira/browse/HADOOP-14030 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 3.0.0-alpha4 >Reporter: John Zhuge >Assignee: Wei-Chiu Chuang > Attachments: HADOOP-14030.001.patch > > > https://builds.apache.org/job/PreCommit-HADOOP-Build/11523/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt > {noformat} > Tests run: 13, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 2.175 sec > <<< FAILURE! - in org.apache.hadoop.security.TestKDiag > testKeytabAndPrincipal(org.apache.hadoop.security.TestKDiag) Time elapsed: > 0.05 sec <<< ERROR! > org.apache.hadoop.security.KerberosAuthException: Login failure for user: > f...@example.com from keytab > /testptch/hadoop/hadoop-common-project/hadoop-common/target/keytab > javax.security.auth.login.LoginException: Unable to obtain password from user > at > com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897) > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) 
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > at java.security.AccessController.doPrivileged(Native Method) > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > at > org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1355) > at org.apache.hadoop.security.KDiag.loginFromKeytab(KDiag.java:630) > at org.apache.hadoop.security.KDiag.execute(KDiag.java:396) > at org.apache.hadoop.security.KDiag.run(KDiag.java:236) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.security.KDiag.exec(KDiag.java:1047) > at org.apache.hadoop.security.TestKDiag.kdiag(TestKDiag.java:119) > at > org.apache.hadoop.security.TestKDiag.testKeytabAndPrincipal(TestKDiag.java:162) > testFileOutput(org.apache.hadoop.security.TestKDiag) Time elapsed: 0.033 sec > <<< ERROR! > org.apache.hadoop.security.KerberosAuthException: Login failure for user: > f...@example.com from keytab > /testptch/hadoop/hadoop-common-project/hadoop-common/target/keytab > javax.security.auth.login.LoginException: Unable to obtain password from user > at > com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897) > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > at 
javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > at java.security.AccessController.doPrivileged(Native Method) > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > at > org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1355) > at org.apache.hadoop.security.KDiag.loginFromKeytab(KDiag.java:630) > at
[jira] [Commented] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216835#comment-16216835 ] Hadoop QA commented on HADOOP-14950: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 48s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 27s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 35s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 30s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 38s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 6 new + 71 unchanged - 1 fixed = 77 total (was 72) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 13s{color} | {color:green} hadoop-common in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s{color} | {color:red} The patch generated 2 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612350e | | JIRA Issue | HADOOP-14950 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893713/HADOOP-14950-branch-3.0.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux aca18355a620 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.0 / 1e7ea66 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/13570/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/13570/artifact/patchprocess/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/13570/testReport/ | | asflicense |
[jira] [Commented] (HADOOP-14475) Metrics of S3A don't print out when enable it in Hadoop metrics property file
[ https://issues.apache.org/jira/browse/HADOOP-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216729#comment-16216729 ] Hadoop QA commented on HADOOP-14475: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HADOOP-14475 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HADOOP-14475 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879983/HADOOP-14475.006.patch | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/13569/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> Metrics of S3A don't print out when enable it in Hadoop metrics property file > -- > > Key: HADOOP-14475 > URL: https://issues.apache.org/jira/browse/HADOOP-14475 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.8.0 > Environment: uname -a > Linux client01 4.4.0-74-generic #95-Ubuntu SMP Wed Apr 12 09:50:34 UTC 2017 > x86_64 x86_64 x86_64 GNU/Linux > cat /etc/issue > Ubuntu 16.04.2 LTS \n \l >Reporter: Yonger >Assignee: Yonger > Attachments: HADOOP-14475-003.patch, HADOOP-14475.002.patch, > HADOOP-14475.005.patch, HADOOP-14475.006.patch, failsafe-report-s3a-it.html, > failsafe-report-s3a-scale.html, failsafe-report-scale.html, > failsafe-report-scale.zip, s3a-metrics.patch1, stdout.zip > > > *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink > #*.sink.file.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > #*.sink.influxdb.url=http:/xx > #*.sink.influxdb.influxdb_port=8086 > #*.sink.influxdb.database=hadoop > #*.sink.influxdb.influxdb_username=hadoop > #*.sink.influxdb.influxdb_password=hadoop > #*.sink.ingluxdb.cluster=c1 > *.period=10 > #namenode.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > #S3AFileSystem.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out > I can't find the output file even when I run an MR job that should use S3. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
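For reference, the active (uncommented) lines of the reporter's hadoop-metrics2.properties reduce to the following minimal FileSink wiring. This is a sketch of the intended configuration, not a verified fix — the `S3AFileSystem` prefix and the output filename are taken verbatim from the report, and the missing output file is the bug under investigation:

```
# All sources write through the file sink, sampled every 10 seconds.
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
*.period=10
# Per-prefix override: metrics registered under the S3AFileSystem
# prefix go to this file.
S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out
```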
[jira] [Commented] (HADOOP-14624) Add GenericTestUtils.DelayAnswer that accept slf4j logger API
[ https://issues.apache.org/jira/browse/HADOOP-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216705#comment-16216705 ] Akira Ajisaka commented on HADOOP-14624: This patch does not apply to trunk. Would you rebase the patch, [~vincent he]? > Add GenericTestUtils.DelayAnswer that accept slf4j logger API > - > > Key: HADOOP-14624 > URL: https://issues.apache.org/jira/browse/HADOOP-14624 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Wenxin He >Assignee: Wenxin He > Attachments: HADOOP-14624.001.patch, HADOOP-14624.002.patch > > > Split from HADOOP-14539. > Now GenericTestUtils.DelayAnswer only accepts the commons-logging logger API. Now > that we are migrating the APIs to slf4j, the slf4j logger API should be accepted as > well.
[jira] [Commented] (HADOOP-14870) backport HADOOP-14553 parallel tests to branch-2
[ https://issues.apache.org/jira/browse/HADOOP-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216698#comment-16216698 ] Steve Loughran commented on HADOOP-14870: - This is nothing to do with HDFS. It's the Hadoop Azure tests, which don't run on Yetus as it lacks the credentials. I do plan to backport it, but not this week, for various reasons. > backport HADOOP-14553 parallel tests to branch-2 > > > Key: HADOOP-14870 > URL: https://issues.apache.org/jira/browse/HADOOP-14870 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/azure, test >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-14870-branch-2-001.patch, > HADOOP-14870-branch-2-002.patch > > > Backport the HADOOP-14553 parallel test running from trunk to branch-2. > There's some complexity related to the FS Contract base test being JUnit 4 in > branch-2, so it's not a simple cherry-pick.
[jira] [Updated] (HADOOP-14178) Move Mockito up to version 2.x
[ https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HADOOP-14178: --- Attachment: HADOOP-14178.005-wip.patch 005-wip: Fixed some HDFS test failures > Move Mockito up to version 2.x > -- > > Key: HADOOP-14178 > URL: https://issues.apache.org/jira/browse/HADOOP-14178 > Project: Hadoop Common > Issue Type: Sub-task > Components: test >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Akira Ajisaka > Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, > HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, > HADOOP-14178.005.patch > > > I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 > since the switch to maven in 2011. > Mockito is now at version 2.1, [with lots of Java 8 > support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. > That's not just defining actions as closures, but also supporting Optional > types, mocking methods in interfaces, etc. > It's only used for testing, and, *provided there aren't regressions*, the cost of > upgrade is low. The good news: test tools usually come with good test > coverage. The bad: mockito does go deep into java bytecodes.
[jira] [Commented] (HADOOP-14973) Log StorageStatistics
[ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216667#comment-16216667 ] Steve Loughran commented on HADOOP-14973: - +output from S3AInputStream.toString, again, logged in scala after some spark work {code} - Duration of readFully(260, byte[256]) [pos = 45603306] = 341,500,815 nS 2017-10-10 16:17:54,090 [ScalaTest-main-running-S3ASeekReadSuite] INFO s3.S3ASeekReadSuite (Logging.scala:logInfo(54)) - S3AInputStream{s3a://landsat-pds/scene_list.gz wrappedStream=open read policy=sequential pos=516 nextReadPos=45603306 contentLength=45603307 contentRangeStart=260 contentRangeFinish=45603307 remainingInCurrentRequest=45602791 StreamStatistics{OpenOperations=6, CloseOperations=5, Closed=2, Aborted=3, SeekOperations=3, ReadExceptions=0, ForwardSeekOperations=0, BackwardSeekOperations=3, BytesSkippedOnSeek=0, BytesBackwardsOnSeek=91206303, BytesRead=815, BytesRead excluding skipped=815, ReadOperations=4, ReadFullyOperations=4, ReadsIncomplete=0, BytesReadInClose=51, BytesDiscardedInAbort=136809661}} {code} That log shows abort ops; this test uses the committer. And here's the stats as collected from a localhost run of some ORC dataframe IO against an s3 bucket in spark. 
{code} INFO s3.S3AOperations (Logging.scala:logInfo(54)) - Metrics: S3guard_metadatastore_put_path_latency50thPercentileLatency = 1285 S3guard_metadatastore_put_path_latency75thPercentileLatency = 1332 S3guard_metadatastore_put_path_latency90thPercentileLatency = 1332 S3guard_metadatastore_put_path_latency95thPercentileLatency = 1332 S3guard_metadatastore_put_path_latency99thPercentileLatency = 1332 S3guard_metadatastore_put_path_latencyNumOps = 3 S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz) = 0 S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz) = 0 S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz) = 0 S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz) = 0 S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz) = 0 S3guard_metadatastore_throttle_rateNumEvents = 0 committer_bytes_committed = 34833950 committer_bytes_uploaded = 34833950 committer_commits_aborted = 0 committer_commits_completed = 45 committer_commits_created = 43 committer_commits_failed = 0 committer_commits_reverted = 0 committer_jobs_completed = 17 committer_jobs_failed = 0 committer_tasks_completed = 21 committer_tasks_failed = 0 directories_created = 31 directories_deleted = 0 fake_directories_deleted = 1127 files_copied = 11 files_copied_bytes = 128519807 files_created = 46 files_deleted = 34 ignored_errors = 100 object_continue_list_requests = 0 object_copy_requests = 0 object_delete_requests = 174 object_list_requests = 514 object_metadata_requests = 1081 object_multipart_aborted = 0 object_put_bytes = 163448561 object_put_bytes_pending = 0 object_put_requests = 142 object_put_requests_active = 0 object_put_requests_completed = 142 op_copy_from_local_file = 0 op_exists = 75 op_get_file_status = 595 op_glob_status = 12 op_is_directory = 0 op_is_file = 0 op_list_files = 10 op_list_located_status = 8 op_list_status = 63 op_mkdirs = 17 op_rename = 11 s3guard_metadatastore_initialization = 0 s3guard_metadatastore_put_path_request = 
380 s3guard_metadatastore_retry = 0 s3guard_metadatastore_throttled = 0 store_io_throttled = 0 stream_aborted = 0 stream_backward_seek_operations = 20 stream_bytes_backwards_on_seek = 23110 stream_bytes_discarded_in_abort = 0 stream_bytes_read = 68258611 stream_bytes_read_in_close = 4417 stream_bytes_skipped_on_seek = 0 stream_close_operations = 95 stream_closed = 95 stream_forward_seek_operations = 0 stream_opened = 95 stream_read_exceptions = 0 stream_read_fully_operations = 0 stream_read_operations = 16774 stream_read_operations_incomplete = 7828 stream_seek_operations = 20 stream_write_block_uploads = 18 stream_write_block_uploads_aborted = 0 stream_write_block_uploads_active = 0 stream_write_block_uploads_committed = 0 stream_write_block_uploads_data_pending = 0 stream_write_block_uploads_pending = 42 stream_write_failures = 0 stream_write_total_data = 128624034 stream_write_total_time = 354315 {code} This is the committer code, so it tracks that and throttling stats. Throttle is interesting as it's not just per-thread, it's per-all-clients of a shard in a bucket. At least collecting on a per-query basis will let you know that the reason something is slow is that the job was throttled (so: fix that, tune backoff or reduce #of workers) > Log StorageStatistics > - > > Key: HADOOP-14973 > URL: https://issues.apache.org/jira/browse/HADOOP-14973 > Project: Hadoop Common > Issue Type: Bug >
[jira] [Commented] (HADOOP-14973) Log StorageStatistics
[ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216638#comment-16216638 ] Steve Loughran commented on HADOOP-14973: - First, sean, tag versions, give title a hint it's for S3, mark as improvement, move under HADOOP-14831 so it can be tracked for Hadoop 1 Second, you haven't called FileSystem.toString() for a while have you? Or FSDataInputStream.toString()? Because it prints all this stuff. How else do you think all the seek optimisation work was debugged? {code} 2017-10-10 16:23:47,050 [ScalaTest-main-running-S3ADataFrameSuite] INFO s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - Duration of scan result list = 2,118,450 nS 2017-10-10 16:23:47,050 [ScalaTest-main-running-S3ADataFrameSuite] INFO s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - FileSystem S3AFileSystem{uri=s3a://hwdev-steve-ireland-new, workingDir=s3a://hwdev-steve-ireland-new/user/stevel, inputPolicy=random, partSize=8388608, enableMultiObjectsDelete=true, maxKeys=5000, readAhead=262144, blockSize=1048576, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='NONE', blockFactory=org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@64f6964f, metastore=NullMetadataStore, authoritative=false, useListV1=false, boundedExecutor=BlockingThreadPoolExecutorService{SemaphoredDelegatingExecutor{permitCount=25, available=25, waiting=0}, activeCount=0}, unboundedExecutor=java.util.concurrent.ThreadPoolExecutor@60291e59[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], statistics {182521443 bytes read, 39004 bytes written, 207 read ops, 0 large read ops, 76 write ops}, metrics {{Context=S3AFileSystem} {FileSystemId=e62eeb1a-cced-473b-95f3-06c9910604ad-hwdev-steve-ireland-new} {fsURI=s3a://hwdev-steve-ireland-new} {files_created=0} {files_copied=0} {files_copied_bytes=0} {files_deleted=0} {fake_directories_deleted=0} {directories_created=0} {directories_deleted=0} {ignored_errors=0} 
{op_copy_from_local_file=0} {op_exists=0} {op_get_file_status=1} {op_glob_status=0} {op_is_directory=0} {op_is_file=0} {op_list_files=1} {op_list_located_status=0} {op_list_status=0} {op_mkdirs=0} {op_rename=0} {object_copy_requests=0} {object_delete_requests=0} {object_list_requests=2} {object_continue_list_requests=0} {object_metadata_requests=2} {object_multipart_aborted=0} {object_put_bytes=0} {object_put_requests=0} {object_put_requests_completed=0} {stream_write_failures=0} {stream_write_block_uploads=0} {stream_write_block_uploads_committed=0} {stream_write_block_uploads_aborted=0} {stream_write_total_time=0} {stream_write_total_data=0} {committer_commits_created=0} {committer_commits_completed=0} {committer_jobs_completed=0} {committer_jobs_failed=0} {committer_tasks_completed=0} {committer_tasks_failed=0} {committer_bytes_committed=0} {committer_bytes_uploaded=0} {committer_commits_failed=0} {committer_commits_aborted=0} {committer_commits_reverted=0} {s3guard_metadatastore_put_path_request=1} {s3guard_metadatastore_initialization=0} {s3guard_metadatastore_retry=0} {s3guard_metadatastore_throttled=0} {store_io_throttled=0} {object_put_requests_active=0} {object_put_bytes_pending=0} {stream_write_block_uploads_active=0} {stream_write_block_uploads_pending=0} {stream_write_block_uploads_data_pending=0} {S3guard_metadatastore_put_path_latencyNumOps=0} {S3guard_metadatastore_put_path_latency50thPercentileLatency=0} {S3guard_metadatastore_put_path_latency75thPercentileLatency=0} {S3guard_metadatastore_put_path_latency90thPercentileLatency=0} {S3guard_metadatastore_put_path_latency95thPercentileLatency=0} {S3guard_metadatastore_put_path_latency99thPercentileLatency=0} {S3guard_metadatastore_throttle_rateNumEvents=0} {S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz)=0} {S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz)=0} {S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz)=0} 
{S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz)=0} {S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz)=0} {stream_read_fully_operations=0} {stream_opened=0} {stream_bytes_skipped_on_seek=0} {stream_closed=0} {stream_bytes_backwards_on_seek=0} {stream_bytes_read=0} {stream_read_operations_incomplete=0} {stream_bytes_discarded_in_abort=0} {stream_close_operations=0} {stream_read_operations=0} {stream_aborted=0} {stream_forward_seek_operations=0} {stream_backward_seek_operations=0} {stream_seek_operations=0} {stream_bytes_read_in_close=0} {stream_read_exceptions=0} }} - DataFrames 2017-10-10 16:23:47,051 [ScalaTest-main-running-S3ADataFrameSuite] INFO s3.S3ADataFrameSuite (Logging.scala:logInfo(54)) - Cleaning s3a://hwdev-steve-ireland-new/cloud-integration/DELAY_LISTING_ME/S3ADataFrameSuite S3AOrcRelationSuite: {code} See? That's from a Spark {{logInfo(s"Stats $filesystem")}} instruction,
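The stats-in-toString() pattern Steve is pointing at — counters kept inside the object and surfaced by toString(), so a plain `logInfo(s"Stats $filesystem")` prints everything — can be sketched generically. This is a hypothetical illustration of the pattern, not the actual S3A classes:

```java
/**
 * Hypothetical sketch (not the S3A code) of the diagnostics pattern
 * described above: counters live inside the object, and toString()
 * renders them all, so any log line that formats the object prints
 * the full statistics with no extra API.
 */
public class CountingStream {
    private long bytesRead;
    private long seekOperations;

    public void read(int n) { bytesRead += n; }
    public void seek() { seekOperations++; }

    @Override
    public String toString() {
        return "CountingStream{bytesRead=" + bytesRead
            + ", seekOperations=" + seekOperations + "}";
    }

    public static void main(String[] args) {
        CountingStream stream = new CountingStream();
        stream.read(815);
        stream.seek();
        // The equivalent of Spark's logInfo(s"Stats $stream"):
        System.out.println("Stats " + stream);
        // → Stats CountingStream{bytesRead=815, seekOperations=1}
    }
}
```

The design point: callers pay nothing for the statistics until they choose to format the object, which is why the seek-optimisation work could be debugged from ordinary log statements.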
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Status: Open (was: Patch Available) > har file system throws ArrayIndexOutOfBoundsException > - > > Key: HADOOP-14950 > URL: https://issues.apache.org/jira/browse/HADOOP-14950 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: CDH 5.9.2 >Reporter: Wei-Chiu Chuang >Assignee: Yu-Tang Lin > Labels: newbie > Attachments: HADOOP-14950-branch-3.0.001.patch, > HADOOP-14950-branch-3.0.002.patch, HADOOP-14950-branch-3.0.003.patch > > > When listing a har file system file, it throws an AIOOBE like the following: > {noformat} > $ hdfs dfs -ls har:///abc.har > -ls: Fatal internal error > java.lang.ArrayIndexOutOfBoundsException: 1 > at org.apache.hadoop.fs.HarFileSystem$HarStatus.(HarFileSystem.java:597) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.parseMetaData(HarFileSystem.java:1201) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.access$000(HarFileSystem.java:1098) > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:166) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2711) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > {noformat} > Checking the code, it looks like 
the _index file in the har is malformed. It > expects two strings separated by a space on each line, and this AIOOBE is > possible if the second string does not exist. > Filing this jira to improve the error handling of this case.
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Status: Patch Available (was: Open) > har file system throws ArrayIndexOutOfBoundsException > - > > Key: HADOOP-14950 > URL: https://issues.apache.org/jira/browse/HADOOP-14950 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: CDH 5.9.2 >Reporter: Wei-Chiu Chuang >Assignee: Yu-Tang Lin > Labels: newbie > Attachments: HADOOP-14950-branch-3.0.001.patch, > HADOOP-14950-branch-3.0.002.patch, HADOOP-14950-branch-3.0.003.patch > > > When listing a har file system file, it throws an AIOOBE like the following: > {noformat} > $ hdfs dfs -ls har:///abc.har > -ls: Fatal internal error > java.lang.ArrayIndexOutOfBoundsException: 1 > at org.apache.hadoop.fs.HarFileSystem$HarStatus.(HarFileSystem.java:597) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.parseMetaData(HarFileSystem.java:1201) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.access$000(HarFileSystem.java:1098) > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:166) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2711) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > {noformat} > Checking the code, it looks like 
the _index file in the har is malformed. It > expects two strings separated by a space on each line, and this AIOOBE is > possible if the second string does not exist. > Filing this jira to improve the error handling of this case.
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Attachment: HADOOP-14950-branch-3.0.003.patch > har file system throws ArrayIndexOutOfBoundsException > - > > Key: HADOOP-14950 > URL: https://issues.apache.org/jira/browse/HADOOP-14950 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: CDH 5.9.2 >Reporter: Wei-Chiu Chuang >Assignee: Yu-Tang Lin > Labels: newbie > Attachments: HADOOP-14950-branch-3.0.001.patch, > HADOOP-14950-branch-3.0.002.patch, HADOOP-14950-branch-3.0.003.patch > > > When listing a har file system file, it throws an AIOOBE like the following: > {noformat} > $ hdfs dfs -ls har:///abc.har > -ls: Fatal internal error > java.lang.ArrayIndexOutOfBoundsException: 1 > at org.apache.hadoop.fs.HarFileSystem$HarStatus.(HarFileSystem.java:597) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.parseMetaData(HarFileSystem.java:1201) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.access$000(HarFileSystem.java:1098) > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:166) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2711) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > {noformat} > Checking the code, it 
looks like the _index file in the har is malformed. It > expects two strings separated by a space on each line, and this AIOOBE is > possible if the second string does not exist. > Filing this jira to improve the error handling of this case.
[jira] [Commented] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216628#comment-16216628 ] Hadoop QA commented on HADOOP-14950: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 30s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 53s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 37s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 33s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 37s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 9 new + 71 unchanged - 1 fixed = 80 total (was 72) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 2 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 18s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s{color} | {color:red} The patch generated 2 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 79m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestKDiag | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612350e | | JIRA Issue | HADOOP-14950 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893686/HADOOP-14950-branch-3.0.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5614830bfa4c 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.0 / 1e7ea66 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/13568/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | whitespace |
[jira] [Commented] (HADOOP-14972) Histogram metrics types for latency, etc.
[ https://issues.apache.org/jira/browse/HADOOP-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216627#comment-16216627 ] Steve Loughran commented on HADOOP-14972: - I don't know about the perf cost of quantiles, but yes, performance would be good. Everyone playing with metrics should spend an afternoon instrumenting an app of theirs with CodaHale metrics and Java 8. I've done something similar with Scala in the past, where you can use the way codahale probes its metrics to actually implement the lookup as closures probing the running app, rather than just having the app publish information which is often not needed at all. See also [~iyonger]'s HADOOP-14475 patch, which I have sadly neglected and which I'm aware we need to pull in. Sean: can you look at that patch before we do other things, as I don't want that patch obsoleted by later work. > Histogram metrics types for latency, etc. > - > > Key: HADOOP-14972 > URL: https://issues.apache.org/jira/browse/HADOOP-14972 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 2.9.0, 3.0.0 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > We'd like metrics to track latencies for various operations, such as > latencies for various request types, etc. This may need to be done differently > from current metrics types that are just counters of type long, and it needs > to be done intelligently as these measurements are very numerous, and are > primarily interesting due to the outliers that are unpredictably far from > normal. A few ideas on how we might implement something like this: > * An adaptive, sparse histogram type. I envision something configurable with > a maximum granularity and a maximum number of bins. Initially, datapoints > are tallied in bins with the maximum granularity. As we reach the maximum > number of bins, bins are merged in even / odd pairs. 
There's some complexity > here, especially to make it perform well and allow safe concurrency, but I > like the ability to configure reasonable limits and retain as much > granularity as possible without knowing the exact shape of the data > beforehand. > * LongMetrics named "read_latency_600ms", "read_latency_800ms" to represent > bins. This was suggested to me by [~fabbri]. I initially did not like the > idea of having either so many hard-coded bins for however many op types, but > this could also be done dynamically (we just hard-code which measurements we > take, and with what granularity to group them, e.g. read_latency, 200 ms). > The resulting dataset could be sparse and dynamic to allow for extreme > outliers, but the granularity is still pre-determined. > * We could also simply track a certain number of the highest latencies, and > basic descriptive statistics like a running average, min / max, etc. > Inherently more limited in what it can show us, but much simpler and might > still provide some insight when analyzing performance.
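The first bullet — an adaptive, sparse histogram with a bin cap and even/odd pair merging — could be sketched roughly as follows. This is a hypothetical illustration of the proposed behaviour, not an existing Hadoop metrics type; the class and method names are made up, and it assumes non-negative values (latencies):

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch of the adaptive, sparse histogram proposed above:
 * values are tallied at the finest granularity until the number of
 * occupied bins exceeds a cap, then adjacent even/odd bin pairs are
 * merged, doubling the bin width. Assumes non-negative values.
 */
public class AdaptiveHistogram {
    private final int maxBins;
    private long binWidth;                                    // current granularity, e.g. ms
    private final TreeMap<Long, Long> bins = new TreeMap<>(); // bin start -> count

    public AdaptiveHistogram(long initialBinWidth, int maxBins) {
        this.binWidth = initialBinWidth;
        this.maxBins = maxBins;
    }

    public synchronized void add(long value) {
        bins.merge((value / binWidth) * binWidth, 1L, Long::sum);
        while (bins.size() > maxBins) {
            mergePairs();                                     // halve the resolution
        }
    }

    // Doubling the width maps each even/odd pair of old bins onto one new bin.
    private void mergePairs() {
        binWidth *= 2;
        TreeMap<Long, Long> merged = new TreeMap<>();
        for (Map.Entry<Long, Long> e : bins.entrySet()) {
            merged.merge((e.getKey() / binWidth) * binWidth, e.getValue(), Long::sum);
        }
        bins.clear();
        bins.putAll(merged);
    }

    public synchronized long binWidth() { return binWidth; }

    public synchronized Map<Long, Long> snapshot() { return new TreeMap<>(bins); }

    public static void main(String[] args) {
        AdaptiveHistogram h = new AdaptiveHistogram(200, 4);  // 200 ms bins, cap of 4
        for (long latency : new long[] {50, 250, 450, 650, 1700}) {
            h.add(latency);                                   // 5th distinct bin forces a merge
        }
        System.out.println("width=" + h.binWidth() + " bins=" + h.snapshot());
        // → width=400 bins={0=2, 400=2, 1600=1}
    }
}
```

Note how the 1700 ms outlier survives as its own sparse bin while the bulk of the data loses resolution gradually, which matches the goal of retaining as much granularity as possible without knowing the shape of the data beforehand.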
[jira] [Updated] (HADOOP-14972) Histogram metrics types for latency, etc.
[ https://issues.apache.org/jira/browse/HADOOP-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-14972: Issue Type: Sub-task (was: New Feature) Parent: HADOOP-14831
[jira] [Updated] (HADOOP-14972) Histogram metrics types for latency, etc.
[ https://issues.apache.org/jira/browse/HADOOP-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-14972: Issue Type: New Feature (was: Bug)
[jira] [Updated] (HADOOP-14972) Histogram metrics types for latency, etc.
[ https://issues.apache.org/jira/browse/HADOOP-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-14972: Affects Version/s: 3.0.0 2.9.0
[jira] [Commented] (HADOOP-14957) ReconfigurationTaskStatus is exposing guava Optional in its public api
[ https://issues.apache.org/jira/browse/HADOOP-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216615#comment-16216615 ] Steve Loughran commented on HADOOP-14957: - LGTM +1 bq. I don't see any confusion about what "management tools" can be. Ambari, obviously. I don't actually know if it uses it, but the tag at least flags for anyone maintaining it in future that this is a management tool API which isn't for broad use, and which we may break across point releases. Up to the tool teams to track, which, given the nature of the product releases, they should be able to do. > ReconfigurationTaskStatus is exposing guava Optional in its public api > -- > > Key: HADOOP-14957 > URL: https://issues.apache.org/jira/browse/HADOOP-14957 > Project: Hadoop Common > Issue Type: Sub-task > Components: common >Affects Versions: 3.0.0-beta1 >Reporter: Haibo Chen >Assignee: Xiao Chen > Attachments: HADOOP-14957.01.patch, HADOOP-14957.prelim.patch > >
[jira] [Updated] (HADOOP-14966) Handle JDK-8071638 for hadoop-common
[ https://issues.apache.org/jira/browse/HADOOP-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated HADOOP-14966: --- Fix Version/s: (was: 3.0.0.) 3.0.0 > Handle JDK-8071638 for hadoop-common > > > Key: HADOOP-14966 > URL: https://issues.apache.org/jira/browse/HADOOP-14966 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha1 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Blocker > Fix For: 2.9.0, 2.8.3, 3.0.0, 3.1.0 > > Attachments: HADOOP-14966.001.patch > > > Impact modules > -- YARN nodemanager cache clean up > -- Mapreduce Log/History cleaner > Will add jira in YARN & MAPREDUCE to track the same
[jira] [Updated] (HADOOP-14966) Handle JDK-8071638 for hadoop-common
[ https://issues.apache.org/jira/browse/HADOOP-14966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated HADOOP-14966: --- Fix Version/s: 3.0.0.
[jira] [Commented] (HADOOP-14840) Tool to estimate resource requirements of an application pipeline based on prior executions
[ https://issues.apache.org/jira/browse/HADOOP-14840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216544#comment-16216544 ] Hadoop QA commented on HADOOP-14840: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 31 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 10m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-assemblies hadoop-tools/hadoop-tools-dist hadoop-tools . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 5s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 22m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 13s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 13s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 28s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-assemblies . hadoop-tools hadoop-tools/hadoop-tools-dist {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 12m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m 45s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}220m 54s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestShellBasedUnixGroupsMapping | | |
[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x
[ https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216496#comment-16216496 ] Akira Ajisaka commented on HADOOP-14178: bq. I'll run all the tests to check if there are some other test failures. I ran all the tests and found about 50 tests were broken by the upgrade :( Now I'm fixing the failures. Very tough work. > Move Mockito up to version 2.x > -- > > Key: HADOOP-14178 > URL: https://issues.apache.org/jira/browse/HADOOP-14178 > Project: Hadoop Common > Issue Type: Sub-task > Components: test >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Akira Ajisaka > Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, > HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005.patch > > > I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 > since the switch to maven in 2011. > Mockito is now at version 2.1, [with lots of Java 8 > support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. > That's not just defining actions as closures, but in supporting Optional > types, mocking methods in interfaces, etc. > It's only used for testing, and, *provided there aren't regressions*, cost of > upgrade is low. The good news: test tools usually come with good test > coverage. The bad: mockito does go deep into java bytecodes.
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Status: Open (was: Patch Available) > har file system throws ArrayIndexOutOfBoundsException > - > > Key: HADOOP-14950 > URL: https://issues.apache.org/jira/browse/HADOOP-14950 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 > Environment: CDH 5.9.2 >Reporter: Wei-Chiu Chuang >Assignee: Yu-Tang Lin > Labels: newbie > Attachments: HADOOP-14950-branch-3.0.001.patch, > HADOOP-14950-branch-3.0.002.patch > > > When listing a har file system file, it throws an AIOOBE like the following: > {noformat} > $ hdfs dfs -ls har:///abc.har > -ls: Fatal internal error > java.lang.ArrayIndexOutOfBoundsException: 1 > at org.apache.hadoop.fs.HarFileSystem$HarStatus.(HarFileSystem.java:597) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.parseMetaData(HarFileSystem.java:1201) > at > org.apache.hadoop.fs.HarFileSystem$HarMetaData.access$000(HarFileSystem.java:1098) > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:166) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2711) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) > at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325) > at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235) > at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218) > at > org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:102) > at org.apache.hadoop.fs.shell.Command.run(Command.java:165) > at org.apache.hadoop.fs.FsShell.run(FsShell.java:315) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.fs.FsShell.main(FsShell.java:372) > {noformat} > Checking the code, it looks like the _index file in the har is 
mal-formed. It > expects two strings separated by a space in each line, and this AIOOBE is > possible if the second string does not exist. > File this jira to improve the error handling of such a case.
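The improvement this jira asks for, failing with a clear error instead of an AIOOBE when an _index line lacks its second field, can be sketched like this. The class name, method, and field layout here are hypothetical illustrations, not the actual HarFileSystem code:

```java
import java.io.IOException;

// Hypothetical sketch of defensive parsing for a har _index line. Indexing
// into the split result directly throws ArrayIndexOutOfBoundsException on a
// malformed line; checking the field count first turns that into a
// diagnosable IOException that names the offending line.
public class HarIndexLineParser {
    public static String[] parseLine(String line) throws IOException {
        String[] fields = line.split(" ");
        if (fields.length < 2) {
            throw new IOException(
                "Malformed _index line (expected at least 2 space-separated "
                + "fields, got " + fields.length + "): " + line);
        }
        return fields;
    }
}
```

The point is that a user running `hdfs dfs -ls har:///abc.har` would then see which line of the _index is broken, rather than a bare "Fatal internal error" with a stack trace.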
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Attachment: HADOOP-14950-branch-3.0.002.patch
[jira] [Updated] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Tang Lin updated HADOOP-14950: - Status: Patch Available (was: Open)
[jira] [Commented] (HADOOP-14950) har file system throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HADOOP-14950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216375#comment-16216375 ] Hadoop QA commented on HADOOP-14950: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 33s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 29s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 42s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 34s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch generated 12 new + 71 unchanged - 1 fixed = 83 total (was 72) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 7 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 7m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 42s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 29s{color} | {color:red} The patch generated 2 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 73m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612350e | | JIRA Issue | HADOOP-14950 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893631/HADOOP-14950-branch-3.0.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a6448eacf30c 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.0 / 1e7ea66 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/13566/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/13566/artifact/patchprocess/whitespace-eol.txt | |