Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/

[Sep 24, 2016 5:51:03 PM] (naganarasimha_gr) YARN-5664. Fix Yarn documentation to link to correct versions.
[Sep 25, 2016 5:32:46 AM] (brahma) HDFS-10869. Remove the unused method InodeId#checkId(). Contributed by
[Sep 25, 2016 12:06:30 PM] (naganarasimha_gr) YARN-3877. YarnClientImpl.submitApplication swallows exceptions.

-1 overall

The following subsystems voted -1:
    compile unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc javac

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    Failed junit tests:
        hadoop.ipc.TestRPCWaitForProxy
        hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
        hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
        hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.hdfs.TestCrcCorruption
        hadoop.yarn.server.nodemanager.recovery.TestNMLeveldbStateStoreService
        hadoop.yarn.server.nodemanager.TestNodeManagerShutdown
        hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager
        hadoop.yarn.server.timeline.TestRollingLevelDB
        hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
        hadoop.yarn.server.timeline.TestTimelineDataManager
        hadoop.yarn.server.timeline.TestLeveldbTimelineStore
        hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore
        hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore
        hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
        hadoop.yarn.server.timelineservice.storage.common.TestRowKeys
        hadoop.yarn.server.timelineservice.storage.common.TestKeyConverters
        hadoop.yarn.server.timelineservice.storage.common.TestSeparator
        hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
        hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore
        hadoop.yarn.server.resourcemanager.TestRMRestart
        hadoop.yarn.server.resourcemanager.TestResourceTrackerService
        hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
        hadoop.yarn.server.TestContainerManagerSecurity
        hadoop.yarn.client.api.impl.TestNMClient
        hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore
        hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient
        hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore
        hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorage
        hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
        hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
        hadoop.yarn.server.timelineservice.storage.TestPhoenixOfflineAggregationWriterImpl
        hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
        hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
        hadoop.yarn.applications.distributedshell.TestDistributedShell
        hadoop.mapred.TestShuffleHandler
        hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService

    Timed out junit tests:
        org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache
        org.apache.hadoop.mapred.TestMRIntermediateDataEncryption
        org.apache.hadoop.mapred.TestMROpportunisticMaps

    compile:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-compile-root.txt [308K]

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-compile-root.txt [308K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-compile-root.txt [308K]

    unit:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [120K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [440K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [56K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt [52K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/105/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice.txt [20K]
[jira] [Created] (HADOOP-13655) document object store use with fs shell and distcp
Steve Loughran created HADOOP-13655:
---------------------------------------

             Summary: document object store use with fs shell and distcp
                 Key: HADOOP-13655
                 URL: https://issues.apache.org/jira/browse/HADOOP-13655
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: documentation, fs, fs/s3
    Affects Versions: 2.7.3
            Reporter: Steve Loughran
            Assignee: Steve Loughran
            Priority: Minor

There are no specific docs for working with object stores from the {{hadoop fs}} shell or in distcp; people either suffer from this (performance, billing) or learn what to do through trial and error. Add a section to both the fs shell and distcp docs covering use with object stores.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-13654) S3A create() to support asynchronous check of dest & parent paths
Steve Loughran created HADOOP-13654:
---------------------------------------

             Summary: S3A create() to support asynchronous check of dest & parent paths
                 Key: HADOOP-13654
                 URL: https://issues.apache.org/jira/browse/HADOOP-13654
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 2.7.3
            Reporter: Steve Loughran

One source of delays in S3A is the need to check in create() whether the destination path exists; this makes sure the operation isn't trying to overwrite a directory.

# This is slow: 1-4 HTTPS requests.
# The code doesn't seem to check the entire parent path to make sure there isn't a file as a parent (which raises the question: shouldn't we have a contract test for this?).
# Even with the create overwrite=false check, the fact that the new object isn't created until the output stream is close()'d means that the check has race conditions.

Instead of doing a synchronous check in create(), we could do an asynchronous check of the parent directory tree. If any error surfaced, it could be cached and then thrown on the next call to write(), flush() or close(); that is, the failure of a create due to path problems would not surface immediately on the create() call, *but before any writes were committed*. The full directory tree can/should be checked, and its results remembered. This would allow the post-commit cleanup to issue delete() requests purely for those paths (if any) which referred to directories.

As well as the need to use the AWS thread pool, there's a bit of complexity in cancelling multipart uploads: the output stream needs to know that the request failed and that the multipart upload should be aborted.

If the complexity of the asynchronous calls can be coped with, *and client code is happy to accept errors in any IO call to the output stream*, then the initial overhead at file creation could be skipped.
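The ticket carries no code, so here is a minimal self-contained sketch of the deferred-failure pattern it describes; the class, method names and the checkPath supplier are all hypothetical stand-ins (the real S3A change would use the AWS thread pool and the actual HEAD/LIST probes):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Illustration only, not the S3A implementation: create() launches the
 * destination/parent-path check asynchronously; any failure is cached and
 * rethrown on the next write()/flush()/close() call, i.e. before anything
 * is committed, rather than blocking create() itself.
 */
class DeferredCheckOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final Future<?> pathCheck;
    private final AtomicReference<IOException> failure = new AtomicReference<>();

    DeferredCheckOutputStream(OutputStream delegate, ExecutorService pool,
                              Callable<Void> checkPath) {
        this.delegate = delegate;
        // Kick off the path validation in the background.
        this.pathCheck = pool.submit(() -> {
            try {
                checkPath.call();
            } catch (Exception e) {
                failure.set(new IOException("destination/parent path check failed", e));
            }
        });
    }

    /** Surface a cached check failure, if one has arrived by now. */
    private void maybeThrow() throws IOException {
        IOException e = failure.get();
        if (e != null) {
            throw e;
        }
    }

    @Override
    public void write(int b) throws IOException {
        maybeThrow();          // writes racing the check may still pass...
        delegate.write(b);
    }

    @Override
    public void flush() throws IOException {
        maybeThrow();
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            pathCheck.get();   // ...but close() always waits for the verdict
        } catch (InterruptedException | ExecutionException ignored) {
            // failure, if any, was recorded in the AtomicReference
        }
        maybeThrow();          // fail before the final "commit"
        delegate.close();
    }
}
```

Note the race the JIRA accepts by design: an early write() may succeed before the background check has finished, but close() blocks on the check, so a bad path always fails before the object is materialised.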
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/

[Sep 24, 2016 5:51:03 PM] (naganarasimha_gr) YARN-5664. Fix Yarn documentation to link to correct versions.
[Sep 25, 2016 5:32:46 AM] (brahma) HDFS-10869. Remove the unused method InodeId#checkId(). Contributed by

-1 overall

The following subsystems voted -1:
    asflicense unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    Failed junit tests:
        hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
        hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
        hadoop.yarn.server.TestContainerManagerSecurity

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-compile-cc-root.txt [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-compile-javac-root.txt [168K]

    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-checkstyle-root.txt [16M]

    pylint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-patch-pylint.txt [16K]

    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-patch-shellcheck.txt [20K]

    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-patch-shelldocs.txt [16K]

    whitespace:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/whitespace-eol.txt [11M]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/whitespace-tabs.txt [1.3M]

    javadoc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/diff-javadoc-javadoc-root.txt [2.2M]

    unit:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt [12K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [268K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt [124K]

    asflicense:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/175/artifact/out/patch-asflicense-problems.txt [4.0K]

Powered by Apache Yetus 0.4.0-SNAPSHOT
http://yetus.apache.org
[jira] [Created] (HADOOP-13653) ZKDelegationTokenSecretManager curator client seems to rapidly connect & disconnect from ZK
Alex Ivanov created HADOOP-13653:
---------------------------------------

             Summary: ZKDelegationTokenSecretManager curator client seems to rapidly connect & disconnect from ZK
                 Key: HADOOP-13653
                 URL: https://issues.apache.org/jira/browse/HADOOP-13653
             Project: Hadoop Common
          Issue Type: Bug
          Components: kms
            Reporter: Alex Ivanov
            Priority: Critical

At times, KMS gets into a connect/disconnect loop with Zookeeper. It is not clear what causes the connection to be closed. I didn't see any issues on the ZK server side, so the issue must reside on the client side.

{{Example errors}}
NOTE: I had to filter the logs heavily since they were many GB in size. What is left is an illustration of the delegation token creations, and the Zookeeper sessions getting lost and re-established over the course of 2 hours.

{code}
2016-09-25 01:43:04,377 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [75027a21ab399aa7789d6907d70fadc4, 46]
2016-09-25 01:43:04,557 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [1106d0754d43dcf29324d7be737f51f0, 46]
2016-09-25 01:43:11,846 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [4426092c861f49c6ba0c60b49b9539e5, 46]
2016-09-25 01:43:48,974 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [a99efff2705d6489deb059098f18818f, 46]
2016-09-25 01:43:49,174 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [398b5962fd647880961ba5e86a77b414, 46]
2016-09-25 01:44:03,359 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [413187e62a21b5459422b5c524315d06, 46]
2016-09-25 01:44:03,625 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [7cc2c0d82edd40e7e6f6f40af20d04d3, 46]
2016-09-25 01:44:06,062 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [bd9394fce20607c12bc00104bea49284, 46]
2016-09-25 01:44:07,134 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [7dad3bd10526517e5e1cfccd2e96074a, 46]
2016-09-25 01:44:07,230 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [a712ed40687580647d070c9c7f525e15, 46]
2016-09-25 01:44:48,481 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [44bfefa31192c68e3cc053eec4e57e14, 46]
2016-09-25 01:44:48,522 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [67efc2aa65eeba701ad7d3d7bab51def, 46]
2016-09-25 01:44:50,259 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [b43e641f58dfbd2c72550ab6804f37d1, 46]
2016-09-25 01:44:54,271 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [ac2fbcf404c633759b75e6d6aae00e05, 46]
2016-09-25 01:44:56,141 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [cdbd224079a4a10400d00d0b8eece008, 46]
2016-09-25 01:45:01,328 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [e03218f4835524f3d05519d27bb04e35, 46]
2016-09-25 01:45:02,728 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [569ae6d666d584b6843fffc47a63d147, 46]
2016-09-25 01:45:02,832 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [c9048271483da234c12f75569b9513c6, 46]
2016-09-25 01:45:05,536 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [f519d621389e41b63e8d92b4cb15f832, 46]
2016-09-25 01:45:07,886 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [45cf6ba58b2bb348ac5e88fa18fe9dad, 46]
2016-09-25 01:47:24,346 WARN ConnectionState - Connection attempt unsuccessful after 66294 (greater than max timeout of 6). Resetting connection and trying again with a new connection.
2016-09-25 01:47:25,120 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [f160a865db69ef33548f146c9b3b84c6, 46]
2016-09-25 01:47:25,276 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [9d60add471464e01ef691c43bd901d96, 46]
2016-09-25 01:47:28,739 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [659ecabf02ff809736202a8484ff2be8, 46]
2016-09-25 01:48:33,233 WARN ConnectionState - Connection attempt unsuccessful after 64494 (greater than max timeout of 6). Resetting connection and trying again with a new connection.
2016-09-25 01:48:33,306 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [15b87dd6c2251177d8db9dda415d0e06, 46]
2016-09-25 01:48:33,459 INFO AbstractDelegationTokenSecretManager - Creating password for identifier: [306aa796017aab2b559bf503f81175e0, 46]
2016-09-25 01:48:34,665 INFO AbstractDelegationTokenSecretManager - Creating password for identifier:
[jira] [Created] (HADOOP-13652) ZKDelegationTokenSecretManager doesn't seem to honor ZK connection/session timeouts
Alex Ivanov created HADOOP-13652:
---------------------------------------

             Summary: ZKDelegationTokenSecretManager doesn't seem to honor ZK connection/session timeouts
                 Key: HADOOP-13652
                 URL: https://issues.apache.org/jira/browse/HADOOP-13652
             Project: Hadoop Common
          Issue Type: Bug
          Components: kms
            Reporter: Alex Ivanov

Looking at some of the errors I've seen due to Zookeeper connection issues from KMS, it doesn't seem like the following timeouts are picked up:

{code}
package org.apache.hadoop.security.token.delegation;

public abstract class ZKDelegationTokenSecretManager extends AbstractDelegationTokenSecretManager {
    public static final int ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT = 1;
    public static final int ZK_DTSM_ZK_CONNECTION_TIMEOUT_DEFAULT = 1;
    ...
}
{code}

Instead, the connection and session timeouts are 15 and 60 seconds respectively: the Curator defaults.

{code}
package org.apache.curator.framework;

public class CuratorFrameworkFactory {
    private static final int DEFAULT_SESSION_TIMEOUT_MS =
        Integer.getInteger("curator-default-session-timeout", 60 * 1000);
    private static final int DEFAULT_CONNECTION_TIMEOUT_MS =
        Integer.getInteger("curator-default-connection-timeout", 15 * 1000);
    ...
}
{code}

It looks like DelegationTokenAuthenticationFilter is setting the curator client, and that may cause an issue:

{code}
package org.apache.hadoop.security.token.delegation.web;

public class DelegationTokenAuthenticationFilter extends AuthenticationFilter {
    protected void initializeAuthHandler(String authHandlerClassName,
            FilterConfig filterConfig) throws ServletException {
        ZKDelegationTokenSecretManager.setCurator((CuratorFramework)
            filterConfig.getServletContext().getAttribute(
                ZKSignerSecretProvider.ZOOKEEPER_SIGNER_SECRET_PROVIDER_CURATOR_CLIENT_ATTRIBUTE));
        super.initializeAuthHandler(authHandlerClassName, filterConfig);
        ZKDelegationTokenSecretManager.setCurator(null);
    }
{code}

Example errors:

{code}
2016-09-25 01:46:33,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:46:33,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:46:34,028 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15976)
2016-09-25 01:46:34,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (16001)
2016-09-25 01:46:37,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (19001)
2016-09-25 01:46:40,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (22001)
2016-09-25 01:46:49,055 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (31003)
2016-09-25 01:46:52,029 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (33977)
2016-09-25 01:47:05,344 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (47292)
2016-09-25 01:47:09,345 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (51292)
2016-09-25 01:47:24,346 WARN ConnectionState - Connection attempt unsuccessful after 66294 (greater than max timeout of 6). Resetting connection and trying again with a new connection.
2016-09-25 01:47:43,740 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:47:43,740 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
{code}

There are also some sporadic connection issues between KMS and Zookeeper; that's why I'm still trying to pinpoint them, but essentially KMS can get into a perpetual connect/disconnect cycle from which it eventually recovers (a restart also helps). I'm mentioning this in case it is related to this jira.
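If the filter's injected client is the problem, one workaround is to build the Curator client explicitly with the intended timeouts before handing it over via setCurator(). A minimal sketch using the public Curator builder API; the connect string, timeout values and retry policy below are placeholder assumptions, not values taken from this report:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Sketch only: override Curator's 60s session / 15s connection defaults
// instead of relying on whatever the servlet context injected.
public class ExplicitTimeoutCurator {
    public static CuratorFramework build(String connectString,
                                         int sessionTimeoutMs,
                                         int connectionTimeoutMs) {
        CuratorFramework client = CuratorFrameworkFactory.builder()
            .connectString(connectString)               // e.g. "host1:2181,host2:2181,host3:2181"
            .sessionTimeoutMs(sessionTimeoutMs)         // replaces the 60 000 ms default
            .connectionTimeoutMs(connectionTimeoutMs)   // replaces the 15 000 ms default
            .retryPolicy(new ExponentialBackoffRetry(1000, 3))
            .build();
        client.start();
        return client;
    }
}
```

The resulting client could then be passed to ZKDelegationTokenSecretManager.setCurator() before the filter initializes, so the secret manager's ZK sessions use the configured timeouts rather than the Curator defaults seen in the logs above.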