[jira] [Created] (HADOOP-16286) Add opentracing into KMS server/client
Wei-Chiu Chuang created HADOOP-16286:

Summary: Add opentracing into KMS server/client
Key: HADOOP-16286
URL: https://issues.apache.org/jira/browse/HADOOP-16286
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Wei-Chiu Chuang

Recently, while working on KMS-o-meter, a workload replay tool for KMS, I added opentracing to it and to the KMS server/client (courtesy of Elek, Marton's tracing work in Ozone; I simply took a part of it). The ability to understand system performance in such detail is unprecedented. We were able to fix a system scalability bug and verify the fix in record time.

Filing this jira to contribute the opentracing code in the KMS server/client back to the community.
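A minimal sketch of this kind of instrumentation, assuming the io.opentracing API (GlobalTracer, buildSpan); the KmsClient interface, method name, and span name here are hypothetical, not the actual patch:

{code:java}
import io.opentracing.Span;
import io.opentracing.Tracer;
import io.opentracing.util.GlobalTracer;

public class TracedKmsCall {
  /** Hypothetical stand-in for the KMS client operation being traced. */
  interface KmsClient {
    byte[] decryptEncryptedKey(String keyName) throws Exception;
  }

  static byte[] decryptWithTracing(KmsClient kms, String keyName)
      throws Exception {
    Tracer tracer = GlobalTracer.get();
    // One span per KMS round trip; the collected traces show where time
    // is spent across the client, the server, and the key store behind it.
    Span span = tracer.buildSpan("kms.decryptEncryptedKey").start();
    try {
      span.setTag("key.name", keyName);
      return kms.decryptEncryptedKey(keyName);
    } finally {
      span.finish();
    }
  }
}
{code}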
[jira] [Created] (HADOOP-16285) Migrate opentracing utility code from Ozone to Hadoop Common
Wei-Chiu Chuang created HADOOP-16285:

Summary: Migrate opentracing utility code from Ozone to Hadoop Common
Key: HADOOP-16285
URL: https://issues.apache.org/jira/browse/HADOOP-16285
Project: Hadoop Common
Issue Type: Sub-task
Components: tracing
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang

[~elek] created a few utility classes for Opentracing in Ozone, and the code has been tested for a while. In order for Hadoop applications to start adopting Opentracing, we should migrate the code into Hadoop Common instead of reinventing the wheel. Additionally, add the corresponding dependencies in the Maven pom.xml.
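A sketch of the initialization side such a shared utility might provide, assuming the Jaeger client as the tracer implementation (which the Ozone tracing code uses); the class and method names are hypothetical, not the migrated API:

{code:java}
import io.jaegertracing.Configuration;
import io.opentracing.Tracer;
import io.opentracing.util.GlobalTracer;

public final class TracingInit {
  private TracingInit() {
  }

  /** Hypothetical helper: register a process-wide tracer once at startup. */
  public static void initTracing(String serviceName) {
    // Configuration.fromEnv reads the JAEGER_* environment variables
    // (sampler type, agent host, ...) and falls back to defaults.
    Tracer tracer = Configuration.fromEnv(serviceName).getTracer();
    // registerIfAbsent is a no-op if a tracer was already registered.
    GlobalTracer.registerIfAbsent(tracer);
  }
}
{code}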
maven-surefire-plugin configuration for Hadoop 2.8.5
Hi all,

I'm running unit tests for Hadoop 2.8.5. I have a script that repeatedly runs the command mvn -Dtest=t1,t2,t3,... test with different test selections, and after each test command finishes I need to collect test-run statistics for other purposes. The problem is that I see lingering Java processes running surefirebooter*.jar that consume a lot of memory and CPU. Is this due to the timeout value set in the maven-surefire-plugin configuration (in hadoop-project/pom.xml)? If so, I'm wondering whether there are specific reasons (e.g., required by certain unit tests) for using a timeout value and delaying the killing of forked JVMs, instead of doing something like exit. I changed my pom configuration to use shutdown exit and no longer see lingering surefirebooter processes, but I'm not sure whether this change would break any Hadoop unit tests. Any thoughts on this?

Thanks a lot in advance.

Best,
Elaine
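For reference, a minimal sketch of the configuration change described above, assuming a Surefire version with the shutdown parameter (plugin 2.19+, values testset/exit/kill); the timeout value shown is illustrative:

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Upper bound on how long a forked JVM may run before Surefire
         steps in; illustrative value. -->
    <forkedProcessTimeoutInSeconds>900</forkedProcessTimeoutInSeconds>
    <!-- "exit" ends the forked JVM via System.exit() instead of waiting
         for it to be killed, which avoids lingering surefirebooter
         processes; the default "testset" waits for the running test set
         to finish first. -->
    <shutdown>exit</shutdown>
  </configuration>
</plugin>
{code}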
Re: Not getting JIRA email on cc: or tag
I got it sorted, thanks. FWIW your JIRA email address setting doesn't appear until you click the pencil (edit) under the Details tab of your JIRA profile page. Sorry for the wide distribution here.

-AF

On Wed, May 1, 2019 at 3:30 AM Steve Loughran wrote:
> 1. you checked your JIRA notification settings?
> 2. is it going to the right email address?
>
> On Wed, May 1, 2019 at 5:30 AM Aaron Fabbri wrote:
>> Hello,
>>
>> I haven't been getting emailed when someone tags or cc:s me in a JIRA. Is
>> there a way to change this?
>>
>> Thanks!
>> Aaron
>>
[jira] [Created] (HADOOP-16284) KMS Cache Miss Storm
Wei-Chiu Chuang created HADOOP-16284:

Summary: KMS Cache Miss Storm
Key: HADOOP-16284
URL: https://issues.apache.org/jira/browse/HADOOP-16284
Project: Hadoop Common
Issue Type: Bug
Components: kms
Affects Versions: 2.6.0
Environment: CDH 5.13.1, Kerberized, Cloudera Keytrustee Server
Reporter: Wei-Chiu Chuang

We recently stumbled upon a performance issue with KMS, where it occasionally exhibited a "No content to map" error (this cluster ran an old version that doesn't have HADOOP-14841) and jobs crashed. *We bumped the number of KMSes from 2 to 4, and the situation got even worse.*

Later, we realized this cluster had a few hundred encryption zones and a few hundred encryption keys. This is pretty unusual, because most of the deployments known to us have at most a dozen keys; in terms of number of keys, this cluster is 1-2 orders of magnitude higher than anyone else's. The high number of encryption keys increases the likelihood of a key cache miss in KMS. In Cloudera's setup, each cache miss forces KMS to sync with its backend, the Cloudera Keytrustee Server. On top of that, the high number of KMSes amplifies the latency, effectively causing a [cache miss storm|https://en.wikipedia.org/wiki/Cache_stampede].

We were able to reproduce this issue with KMS-o-meter (HDFS-14312) - I will surely come up with a better name later - and discovered a scalability bug in CKTS. The fix was then verified with the same tool.

Filing this bug so the community is aware of the issue. I don't have a solution for KMS right now, but we want to address this scalability problem in the near future because we are seeing use cases that require thousands of encryption keys.

On a side note, 4 KMSes don't work well without HADOOP-14445 (and subsequent fixes). A MapReduce job acquires at most 3 KMS delegation tokens, so in cases such as distcp it would fail to reach the 4th KMS on the remote cluster. I imagine similar issues exist for other execution engines, but I didn't test them.
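Not from the issue itself, but to make the stampede mechanics concrete: a sketch of how a coalescing (single-load-per-key) cache avoids the storm, using Guava's LoadingCache; the KeyProviderBackend interface and the TTL are hypothetical:

{code:java}
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class CoalescingKeyCache {
  /** Hypothetical stand-in for the backing key store (e.g. Keytrustee). */
  interface KeyProviderBackend {
    String fetchKeyMaterial(String keyName);
  }

  private final LoadingCache<String, String> cache;

  CoalescingKeyCache(final KeyProviderBackend backend) {
    this.cache = CacheBuilder.newBuilder()
        .expireAfterAccess(10, TimeUnit.MINUTES) // illustrative TTL
        .build(new CacheLoader<String, String>() {
          @Override
          public String load(String keyName) {
            // Guava serializes loads per key: concurrent lookups of the
            // same missing key block here while one thread fetches it,
            // rather than every miss hitting the backend at once.
            return backend.fetchKeyMaterial(keyName);
          }
        });
  }

  String getKey(String keyName) {
    return cache.getUnchecked(keyName);
  }
}
{code}

Note that this only coalesces misses within one KMS instance; with 4 independent KMSes each instance still misses once per key, which is part of the amplification described above.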
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1123/

[Apr 30, 2019 2:46:20 AM] (ajay) Revert "HDDS-973. HDDS/Ozone fail to build on Windows."
[Apr 30, 2019 2:54:25 AM] (ztang) SUBMARINE-64. Improve TonY runtime's document. Contributed by Keqiu Hu.
[Apr 30, 2019 3:06:44 AM] (ztang) YARN-9476. [YARN-9473] Create unit tests for VE plugin. Contributed by
[Apr 30, 2019 10:53:26 AM] (stevel) HADOOP-16221. S3Guard: add option to fail operation on metadata write
[Apr 30, 2019 12:27:39 PM] (elek) HDDS-1384. TestBlockOutputStreamWithFailures is failing
[Apr 30, 2019 9:04:59 PM] (eyang) YARN-6929. Improved partition algorithm for yarn remote-app-log-dir.
[Apr 30, 2019 9:52:16 PM] (todd) HDFS-3246: pRead equivalent for direct read path (#597)

-1 overall

The following subsystems voted -1:
   asflicense findbugs hadolint pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
   unit

Specific tests:

   FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
      Unread field:TimelineEventSubDoc.java:[line 56]
      Unread field:TimelineMetricSubDoc.java:[line 44]

   FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
      Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:[lines 39-346]
      Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:[line 114]
      org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:[lines 114-115]

   Failed junit tests :
      hadoop.hdfs.server.datanode.TestBPOfferService
      hadoop.hdfs.web.TestWebHdfsTimeouts
      hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
      hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2
      hadoop.yarn.client.cli.TestLogsCLI
      hadoop.yarn.applications.distributedshell.TestDistributedShell
      hadoop.mapreduce.v2.app.TestRuntimeEstimators
      hadoop.hdds.scm.container.TestContainerStateManagerIntegration
      hadoop.ozone.scm.pipeline.TestPipelineManagerMXBean
      hadoop.hdds.scm.safemode.TestSCMSafeModeWithPipelineRules
      hadoop.ozone.om.TestMultipleContainerReadWrite
      hadoop.ozone.client.rpc.TestContainerStateMachineFailures
      hadoop.ozone.web.client.TestBuckets
      hadoop.ozone.scm.TestContainerSmallFile
      hadoop.ozone.TestStorageContainerManager
      hadoop.ozone.client.rpc.TestBCSID
      hadoop.ozone.ozShell.TestOzoneDatanodeShell
      hadoop.ozone.client.rpc.TestBlockOutputStream
      hadoop.ozone.scm.TestXceiverClientMetrics
      hadoop.ozone.om.TestOmAcls
      hadoop.ozone.om.TestOzoneManager
      hadoop.ozone.client.rpc.TestCommitWatcher
      hadoop.ozone.web.client.TestKeys
      hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerHandler
      hadoop.ozone.client.rpc.TestOzoneRpcClient
      hadoop.ozone.client.rpc.TestContainerStateMachine
      hadoop.ozone.container.TestContainerReplication
      hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException
      hadoop.ozone.om.TestScmSafeMode
      hadoop.ozone.om.TestOMDbCheckpointServlet
      hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
      hadoop.ozone.om.TestOzoneManagerHA
      hadoop.ozone.om.TestOmInit
      hadoop.ozone.om.TestOmBlockVersioning
      hadoop.ozone.om.TestOzoneManagerRestInterface
      hadoop.ozone.scm.TestAllocateContainer
      hadoop.hdds.scm.pipeline.TestPipelineClose
      hadoop.ozone.ozShell.TestS3Shell
      hadoop.hdds.scm.pipeline.TestNodeFailure
      hadoop.fs.ozone.contract.ITestOzoneContractRename
      hadoop.fs.ozone.contract.ITestOzoneContractRootDir
      hadoop.fs.ozone.contract.ITestOzoneContractMkdir
      hadoop.fs.ozone.contract.ITestOzoneContractSeek
      hadoop.fs.ozone.contract.ITestOzoneContractOpen
      hadoop.fs.ozone.contract.ITestOzoneContractDelete
      hadoop.fs.ozone.contract.ITestOzoneContractDistCp
      hadoop.fs.ozone.contract.ITestOzoneContractCreate
      hadoop.ozone.freon.TestFreonWithDatanodeFastRestart
      hadoop.ozone.freon.TestRandomKeyGenerator
      hadoop.ozone.freon.TestFreonWithPipelineDestroy
      hadoop.ozone.freon.TestDataValidateWithUnsafeByteOperations
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/

No changes

-1 overall

The following subsystems voted -1:
   asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
   unit

Specific tests:

   XML : Parsing Error(s):
      hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
      hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

   FindBugs : module:hadoop-common-project/hadoop-common
      Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java

   FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
      Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]

   Failed junit tests :
      hadoop.security.TestShellBasedUnixGroupsMapping
      hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
      hadoop.hdfs.TestHFlush
      hadoop.hdfs.web.TestWebHdfsTimeouts
      hadoop.hdfs.TestParallelShortCircuitRead
      hadoop.hdfs.TestFileCreation
      hadoop.hdfs.server.datanode.TestDirectoryScanner
      hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot
      hadoop.hdfs.TestDFSStartupVersions
      hadoop.hdfs.TestRollingUpgradeDowngrade
      hadoop.hdfs.TestBlockStoragePolicy
      hadoop.hdfs.TestDFSClientSocketSize
      hadoop.cli.TestHDFSCLI
      hadoop.hdfs.TestIsMethodSupported
      hadoop.yarn.client.api.impl.TestAMRMProxy
      hadoop.registry.secure.TestSecureLogins
      hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2

   cc:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K]

   javac:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K]

   cc:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt [4.0K]

   javac:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt [308K]

   checkstyle:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-checkstyle-root.txt [16M]

   hadolint:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-hadolint.txt [4.0K]

   pathlen:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/pathlen.txt [12K]

   pylint:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-shellcheck.txt [72K]

   shelldocs:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-shelldocs.txt [8.0K]

   whitespace:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/whitespace-eol.txt [12M]
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/whitespace-tabs.txt [1.2M]

   xml:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/xml.txt [12K]

   findbugs:
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html [8.0K]
      https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K]

   javadoc:
Re: Not getting JIRA email on cc: or tag
1. you checked your JIRA notification settings?
2. is it going to the right email address?

On Wed, May 1, 2019 at 5:30 AM Aaron Fabbri wrote:
> Hello,
>
> I haven't been getting emailed when someone tags or cc:s me in a JIRA. Is
> there a way to change this?
>
> Thanks!
> Aaron
>
[jira] [Created] (HADOOP-16283) Error in reading Kerberos principals from the Keytab file
Farhan Khan created HADOOP-16283:

Summary: Error in reading Kerberos principals from the Keytab file
Key: HADOOP-16283
URL: https://issues.apache.org/jira/browse/HADOOP-16283
Project: Hadoop Common
Issue Type: Bug
Reporter: Farhan Khan

The error occurs when launching the NameNode daemon with Kerberos used for authentication. While reading the Spnego principals (HTTP/.*) from the keytab file to start the Jetty server, KerberosUtil throws an error:

{code:java}
javax.servlet.ServletException: java.io.IOException: Unexpected octets len: 16716
	at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:188)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.initializeAuthHandler(AuthenticationFilter.java:194)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:180)
	at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:139)
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:873)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:349)
	at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1406)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1368)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:778)
	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:262)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:522)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
	at org.eclipse.jetty.server.Server.start(Server.java:427)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
	at org.eclipse.jetty.server.Server.doStart(Server.java:394)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1140)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:177)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:872)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:913)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1646)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1713)
Caused by: java.io.IOException: Unexpected octets len: 16716
	at org.apache.kerby.kerberos.kerb.KrbInputStream.readCountedOctets(KrbInputStream.java:72)
	at org.apache.kerby.kerberos.kerb.KrbInputStream.readKey(KrbInputStream.java:48)
	at org.apache.kerby.kerberos.kerb.keytab.KeytabEntry.load(KeytabEntry.java:55)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.readEntry(Keytab.java:203)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.readEntries(Keytab.java:189)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.doLoad(Keytab.java:161)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.load(Keytab.java:155)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.load(Keytab.java:143)
	at org.apache.kerby.kerberos.kerb.keytab.Keytab.loadKeytab(Keytab.java:55)
	at org.apache.hadoop.security.authentication.util.KerberosUtil.getPrincipalNames(KerberosUtil.java:225)
	at org.apache.hadoop.security.authentication.util.KerberosUtil.getPrincipalNames(KerberosUtil.java:244)
	at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:152)
	... 29 more
{code}

The main problem is with reading a keytab file generated by heimdal-kdc version 7.5.0. The Keytab class of the org.apache.kerby.kerberos.kerb.keytab package deals with reading entries from the keytab file. This is the format of a keytab file:

{code:java}
keytab {
    uint16_t file_format_version;   # 0x502
    keytab_entry entries[*];
};

keytab_entry {
    int32_t size;
    uint16_t num_components;        # subtract 1 if version
{code}
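To make the quoted layout concrete, a minimal sketch (not Kerby's implementation) that reads just the header and the leading fields of the first entry, assuming the big-endian 0x502 byte order:

{code:java}
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class KeytabPeek {
  public static void main(String[] args) throws IOException {
    try (DataInputStream in =
             new DataInputStream(new FileInputStream(args[0]))) {
      // uint16_t file_format_version; 0x502 for current keytabs.
      int version = in.readUnsignedShort();
      System.out.printf("file_format_version = 0x%x%n", version);

      // int32_t size; byte length of the first keytab_entry
      // (a negative size marks a deleted "hole" entry).
      int size = in.readInt();
      System.out.println("first entry size = " + size);

      // uint16_t num_components; version 0x501 counts the realm as a
      // component, so readers subtract 1 for that version.
      int numComponents = in.readUnsignedShort();
      if (version == 0x501) {
        numComponents -= 1;
      }
      System.out.println("num_components = " + numComponents);
    }
  }
}
{code}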