[jira] [Created] (HADOOP-16286) Add opentracing into KMS server/client

2019-05-01 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HADOOP-16286:


 Summary: Add opentracing into KMS server/client
 Key: HADOOP-16286
 URL: https://issues.apache.org/jira/browse/HADOOP-16286
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Wei-Chiu Chuang


Recently, while working on KMS-o-meter, a workload replay tool for KMS, I 
added opentracing to it and to the KMS server/client (courtesy of Elek, 
Marton's tracing work in Ozone; I simply took a part of it). The ability to 
understand system performance in such detail is unprecedented. We were able to 
fix a system scalability bug and verify the fix in record time.

Filing this jira to contribute the opentracing code in the KMS server/client 
back to the community.
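The pattern in question can be illustrated without the actual HADOOP-16286 patch. Below is a minimal, self-contained sketch of the try-with-resources span idiom commonly used with opentracing; the `Span` class here is a hypothetical stand-in (not the io.opentracing API), and `decryptEncryptedKey` is a placeholder for a traced KMS client call:

```java
// Hypothetical sketch of span-based tracing around a KMS client call.
// "Span" is a minimal stand-in for an opentracing span, not the real API.
public class TraceSketch {
    static class Span implements AutoCloseable {
        final String op;
        final long startNanos = System.nanoTime();
        Span(String op) { this.op = op; }
        long elapsedMicros() { return (System.nanoTime() - startNanos) / 1_000; }
        @Override public void close() {
            // A real tracer would report this to a collector; we just print it.
            System.out.println(op + ": " + elapsedMicros() + " us");
        }
    }

    // Placeholder for an expensive KMS client operation.
    static byte[] decryptEncryptedKey() {
        return new byte[16];
    }

    public static void main(String[] args) {
        // Each traced call opens a span; close() records the duration.
        try (Span ignored = new Span("KMSClientProvider.decryptEncryptedKey")) {
            decryptEncryptedKey();
        }
    }
}
```

The value of the approach is that every traced operation gets a named, timed span, which is what makes per-call latency visible end to end.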



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16285) Migrate opentracing utility code from Ozone to Hadoop Common

2019-05-01 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HADOOP-16285:


 Summary: Migrate opentracing utility code from Ozone to Hadoop 
Common
 Key: HADOOP-16285
 URL: https://issues.apache.org/jira/browse/HADOOP-16285
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: tracing
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


[~elek] created a few utility classes for Opentracing in Ozone, and the code 
has been tested for a while. In order for Hadoop applications to start 
adopting Opentracing, we should migrate the code into Hadoop Common instead of 
reinventing the wheel.

Additionally, add the corresponding dependencies to the Maven pom.xml.
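The pom.xml change presumably amounts to pulling in the opentracing artifacts that Ozone already uses. A hedged sketch (the exact artifact ids and version would come from the Ozone pom, not from this issue):

```xml
<!-- Hypothetical sketch: opentracing dependencies for hadoop-common.
     Artifact ids/version are assumptions; the real values would be
     copied from the Ozone pom when the utility code is migrated. -->
<dependency>
  <groupId>io.opentracing</groupId>
  <artifactId>opentracing-api</artifactId>
  <version>0.31.0</version>
</dependency>
<dependency>
  <groupId>io.opentracing</groupId>
  <artifactId>opentracing-util</artifactId>
  <version>0.31.0</version>
</dependency>
```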






maven-surefire-plugin configuration for Hadoop 2.8.5

2019-05-01 Thread Elaine Ang
Hi all,

I'm running unit tests for Hadoop 2.8.5. I have a script that repeatedly runs
the command
   mvn -Dtest=t1,t2,t3,... test
with different test selections, and after each test command finishes I need
to collect test-run stats for other purposes.

The problem is that I saw lingering java processes running
surefirebooterx.jar that consume a lot of memory and CPU resources. Is
this owing to setting   to a timeout value
under the maven-surefire-plugin configuration (in hadoop-project/pom.xml)?

If so, I'm wondering whether there are specific reasons (e.g. required by
certain unit tests) for using a timeout value and delaying the killing of
forked JVMs, instead of doing something like exit.
I changed my pom configuration to use shutdown exit, and I no longer see
lingering java surefirebooter processes, but I'm not sure whether this
change would break any Hadoop unit tests.
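For reference, maven-surefire-plugin does expose a `shutdown` parameter (values `testset`, `exit`, `kill`) alongside `forkedProcessTimeoutInSeconds`. A sketch of the kind of change described above (this is illustrative, not the actual hadoop-project/pom.xml; the timeout value is made up):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- stop forked JVMs via System.exit instead of waiting for a kill -->
    <shutdown>exit</shutdown>
    <!-- hard cap on a forked run; value here is illustrative only -->
    <forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
  </configuration>
</plugin>
```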

Any thoughts on this? Thanks a lot in advance.

Best,
Elaine


Re: Not getting JIRA email on cc: or tag

2019-05-01 Thread Aaron Fabbri
I got it sorted thanks. FWIW your JIRA email address setting doesn't appear
until you click the pencil (edit) under the Details tab of your JIRA
profile page.

Sorry for the wide distribution here.

-AF


On Wed, May 1, 2019 at 3:30 AM Steve Loughran  wrote:

> 1. you checked your JIRA notification settings?
> 2. is it going to the right email address?
>
> On Wed, May 1, 2019 at 5:30 AM Aaron Fabbri  wrote:
>
>> Hello,
>>
>> I haven't been getting emailed when someone tags or cc:s me in a JIRA. Is
>> there a way to change this?
>>
>> Thanks!
>> Aaron
>>
>


[jira] [Created] (HADOOP-16284) KMS Cache Miss Storm

2019-05-01 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HADOOP-16284:


 Summary: KMS Cache Miss Storm
 Key: HADOOP-16284
 URL: https://issues.apache.org/jira/browse/HADOOP-16284
 Project: Hadoop Common
  Issue Type: Bug
  Components: kms
Affects Versions: 2.6.0
 Environment: CDH 5.13.1, Kerberized, Cloudera Keytrustee Server
Reporter: Wei-Chiu Chuang


We recently stumbled upon a performance issue with KMS, where it occasionally 
exhibited a "No content to map" error (this cluster ran an old version that 
doesn't have HADOOP-14841) and jobs crashed. *We bumped the number of KMSes 
from 2 to 4, and the situation got even worse.*

Later, we realized this cluster had a few hundred encryption zones and a few 
hundred encryption keys. This is pretty unusual, because most of the deployments 
known to us have at most a dozen keys. So in terms of number of keys, this 
cluster is one to two orders of magnitude above anyone else's.

The high number of encryption keys increases the likelihood of key cache misses 
in KMS. In Cloudera's setup, each cache miss forces KMS to sync with its 
backend, the Cloudera Keytrustee Server. On top of that, the high number of 
KMSes amplifies the latency, effectively causing a [cache miss 
storm|https://en.wikipedia.org/wiki/Cache_stampede].
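The stampede dynamic above can be demonstrated in miniature. One standard mitigation (not what KMS does today, and not proposed by this issue; purely an illustration of the failure mode) is to collapse concurrent misses for the same key into a single backend load, which `ConcurrentHashMap.computeIfAbsent` provides out of the box:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of cache-stampede suppression: 8 threads miss on the same
// key, but computeIfAbsent guarantees the loader runs at most once per key,
// so the (slow) backend is hit exactly once instead of 8 times.
public class StampedeDemo {
    static final AtomicInteger backendLoads = new AtomicInteger();
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    // Stand-in for the expensive KMS-to-backend sync on a cache miss.
    static String loadFromBackend(String key) {
        backendLoads.incrementAndGet();
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
        return "material-for-" + key;
    }

    static String get(String key) {
        // Concurrent callers missing on the same key block here; only one loads.
        return cache.computeIfAbsent(key, StampedeDemo::loadFromBackend);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> get("ez-key-1"));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("backend loads: " + backendLoads.get());
    }
}
```

Without the single-flight guarantee, every miss would translate into a backend round trip, which is exactly how adding KMS instances made the storm worse.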

We were able to reproduce this issue with KMS-o-meter (HDFS-14312) - I will 
surely come up with a better name later - and discovered a scalability bug in 
CKTS. The fix was then verified with the same tool.

Filing this bug so the community is aware of the issue. I don't have a 
solution in KMS for now, but we want to address this scalability problem in the 
near future, because we are seeing use cases that require thousands of 
encryption keys.

On a side note, 4 KMSes don't work well without HADOOP-14445 (and subsequent 
fixes). A MapReduce job acquires at most 3 KMS delegation tokens, so in cases 
such as distcp it would fail to reach the 4th KMS on the remote cluster. I 
imagine similar issues exist for other execution engines, but I didn't test.






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-05-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1123/

[Apr 30, 2019 2:46:20 AM] (ajay) Revert "HDDS-973. HDDS/Ozone fail to build on 
Windows."
[Apr 30, 2019 2:54:25 AM] (ztang) SUBMARINE-64. Improve TonY runtime's 
document. Contributed by Keqiu Hu.
[Apr 30, 2019 3:06:44 AM] (ztang) YARN-9476. [YARN-9473] Create unit tests for 
VE plugin. Contributed by
[Apr 30, 2019 10:53:26 AM] (stevel) HADOOP-16221. S3Guard: add option to fail 
operation on metadata write
[Apr 30, 2019 12:27:39 PM] (elek) HDDS-1384. TestBlockOutputStreamWithFailures 
is failing
[Apr 30, 2019 9:04:59 PM] (eyang) YARN-6929.  Improved partition algorithm for 
yarn remote-app-log-dir.   
[Apr 30, 2019 9:52:16 PM] (todd) HDFS-3246: pRead equivalent for direct read 
path (#597)




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
 
   Unread field:TimelineEventSubDoc.java:[line 56] 
   Unread field:TimelineMetricSubDoc.java:[line 44] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestBPOfferService 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.client.cli.TestLogsCLI 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapreduce.v2.app.TestRuntimeEstimators 
   hadoop.hdds.scm.container.TestContainerStateManagerIntegration 
   hadoop.ozone.scm.pipeline.TestPipelineManagerMXBean 
   hadoop.hdds.scm.safemode.TestSCMSafeModeWithPipelineRules 
   hadoop.ozone.om.TestMultipleContainerReadWrite 
   hadoop.ozone.client.rpc.TestContainerStateMachineFailures 
   hadoop.ozone.web.client.TestBuckets 
   hadoop.ozone.scm.TestContainerSmallFile 
   hadoop.ozone.TestStorageContainerManager 
   hadoop.ozone.client.rpc.TestBCSID 
   hadoop.ozone.ozShell.TestOzoneDatanodeShell 
   hadoop.ozone.client.rpc.TestBlockOutputStream 
   hadoop.ozone.scm.TestXceiverClientMetrics 
   hadoop.ozone.om.TestOmAcls 
   hadoop.ozone.om.TestOzoneManager 
   hadoop.ozone.client.rpc.TestCommitWatcher 
   hadoop.ozone.web.client.TestKeys 
   
hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerHandler
 
   hadoop.ozone.client.rpc.TestOzoneRpcClient 
   hadoop.ozone.client.rpc.TestContainerStateMachine 
   hadoop.ozone.container.TestContainerReplication 
   hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException 
   hadoop.ozone.om.TestScmSafeMode 
   hadoop.ozone.om.TestOMDbCheckpointServlet 
   hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient 
   hadoop.ozone.om.TestOzoneManagerHA 
   hadoop.ozone.om.TestOmInit 
   hadoop.ozone.om.TestOmBlockVersioning 
   hadoop.ozone.om.TestOzoneManagerRestInterface 
   hadoop.ozone.scm.TestAllocateContainer 
   hadoop.hdds.scm.pipeline.TestPipelineClose 
   hadoop.ozone.ozShell.TestS3Shell 
   hadoop.hdds.scm.pipeline.TestNodeFailure 
   hadoop.fs.ozone.contract.ITestOzoneContractRename 
   hadoop.fs.ozone.contract.ITestOzoneContractRootDir 
   hadoop.fs.ozone.contract.ITestOzoneContractMkdir 
   hadoop.fs.ozone.contract.ITestOzoneContractSeek 
   hadoop.fs.ozone.contract.ITestOzoneContractOpen 
   hadoop.fs.ozone.contract.ITestOzoneContractDelete 
   hadoop.fs.ozone.contract.ITestOzoneContractDistCp 
   hadoop.fs.ozone.contract.ITestOzoneContractCreate 
   hadoop.ozone.freon.TestFreonWithDatanodeFastRestart 
   hadoop.ozone.freon.TestRandomKeyGenerator 
   hadoop.ozone.freon.TestFreonWithPipelineDestroy 
   hadoop.ozone.freon.TestDataValidateWithUnsafeByteOperations 
   

Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-05-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient 
non-serializable instance field map In GlobalStorageStatistics.java:instance 
field map In GlobalStorageStatistics.java 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.security.TestShellBasedUnixGroupsMapping 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.TestHFlush 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.TestParallelShortCircuitRead 
   hadoop.hdfs.TestFileCreation 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot 
   hadoop.hdfs.TestDFSStartupVersions 
   hadoop.hdfs.TestRollingUpgradeDowngrade 
   hadoop.hdfs.TestBlockStoragePolicy 
   hadoop.hdfs.TestDFSClientSocketSize 
   hadoop.cli.TestHDFSCLI 
   hadoop.hdfs.TestIsMethodSupported 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/308/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   

Re: Not getting JIRA email on cc: or tag

2019-05-01 Thread Steve Loughran
1. you checked your JIRA notification settings?
2. is it going to the right email address?

On Wed, May 1, 2019 at 5:30 AM Aaron Fabbri  wrote:

> Hello,
>
> I haven't been getting emailed when someone tags or cc:s me in a JIRA. Is
> there a way to change this?
>
> Thanks!
> Aaron
>


[jira] [Created] (HADOOP-16283) Error in reading Kerberos principals from the Keytab file

2019-05-01 Thread Farhan Khan (JIRA)
Farhan Khan created HADOOP-16283:


 Summary: Error in reading Kerberos principals from the Keytab file
 Key: HADOOP-16283
 URL: https://issues.apache.org/jira/browse/HADOOP-16283
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Farhan Khan


The error occurs when launching the NameNode daemon with Kerberos used for 
authentication. While reading the SPNEGO principals (HTTP/.*) from the keytab 
file to start the Jetty server, KerberosUtil throws an error:
{code:java}
javax.servlet.ServletException: java.io.IOException: Unexpected octets len: 
16716
    at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:188)
    at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.initializeAuthHandler(AuthenticationFilter.java:194)
    at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:180)
    at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:139)
    at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:873)
    at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:349)
    at 
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1406)
    at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1368)
    at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:778)
    at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:262)
    at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:522)
    at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
    at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
    at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
    at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
    at org.eclipse.jetty.server.Server.start(Server.java:427)
    at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
    at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
    at org.eclipse.jetty.server.Server.doStart(Server.java:394)
    at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1140)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:177)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:872)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:913)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1646)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1713)
Caused by: java.io.IOException: Unexpected octets len: 16716
    at 
org.apache.kerby.kerberos.kerb.KrbInputStream.readCountedOctets(KrbInputStream.java:72)
    at 
org.apache.kerby.kerberos.kerb.KrbInputStream.readKey(KrbInputStream.java:48)
    at 
org.apache.kerby.kerberos.kerb.keytab.KeytabEntry.load(KeytabEntry.java:55)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.readEntry(Keytab.java:203)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.readEntries(Keytab.java:189)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.doLoad(Keytab.java:161)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.load(Keytab.java:155)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.load(Keytab.java:143)
    at org.apache.kerby.kerberos.kerb.keytab.Keytab.loadKeytab(Keytab.java:55)
    at 
org.apache.hadoop.security.authentication.util.KerberosUtil.getPrincipalNames(KerberosUtil.java:225)
    at 
org.apache.hadoop.security.authentication.util.KerberosUtil.getPrincipalNames(KerberosUtil.java:244)
    at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:152)
    ... 29 more
{code}
The main problem is with reading a keytab file generated by heimdal-kdc 
version 7.5.0. The Keytab class in the org.apache.kerby.kerberos.kerb.keytab 
package handles reading entries from a keytab file.

This is the format of a keytab file:
{code:java}
keytab {
  uint16_t file_format_version;    # 0x502
  keytab_entry entries[*];
  };

  keytab_entry {
  int32_t size;
  uint16_t num_components;   # subtract 1 if version