mr-framework directory always empty

2016-11-08 Thread Benjamin Ross
I'm running into an infuriating issue using HDP 2.3.6.0-3796.  I've set 
mapreduce.application.framework.path appropriately, to 
/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework, and set the 
mapreduce.application.classpath properly so it pulls in the localized 
mr-framework.  But for some reason, when localized, the mr-framework directory 
is always empty.

I've had to work around it by copying the MR framework to all of the cluster 
nodes.

Is this a bug?  Is there a workaround for this?

Thanks,
Ben
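
For reference, the two settings are normally paired like this (a minimal sketch 
using the HDP paths from the message above; the classpath must point under the 
same fragment name that follows the '#', and the tarball has to actually exist 
and be readable at that HDFS path):

  <property>
    <name>mapreduce.application.framework.path</name>
    <value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value>
  </property>
  <property>
    <!-- every entry below assumes the archive is localized as ./mr-framework,
         i.e. the fragment named after '#' above -->
    <name>mapreduce.application.classpath</name>
    <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*</value>
  </property>

If both of those look right, it may also be worth confirming that the 
mapreduce.tar.gz in HDFS is intact and readable by the account the NodeManagers 
localize it as.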




allow all users to decrypt?

2016-11-08 Thread Benjamin Ross
All,
I'm in the process of configuring our system for Hadoop encryption.  We're 
nearly done - one of the last issues is that we have a build user that needs to 
decrypt data in order to read it from HDFS.  The problem is that the build user 
is an Active Directory user, so the username is DOMAIN\build rather than just 
build.  I can't add this username to Ranger because the Ranger UI doesn't allow 
the \ character.

Ideally, I would like all users to be able to encrypt and decrypt data in 
HDFS.  That would make our lives a lot easier - it's explicitly what we want.

Is there any way to do this?  Alternatively, is there any way to add the user 
DOMAIN\build to Ranger?

Worst case scenario, I can just modify the test to set HADOOP_USER_NAME to be 
build, but I'd prefer not to do that.

Thanks in advance,
Ben
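
For comparison, on a stock Hadoop KMS (where key authorization lives in 
kms-acls.xml rather than Ranger's policy engine), letting every user generate 
and decrypt EEKs would look roughly like the sketch below. With Ranger KMS the 
same effect comes from a Ranger policy on the keys, so treat this purely as an 
illustration of the ACL model, not a drop-in fix:

  <!-- kms-acls.xml sketch: '*' means any user; these defaults apply to keys
       that have no per-key ACL of their own -->
  <property>
    <name>default.key.acl.GENERATE_EEK</name>
    <value>*</value>
  </property>
  <property>
    <name>default.key.acl.DECRYPT_EEK</name>
    <value>*</value>
  </property>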




RE: Authentication Failure talking to Ranger KMS

2016-10-11 Thread Benjamin Ross
Just for kicks I tried applying the patch in that ticket and it didn't have any 
effect.  It makes sense because my issue is on CREATE, and the ticket only has 
to do with OPEN.

Note that I don't have these issues using WebHDFS, only using httpfs, so it 
definitely seems like we're on the right track...

Thanks in advance,
Ben
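
One setting worth ruling out here (an educated guess, not something the log 
below proves): the KMS keeps its own proxy-user whitelist in kms-site.xml, 
separate from the hadoop.proxyuser.* entries in core-site.xml, and the httpfs 
path adds an extra layer of impersonation. A sketch of that block, assuming 
httpfs is the account that needs to impersonate end users at the KMS:

  <!-- kms-site.xml sketch: which account actually needs proxy privileges
       depends on who ends up calling the KMS in this chain (httpfs, hdfs,
       or both) -->
  <property>
    <name>hadoop.kms.proxyuser.httpfs.users</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.kms.proxyuser.httpfs.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.kms.proxyuser.httpfs.hosts</name>
    <value>*</value>
  </property>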




From: Benjamin Ross
Sent: Tuesday, October 11, 2016 12:02 PM
To: Wei-Chiu Chuang
Cc: user@hadoop.apache.org; u...@ranger.incubator.apache.org
Subject: RE: Authentication Failure talking to Ranger KMS

That seems promising.  But shouldn't I be able to work around it by just 
ensuring that httpfs has all necessary privileges in the KMS service under 
Ranger?

Thanks,
Ben



From: Wei-Chiu Chuang [weic...@cloudera.com]
Sent: Tuesday, October 11, 2016 11:57 AM
To: Benjamin Ross
Cc: user@hadoop.apache.org; u...@ranger.incubator.apache.org
Subject: Re: Authentication Failure talking to Ranger KMS

Seems to me you encountered this bug: 
HDFS-10481<https://issues.apache.org/jira/browse/HDFS-10481>
If you’re using CDH, this is fixed in CDH5.5.5, CDH5.7.2 and CDH5.8.2

Wei-Chiu Chuang
A very happy Clouderan

On Oct 11, 2016, at 8:38 AM, Benjamin Ross 
<br...@lattice-engines.com<mailto:br...@lattice-engines.com>> wrote:

All,
I'm trying to use httpfs to write to an encryption zone with security off.  I 
can read from an encryption zone, but I can't write to one.

Here's the applicable namenode logs.  httpfs and root both have all possible 
privileges in the KMS.  What am I missing?


2016-10-07 15:48:16,164 DEBUG ipc.Server 
(Server.java:authorizeConnection(2095)) - Successfully authorized userInfo {
  effectiveUser: "root"
  realUser: "httpfs"
}
protocol: "org.apache.hadoop.hdfs.protocol.ClientProtocol"

2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:processOneRpc(1902)) -  
got #2
2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:run(2179)) - IPC Server 
handler 9 on 8020: org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
10.41.1.64:47622 Call#2 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2016-10-07 15:48:16,165 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:root (auth:PROXY) via httpfs (auth:SIMPLE) 
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2205)
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(NameNodeRpcServer.java:create(699)) - *DIR* NameNode.create: file 
/tmp/cryptotest/hairyballs for DFSClient_NONMAPREDUCE_-1005188439_28 at 
10.41.1.64
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(FSNamesystem.java:startFileInt(2411)) - DIR* NameSystem.startFile: 
src=/tmp/cryptotest/hairyballs, holder=DFSClient_NONMAPREDUCE_-1005188439_28, 
clientMachine=10.41.1.64, createParent=true, replication=3, createFlag=[CREATE
, OVERWRITE], blockSize=134217728, 
supportedVersions=[CryptoProtocolVersion{description='Encryption zones', 
version=2, unknownValue=null}]
2016-10-07 15:48:16,167 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:hdfs (auth:SIMPLE) 
from:org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:484)
2016-10-07 15:48:16,171 DEBUG client.KerberosAuthenticator 
(KerberosAuthenticator.java:authenticate(205)) - Using fallback authenticator 
sequence.
2016-10-07 15:48:16,176 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:hdfs 
(auth:SIMPLE) 
cause:org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
2016-10-07 15:48:16,176 DEBUG ipc.Server (ProtobufRpcEngine.java:call(631)) - 
Served: create queueTime= 2 procesingTime= 10 exception= IOException
2016-10-07 15:48:16,177 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:root 
(auth:PROXY) via httpfs (auth:SIMPLE) cause:java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication 
failed, status: 403, message: Forbidden
2016-10-07 15:48:16,177 INFO  ipc.Server (Server.java:logException(2299)) - IPC 
Server handler 9 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.41.1.64:47622 
Call#2 Retry#0
java.io.IOException: java.util.concurrent.ExecutionException: 
java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:750)
at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
at 
org.apache.hadoop.hdfs.server.namenode

RE: Authentication Failure talking to Ranger KMS

2016-10-11 Thread Benjamin Ross
That seems promising.  But shouldn't I be able to work around it by just 
ensuring that httpfs has all necessary privileges in the KMS service under 
Ranger?

Thanks,
Ben



From: Wei-Chiu Chuang [weic...@cloudera.com]
Sent: Tuesday, October 11, 2016 11:57 AM
To: Benjamin Ross
Cc: user@hadoop.apache.org; u...@ranger.incubator.apache.org
Subject: Re: Authentication Failure talking to Ranger KMS

Seems to me you encountered this bug: 
HDFS-10481<https://issues.apache.org/jira/browse/HDFS-10481>
If you’re using CDH, this is fixed in CDH5.5.5, CDH5.7.2 and CDH5.8.2

Wei-Chiu Chuang
A very happy Clouderan

On Oct 11, 2016, at 8:38 AM, Benjamin Ross 
<br...@lattice-engines.com<mailto:br...@lattice-engines.com>> wrote:

All,
I'm trying to use httpfs to write to an encryption zone with security off.  I 
can read from an encryption zone, but I can't write to one.

Here's the applicable namenode logs.  httpfs and root both have all possible 
privileges in the KMS.  What am I missing?


2016-10-07 15:48:16,164 DEBUG ipc.Server 
(Server.java:authorizeConnection(2095)) - Successfully authorized userInfo {
  effectiveUser: "root"
  realUser: "httpfs"
}
protocol: "org.apache.hadoop.hdfs.protocol.ClientProtocol"

2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:processOneRpc(1902)) -  
got #2
2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:run(2179)) - IPC Server 
handler 9 on 8020: org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
10.41.1.64:47622 Call#2 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2016-10-07 15:48:16,165 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:root (auth:PROXY) via httpfs (auth:SIMPLE) 
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2205)
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(NameNodeRpcServer.java:create(699)) - *DIR* NameNode.create: file 
/tmp/cryptotest/hairyballs for DFSClient_NONMAPREDUCE_-1005188439_28 at 
10.41.1.64
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(FSNamesystem.java:startFileInt(2411)) - DIR* NameSystem.startFile: 
src=/tmp/cryptotest/hairyballs, holder=DFSClient_NONMAPREDUCE_-1005188439_28, 
clientMachine=10.41.1.64, createParent=true, replication=3, createFlag=[CREATE
, OVERWRITE], blockSize=134217728, 
supportedVersions=[CryptoProtocolVersion{description='Encryption zones', 
version=2, unknownValue=null}]
2016-10-07 15:48:16,167 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:hdfs (auth:SIMPLE) 
from:org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:484)
2016-10-07 15:48:16,171 DEBUG client.KerberosAuthenticator 
(KerberosAuthenticator.java:authenticate(205)) - Using fallback authenticator 
sequence.
2016-10-07 15:48:16,176 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:hdfs 
(auth:SIMPLE) 
cause:org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
2016-10-07 15:48:16,176 DEBUG ipc.Server (ProtobufRpcEngine.java:call(631)) - 
Served: create queueTime= 2 procesingTime= 10 exception= IOException
2016-10-07 15:48:16,177 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:root 
(auth:PROXY) via httpfs (auth:SIMPLE) cause:java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication 
failed, status: 403, message: Forbidden
2016-10-07 15:48:16,177 INFO  ipc.Server (Server.java:logException(2299)) - IPC 
Server handler 9 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.41.1.64:47622 
Call#2 Retry#0
java.io.IOException: java.util.concurrent.ExecutionException: 
java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:750)
at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2352)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2478)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2377)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:716)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405)
at 
org.apache.hadoop.hdfs.protocol.proto.Cli

Authentication Failure talking to Ranger KMS

2016-10-11 Thread Benjamin Ross
All,
I'm trying to use httpfs to write to an encryption zone with security off.  I 
can read from an encryption zone, but I can't write to one.

Here's the applicable namenode logs.  httpfs and root both have all possible 
privileges in the KMS.  What am I missing?


2016-10-07 15:48:16,164 DEBUG ipc.Server 
(Server.java:authorizeConnection(2095)) - Successfully authorized userInfo {
  effectiveUser: "root"
  realUser: "httpfs"
}
protocol: "org.apache.hadoop.hdfs.protocol.ClientProtocol"

2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:processOneRpc(1902)) -  
got #2
2016-10-07 15:48:16,164 DEBUG ipc.Server (Server.java:run(2179)) - IPC Server 
handler 9 on 8020: org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 
10.41.1.64:47622 Call#2 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2016-10-07 15:48:16,165 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:root (auth:PROXY) via httpfs (auth:SIMPLE) 
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2205)
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(NameNodeRpcServer.java:create(699)) - *DIR* NameNode.create: file 
/tmp/cryptotest/hairyballs for DFSClient_NONMAPREDUCE_-1005188439_28 at 
10.41.1.64
2016-10-07 15:48:16,166 DEBUG hdfs.StateChange 
(FSNamesystem.java:startFileInt(2411)) - DIR* NameSystem.startFile: 
src=/tmp/cryptotest/hairyballs, holder=DFSClient_NONMAPREDUCE_-1005188439_28, 
clientMachine=10.41.1.64, createParent=true, replication=3, createFlag=[CREATE
, OVERWRITE], blockSize=134217728, 
supportedVersions=[CryptoProtocolVersion{description='Encryption zones', 
version=2, unknownValue=null}]
2016-10-07 15:48:16,167 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:logPrivilegedAction(1751)) - PrivilegedAction 
as:hdfs (auth:SIMPLE) 
from:org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:484)
2016-10-07 15:48:16,171 DEBUG client.KerberosAuthenticator 
(KerberosAuthenticator.java:authenticate(205)) - Using fallback authenticator 
sequence.
2016-10-07 15:48:16,176 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:hdfs 
(auth:SIMPLE) 
cause:org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
2016-10-07 15:48:16,176 DEBUG ipc.Server (ProtobufRpcEngine.java:call(631)) - 
Served: create queueTime= 2 procesingTime= 10 exception= IOException
2016-10-07 15:48:16,177 DEBUG security.UserGroupInformation 
(UserGroupInformation.java:doAs(1728)) - PrivilegedActionException as:root 
(auth:PROXY) via httpfs (auth:SIMPLE) cause:java.io.IOException: 
java.util.concurrent.ExecutionException: java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication 
failed, status: 403, message: Forbidden
2016-10-07 15:48:16,177 INFO  ipc.Server (Server.java:logException(2299)) - IPC 
Server handler 9 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 10.41.1.64:47622 
Call#2 Retry#0
java.io.IOException: java.util.concurrent.ExecutionException: 
java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
Authentication failed, status: 403, message: Forbidden
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:750)
at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2352)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2478)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2377)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:716)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2211)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2207)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2205)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: 

Help with WebHDFS authentication: simple vs simple-dt

2016-09-27 Thread Benjamin Ross
All,
I'm in the process of setting up encryption at rest on a cluster, but I want to 
make sure that everything else remains permissive - otherwise it will break 
existing processes that we have in place.  I'm very close to getting this 
working - the last piece is that WebHDFS is not permissive:

In my local setup where I have things working, webhdfs reports the following 
when trying to create a file (note t=simple):
$ curl -i -X PUT 
'localhost:50070/webhdfs/v1/tmp/foo?op=CREATE&overwrite=true&user.name=yarn'
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Tue, 27 Sep 2016 14:52:06 GMT
Date: Tue, 27 Sep 2016 14:52:06 GMT
Pragma: no-cache
Expires: Tue, 27 Sep 2016 14:52:06 GMT
Date: Tue, 27 Sep 2016 14:52:06 GMT
Pragma: no-cache
Content-Type: application/octet-stream
Set-Cookie: 
hadoop.auth="u=yarn&p=yarn&t=simple&e=1475023926231&s=0wqlgqLNm50k/mN66qZwyCb4xUs=";
 Path=/; HttpOnly
Location: 
http://localhost:50075/webhdfs/v1/tmp/foo?op=CREATE&user.name=yarn&namenoderpcaddress=localhost:9000&createflag=&createparent=true&overwrite=true
Content-Length: 0
Server: Jetty(6.1.26.hwx)


On the cluster, however, it reports the following (note t=simple-dt)
$ curl -i -X PUT 
'http://10.41.1.6:14000/webhdfs/v1/tmp/foo?op=CREATE&overwrite=true&user.name=yarn'
HTTP/1.1 307 Temporary Redirect
Server: Apache-Coyote/1.1
Set-Cookie: 
hadoop.auth="u=yarn&p=yarn&t=simple-dt&e=1475023818932&s=9FteGx9VW06bh5dD1L9J+1ENWtY=";
 Path=/; HttpOnly
Location: 
http://10.41.1.6:14000/webhdfs/v1/tmp/foo?op=CREATE&user.name=yarn&overwrite=true&data=true
Content-Type: application/json
Content-Length: 0
Date: Tue, 27 Sep 2016 14:50:18 GMT


Note that my local setup reports the authentication type as simple, whereas the 
cluster reports simple-dt.  This is why I'm getting an authentication failure 
when trying to write a file to the cluster.  I don't want Kerberos or 
delegation tokens enabled.

Does anyone know what I need to change so that this becomes simple again?

Thanks in advance,
Ben
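
One detail that may explain the difference: port 50070 is the NameNode's 
built-in WebHDFS endpoint, while port 14000 is HttpFS, and HttpFS wraps its 
simple authentication in a delegation-token-capable handler, which is what 
advertises itself as simple-dt. The relevant knob lives in httpfs-site.xml; a 
sketch is below, with the caveat that some HttpFS versions still wrap plain 
simple auth with the -dt handler, so this alone may not change the cookie:

  <property>
    <name>httpfs.authentication.type</name>
    <value>simple</value>
  </property>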




RE: Help starting RangerKMS

2016-08-26 Thread Benjamin Ross
I actually figured this out.  It was pretty simple - the issue was that the 
Ranger KMS master key was generated when the system didn't have JCE installed.  
So I had to delete the master key and let the system regenerate it.



From: Benjamin Ross
Sent: Friday, August 26, 2016 9:49 AM
To: user@hadoop.apache.org
Subject: Help starting RangerKMS

Hey guys,
I'm trying to start the RangerKMS server and I'm running into this very obscure 
error.  Any help would be appreciated.  We have confirmed JCE is installed on 
the node running RangerKMS.  We're using Java JDK 1.7 and Ranger 0.5.0.2.3 (HDP 
2.3.6.0-3796).


[root@bodcdevhdp6 kms]# cat catalina.out
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Webapp file =./webapp, webAppName = /kms
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Adding webapp [/kms] = path [./webapp] .
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Finished init of webapp [/kms] = path [./webapp].
log4j:WARN No appenders could be found for logger 
(org.apache.catalina.loader.WebappClassLoaderBase).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
javax.crypto.BadPaddingException: Given final block not properly padded
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:811)
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:676)
at com.sun.crypto.provider.PBECipherCore.doFinal(PBECipherCore.java:422)
at 
com.sun.crypto.provider.PBEWithMD5AndTripleDESCipher.engineDoFinal(PBEWithMD5AndTripleDESCipher.java:326)
at javax.crypto.Cipher.doFinal(Cipher.java:2087)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.decryptKey(RangerMasterKey.java:192)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.decryptMasterKey(RangerMasterKey.java:100)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.getMasterKey(RangerMasterKey.java:72)
at 
org.apache.hadoop.crypto.key.RangerKeyStoreProvider.<init>(RangerKeyStoreProvider.java:93)
at 
org.apache.hadoop.crypto.key.RangerKeyStoreProvider$Factory.createProvider(RangerKeyStoreProvider.java:386)
at 
org.apache.hadoop.crypto.key.KeyProviderFactory.get(KeyProviderFactory.java:95)
at 
org.apache.hadoop.crypto.key.kms.server.KMSWebApp.contextInitialized(KMSWebApp.java:176)
at 
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5068)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5584)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1572)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1562)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

ERROR: Hadoop KMS could not be started

REASON: java.lang.NullPointerException

Stacktrace:
---
java.lang.NullPointerException
at 
org.apache.hadoop.crypto.key.kms.server.KMSWebApp.contextInitialized(KMSWebApp.java:178)
at 
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5068)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5584)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1572)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1562)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
---




Help starting RangerKMS

2016-08-26 Thread Benjamin Ross
Hey guys,
I'm trying to start the RangerKMS server and I'm running into this very obscure 
error.  Any help would be appreciated.  We have confirmed JCE is installed on 
the node running RangerKMS.  We're using Java JDK 1.7 and Ranger 0.5.0.2.3 (HDP 
2.3.6.0-3796).


[root@bodcdevhdp6 kms]# cat catalina.out
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Webapp file =./webapp, webAppName = /kms
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Adding webapp [/kms] = path [./webapp] .
Aug 25, 2016 5:19:00 PM org.apache.ranger.server.tomcat.EmbeddedServer start
INFO: Finished init of webapp [/kms] = path [./webapp].
log4j:WARN No appenders could be found for logger 
(org.apache.catalina.loader.WebappClassLoaderBase).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
javax.crypto.BadPaddingException: Given final block not properly padded
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:811)
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:676)
at com.sun.crypto.provider.PBECipherCore.doFinal(PBECipherCore.java:422)
at 
com.sun.crypto.provider.PBEWithMD5AndTripleDESCipher.engineDoFinal(PBEWithMD5AndTripleDESCipher.java:326)
at javax.crypto.Cipher.doFinal(Cipher.java:2087)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.decryptKey(RangerMasterKey.java:192)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.decryptMasterKey(RangerMasterKey.java:100)
at 
org.apache.hadoop.crypto.key.RangerMasterKey.getMasterKey(RangerMasterKey.java:72)
at 
org.apache.hadoop.crypto.key.RangerKeyStoreProvider.<init>(RangerKeyStoreProvider.java:93)
at 
org.apache.hadoop.crypto.key.RangerKeyStoreProvider$Factory.createProvider(RangerKeyStoreProvider.java:386)
at 
org.apache.hadoop.crypto.key.KeyProviderFactory.get(KeyProviderFactory.java:95)
at 
org.apache.hadoop.crypto.key.kms.server.KMSWebApp.contextInitialized(KMSWebApp.java:176)
at 
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5068)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5584)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1572)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1562)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

ERROR: Hadoop KMS could not be started

REASON: java.lang.NullPointerException

Stacktrace:
---
java.lang.NullPointerException
at 
org.apache.hadoop.crypto.key.kms.server.KMSWebApp.contextInitialized(KMSWebApp.java:178)
at 
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:5068)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5584)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1572)
at 
org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1562)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
---




RE: Issue with Hadoop Job History Server

2016-08-18 Thread Benjamin Ross
Turns out we made a stupid mistake - our system was managing to mix 
configuration between an old cluster and a new cluster.  So, things are working 
now.

Thanks,
Ben

From: Benjamin Ross
Sent: Thursday, August 18, 2016 10:05 AM
To: Rohith Sharma K S; Gao, Yunlong
Cc: user@hadoop.apache.org
Subject: RE: Issue with Hadoop Job History Server

Rohith,
Thanks - we're still having issues.  Can you help out with this?

How do you specify the done directory for an MR job?  The job history done dir 
is mapreduce.jobhistory.done-dir.  I specified the job one as 
mapreduce.jobtracker.jobhistory.location as per the documentation here.
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

They're both set to the same thing.  I did a recursive ls on HDFS and it 
doesn't seem like there are any directories called "done" with recent data in 
them.  All of the data in /mr-history is old.  Here's a summary of that ls:

drwx--   - yarn  hadoop  0 2016-07-14 16:39 /ats/done
drwxr-xr-x   - yarn  hadoop  0 2016-07-14 16:39 
/ats/done/1468528507723
drwxr-xr-x   - yarn  hadoop  0 2016-07-14 16:39 
/ats/done/1468528507723/
drwxr-xr-x   - yarn  hadoop  0 2016-07-25 20:10 
/ats/done/1468528507723//000
drwxrwxrwx   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016/07
drwxrwx---   - mapred hadoop  0 2016-07-27 13:49 /mr-history/done/2016/07/19
drwxrwxrwt   - bross  hdfs    0 2016-08-15 22:39 /tmp/hadoop-yarn/staging/history/done_intermediate
   => lots of recent data in 
/tmp/hadoop-yarn/staging/history/done_intermediate

Here's our mapred-site.xml:

  <configuration>

    <property><name>mapreduce.admin.map.child.java.opts</name><value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value></property>
    <property><name>mapreduce.admin.reduce.child.java.opts</name><value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value></property>
    <property><name>mapreduce.admin.user.env</name><value>LD_LIBRARY_PATH=/usr/hdp/2.3.6.0-3796/hadoop/lib/native:/usr/hdp/2.3.6.0-3796/hadoop/lib/native/Linux-amd64-64</value></property>
    <property><name>mapreduce.am.max-attempts</name><value>2</value></property>
    <property><name>mapreduce.application.classpath</name><value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.6.0-3796/hadoop/lib/hadoop-lzo-0.6.0.2.3.6.0-3796.jar:/etc/hadoop/conf/secure</value></property>
    <property><name>mapreduce.application.framework.path</name><value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value></property>
    <property><name>mapreduce.cluster.administrators</name><value> hadoop</value></property>
    <property><name>mapreduce.framework.name</name><value>yarn</value></property>
    <property><name>mapreduce.job.counters.max</name><value>130</value></property>
    <property><name>mapreduce.job.emit-timeline-data</name><value>false</value></property>
    <property><name>mapreduce.job.reduce.slowstart.completedmaps</name><value>0.05</value></property>
    <property><name>mapreduce.job.user.classpath.first</name><value>true</value></property>
    <property><name>mapreduce.jobhistory.address</name><value>bodcdevhdp6.dev.lattice.local:10020</value></property>
    <property><name>mapreduce.jobhistory.bind-host</name><value>0.0.0.0</value></property>
    <property><name>mapreduce.jobhistory.done-dir</name><value>/mr-history/done</value></property>
    <property><name>mapreduce.jobhistory.intermediate-done-dir</name><value>/mr-history/tmp</value></property>
    <property><name>mapreduce.jobhistory.recovery.enable</name><value>true</value></property>
    <property><name>mapreduce.jobhistory.recovery.store.class</name><value>org.apache.hadoop.mapreduce.v2.hs.HistoryServerLeveldbStateStoreService</value></property>
    <property><name>mapreduce.jobhistory.recovery.store.leveldb.path</name><value>/hadoop/mapreduce/jhs</value></property>
    <property><name>mapreduce.jobhistory.webapp.address</name><value>bodcdevhdp6.dev.lattice.local:19888</value></property>
    <property><name>mapreduce.jobtracker.jobhistory.completed.location</name><value>/mr-history/done</value></property>
    <property><name>mapreduce.map.java.opts</name><value>-Xmx4915m</value></property>
    <property><name>mapreduce.map.log.level</name><value>INFO</value></property>
    <property><name>mapreduce.map.memory.mb</name><value>6144</value></property>
    <property><name>mapreduce.map.output.compress</name><value>false</value></property>
    <property><name>mapreduce.map.sort.spill.percent</name><value>0.7</value></property>
    <property><name>mapreduce.map.speculative</name><value>false</value></property>
    <property><name>mapreduce.output.fileoutputformat.compress</name><value>false</value></property>
    <property><name>mapreduce.output.fileoutputformat.compress.type</name><value>BLOCK</value></property>
    <property><name>mapreduce.reduce.input.buffer.percent</name><value>0.0</value></property>
    <property><name>mapreduce.reduce.java.opts</name><value>-Xmx9830m</value></property>
    <property><name>mapreduce.reduce.log.lev

RE: Issue with Hadoop Job History Server

2016-08-18 Thread Benjamin Ross
ent
  0.7
    <property><name>mapreduce.reduce.shuffle.merge.percent</name><value>0.66</value></property>
    <property><name>mapreduce.reduce.shuffle.parallelcopies</name><value>30</value></property>
    <property><name>mapreduce.reduce.speculative</name><value>false</value></property>
    <property><name>mapreduce.shuffle.port</name><value>13562</value></property>
    <property><name>mapreduce.task.io.sort.factor</name><value>100</value></property>
    <property><name>mapreduce.task.io.sort.mb</name><value>2047</value></property>
    <property><name>mapreduce.task.timeout</name><value>30</value></property>
    <property><name>yarn.app.mapreduce.am.admin-command-opts</name><value>-Dhdp.version=2.3.6.0-3796</value></property>
    <property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx4915m -Dhdp.version=${hdp.version}</value></property>
    <property><name>yarn.app.mapreduce.am.log.level</name><value>INFO</value></property>
    <property><name>yarn.app.mapreduce.am.resource.mb</name><value>6144</value></property>
    <property><name>yarn.app.mapreduce.am.staging-dir</name><value>/user</value></property>

  </configuration>

Thanks,
Ben


From: Rohith Sharma K S [ksrohithsha...@gmail.com]
Sent: Thursday, August 18, 2016 3:17 AM
To: Gao, Yunlong
Cc: user@hadoop.apache.org; Benjamin Ross
Subject: Re: Issue with Hadoop Job History Server

MR jobs and the JHS should have the same done-dir configuration if it is set; 
otherwise, the staging dir should be the same for both. Make sure the jobs and 
the JHS have the same configuration values.

What usually happens is that the MR app writes the job history file in one 
location while the HistoryServer tries to read it from a different location. 
That causes the JHS to display empty jobs.

Thanks & Regards
Rohith Sharma K S
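
For a quick cross-check, these are the three settings that have to agree 
between the mapred-site.xml the jobs are submitted with and the one the 
JobHistoryServer reads (values taken from this thread). The mismatch theory 
also fits the resolution above: the recursive ls showed recent files under the 
default /tmp/hadoop-yarn/staging/history/done_intermediate rather than under 
the configured /mr-history/tmp, which is what you would expect if the jobs and 
the JHS were picking up different configurations.

  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>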

On Aug 18, 2016, at 12:35 PM, Gao, Yunlong 
<dg.gaoyunl...@gmail.com<mailto:dg.gaoyunl...@gmail.com>> wrote:

To whom it may concern,

I am using Hadoop 2.7.1.2.3.6.0-3796 with the Hortonworks distribution 
HDP-2.3.6.0-3796, and I have a question about the Hadoop Job History server.

After I set up everything, the resource manager, name nodes, and data nodes seem 
to be running fine, but the job history server is not working correctly.  The 
issue is that the job history server UI does not show any jobs, and REST calls 
to the job history server do not work either. I also notice that there are no 
logs in HDFS under the "mapreduce.jobhistory.done-dir" directory.

I have tried different things, including restarting the job history server and 
monitoring its log -- no errors or exceptions are observed. I also renamed 
/hadoop/mapreduce/jhs/mr-jhs-state (the job history server's state recovery 
store) and restarted again, but no particular error happens. I tried some other 
things I borrowed from online blogs and documents but had no luck.


Any help would be very much appreciated.

Thanks,
Yunlong









RE: anyone seen this weird "setXIncludeAware is not supported" error?

2016-07-29 Thread Benjamin Ross
My first thought is to check whether you're somehow pulling in a xerces library 
that your version of Hadoop wasn't built against.  Can you provide your pom 
file?  I would also run mvn dependency:list and see if something looks off. You 
should probably paste the output of that for others on this list as well.

Ben
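
If the dependency list does show an old xerces coming in transitively, the 
usual fix is a Maven exclusion; a sketch with placeholder coordinates 
(some.group:offending-artifact stands in for whichever dependency actually 
pulls xercesImpl in):

  <dependency>
    <groupId>some.group</groupId>
    <artifactId>offending-artifact</artifactId>
    <version>1.0</version>
    <exclusions>
      <!-- hypothetical exclusion: drop the bundled xerces so the JDK's JAXP
           implementation (which supports setXIncludeAware) is used instead -->
      <exclusion>
        <groupId>xerces</groupId>
        <artifactId>xercesImpl</artifactId>
      </exclusion>
    </exclusions>
  </dependency>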

From: Frank Luo [j...@merkleinc.com]
Sent: Friday, July 29, 2016 4:07 PM
To: user@hadoop.apache.org
Subject: anyone seen this weird "setXIncludeAware is not supported" error?

OK, this is driving me nuts.

I got a junit test case as simple as below:

    @Before
    public void setup() throws IOException {
        Job job = Job.getInstance();
        Configuration config = job.getConfiguration();

And I get an exception at Job.getInstance():

  java.lang.UnsupportedOperationException:  setXIncludeAware is not supported 
on this JAXP implementation or earlier: class 
org.apache.xerces.jaxp.DocumentBuilderFactoryImpl

  at 
javax.xml.parsers.DocumentBuilderFactory.setXIncludeAware(DocumentBuilderFactory.java:614)

at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2523)

at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)

at 
org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)

at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)

at 
org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2069)

at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:447)

at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:175)

at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:156)

at 
com.merkleinc.crkb.match.keymatching.KeyMatchingTest.setup(KeyMatchingTest.java:52)

What is strange is that the same code works on Windows but not on Linux. Even 
on Linux, only one class hits the problem; other classes run the exact same 
code fine.  In that problematic class there are four test methods: one succeeds 
and three fail.

Has anyone had a similar experience?


I have Hadoop 2.7.1, Hive 1.2.1, and HBase 1.1.2, and I am running Maven 3.3.3 
and/or 3.3.9.











RE: Question regarding WebHDFS security

2016-07-05 Thread Benjamin Ross
Thanks, Larry.  I'll need to look into the details quite a bit further, but I 
take it I can define a mapping such that requests for particular file paths 
trigger particular credentials (until everything's upgraded)?  Currently all 
requests come in using permissive auth with the username yarn.  Once we enable 
Kerberos, I'd ideally like that to translate into one set of Kerberos 
credentials if the path is /foo and a different set if the path is /bar.  This 
will only be temporary until things are fully upgraded.

Appreciate the help.
Ben



From: Larry McCay [lmc...@hortonworks.com]
Sent: Tuesday, July 05, 2016 4:23 PM
To: Benjamin Ross
Cc: David Morel; user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security

For consuming REST APIs like WebHDFS, where Kerberos is inconvenient or 
impossible, you may want to consider using a trusted proxy like Apache Knox.
It will authenticate to the backend services as knox and act on behalf of your 
custom services.
It will also allow you to authenticate to Knox from the services using a number 
of different mechanisms.

http://knox.apache.org
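
For anyone unfamiliar with Knox, a minimal sketch of a topology that fronts 
WebHDFS is below; the host and port are placeholders, and the authentication 
provider's parameters are left out because they depend on how the calling 
services will authenticate to Knox:

  <topology>
    <gateway>
      <provider>
        <role>authentication</role>
        <name>ShiroProvider</name>
        <enabled>true</enabled>
        <!-- LDAP/AD parameters go here -->
      </provider>
      <provider>
        <role>identity-assertion</role>
        <name>Default</name>
        <enabled>true</enabled>
      </provider>
    </gateway>
    <service>
      <role>WEBHDFS</role>
      <url>http://namenode-host:50070/webhdfs</url>
    </service>
  </topology>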

On Jul 5, 2016, at 2:43 PM, Benjamin Ross 
<br...@lattice-engines.com<mailto:br...@lattice-engines.com>> wrote:

Hey David,
Thanks.  Yep - that's the easy part.  Let me clarify.

Consider that we have:
1. A Hadoop cluster running without Kerberos
2. A number of services contacting that hadoop cluster and retrieving data from 
it using WebHDFS.

Clearly the services don't need to login to WebHDFS using credentials because 
the cluster isn't kerberized just yet.

Now what happens when we enable Kerberos on the cluster?  We still need to 
allow those services to contact the cluster without credentials until we can 
upgrade them.  Otherwise we'll have downtime.  So what can we do?

As a possible solution, is there any way to allow unprotected access from just 
those machines until we can upgrade them?

Thanks,
Ben






From: David Morel [dmo...@amakuru.net<mailto:dmo...@amakuru.net>]
Sent: Tuesday, July 05, 2016 2:33 PM
To: Benjamin Ross
Cc: user@hadoop.apache.org<mailto:user@hadoop.apache.org>
Subject: Re: Question regarding WebHDFS security


Le 5 juil. 2016 7:42 PM, "Benjamin Ross" 
<br...@lattice-engines.com<mailto:br...@lattice-engines.com>> a écrit :
>
> All,
> We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
> that we have several single tenant services that rely on contacting the HDFS 
> cluster over WebHDFS without credentials.  So, the concern is that once we 
> kerberize the cluster, we will no longer be able to access it without 
> credentials from these single-tenant systems, which results in a painful 
> upgrade dependency.
>
> Any suggestions for dealing with this problem in a simple way?
>
> If not, any suggestion for a better forum to ask this question?
>
> Thanks in advance,
> Ben

It's usually not super-hard to wrap your http calls with a module that handles 
Kerberos, depending on what language you use. For instance 
https://metacpan.org/pod/Net::Hadoop::WebHDFS::LWP does this.

David









RE: Question regarding WebHDFS security

2016-07-05 Thread Benjamin Ross
Hey David,
Thanks.  Yep - that's the easy part.  Let me clarify.

Consider that we have:
1. A Hadoop cluster running without Kerberos
2. A number of services contacting that hadoop cluster and retrieving data from 
it using WebHDFS.

Clearly the services don't need to login to WebHDFS using credentials because 
the cluster isn't kerberized just yet.

Now what happens when we enable Kerberos on the cluster?  We still need to 
allow those services to contact the cluster without credentials until we can 
upgrade them.  Otherwise we'll have downtime.  So what can we do?

As a possible solution, is there any way to allow unprotected access from just 
those machines until we can upgrade them?

Thanks,
Ben






From: David Morel [dmo...@amakuru.net]
Sent: Tuesday, July 05, 2016 2:33 PM
To: Benjamin Ross
Cc: user@hadoop.apache.org
Subject: Re: Question regarding WebHDFS security


Le 5 juil. 2016 7:42 PM, "Benjamin Ross" 
<br...@lattice-engines.com<mailto:br...@lattice-engines.com>> a écrit :
>
> All,
> We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
> that we have several single tenant services that rely on contacting the HDFS 
> cluster over WebHDFS without credentials.  So, the concern is that once we 
> kerberize the cluster, we will no longer be able to access it without 
> credentials from these single-tenant systems, which results in a painful 
> upgrade dependency.
>
> Any suggestions for dealing with this problem in a simple way?
>
> If not, any suggestion for a better forum to ask this question?
>
> Thanks in advance,
> Ben

It's usually not super-hard to wrap your http calls with a module that handles 
Kerberos, depending on what language you use. For instance 
https://metacpan.org/pod/Net::Hadoop::WebHDFS::LWP does this.

David







Question regarding WebHDFS security

2016-07-05 Thread Benjamin Ross
All,
We're planning the rollout of kerberizing our hadoop cluster.  The issue is 
that we have several single tenant services that rely on contacting the HDFS 
cluster over WebHDFS without credentials.  So, the concern is that once we 
kerberize the cluster, we will no longer be able to access it without 
credentials from these single-tenant systems, which results in a painful 
upgrade dependency.

Any suggestions for dealing with this problem in a simple way?

If not, any suggestion for a better forum to ask this question?

Thanks in advance,
Ben

