[
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Olson updated OOZIE-3723:
--------------------------------
Description:
We have recently encountered two separate issues that each required an Oozie
service restart to resolve. In both cases it was apparent that incorrect
workflow-supplied configuration properties related to remote FileSystem
connectivity, used to obtain HDFS credentials for remote clusters (via
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some
kind of cache in the Oozie service or the underlying Hadoop code. These cached
values supersede the corrected values after the workflow configuration is
fixed, leaving us no known way to resolve the problem without restarting the
Oozie service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}}
files where Oozie is running had not been updated since the prior restart, so
this is not a simple case of stale configuration.
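The behavior described above is consistent with a cache keyed the way Hadoop's FileSystem cache is keyed, i.e. by (scheme, authority, user) and not by the configuration contents, so the first configuration seen for a given key is retained. A minimal self-contained sketch of that failure mode (illustrative only, not Oozie or Hadoop code; all names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class CacheSketch {
    // The key deliberately omits the configuration contents, mirroring a
    // cache keyed only on (scheme, authority, user).
    public static final Map<String, String> CACHE = new HashMap<>();

    public static String get(String scheme, String authority, String user, String conf) {
        String key = scheme + "://" + authority + "#" + user;
        // First configuration seen for this key wins, permanently.
        return CACHE.computeIfAbsent(key, k -> conf);
    }

    public static void main(String[] args) {
        String first = get("hdfs", "remote-cluster", "oozie", "bad-principal-pattern");
        String second = get("hdfs", "remote-cluster", "oozie", "corrected-pattern");
        // second is still "bad-principal-pattern": the corrected value never takes effect
        System.out.println(first + " / " + second);
    }
}
```

Under this model only a restart (i.e. discarding the cache) picks up the corrected values, which matches what we observed.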
We are running Oozie version 5.2.0 in this environment.
Complete stack traces are provided below.
Issue 1:
A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO streams: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/[email protected], doesn't match the pattern: '*/*@*'
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
    at org.apache.oozie.command.XCommand.call(XCommand.java:290)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/[email protected], doesn't match the pattern: '*/*@*'
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
    at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
    at org.apache.hadoop.ipc.Client.call(Client.java:1502)
    at org.apache.hadoop.ipc.Client.call(Client.java:1455)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
    at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
    at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
    at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:103)
    at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:100)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.oozie.action.hadoop.HDFSCredentials.obtainTokensForNamenodes(HDFSCredentials.java:99)
    at org.apache.oozie.action.hadoop.HDFSCredentials.updateCredentials(HDFSCredentials.java:65)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1546)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1082)
    ... 11 more
Caused by: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/[email protected], doesn't match the pattern: '*/*@*'
    at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:319)
    at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
    at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:843)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:839)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:839)
    ... 44 more
{noformat}
Issue 2:
A workflow mixed up FQDNs, setting
{{dfs.namenode.rpc-address.cluster.nn1}} to {{hostname.another.domain.com:8020}}
instead of {{hostname.some.domain.com:8020}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA001: Invalid host name: local host is: "hostname.some.domain.com/10.1.2.3"; destination host is: "hostname.another.domain.com":8020; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
    at org.apache.oozie.command.XCommand.call(XCommand.java:290)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.net.UnknownHostException: Invalid host name: local host is: "hostname.some.domain.com/10.1.2.3"; destination host is: "hostname.another.domain.com":8020; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at sun.reflect.GeneratedConstructorAccessor185.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:841)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:662)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:833)
    at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
    at org.apache.hadoop.ipc.Client.call(Client.java:1502)
    at org.apache.hadoop.ipc.Client.call(Client.java:1455)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
    at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
    at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
    at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:103)
    at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:100)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.oozie.action.hadoop.HDFSCredentials.obtainTokensForNamenodes(HDFSCredentials.java:99)
    at org.apache.oozie.action.hadoop.HDFSCredentials.updateCredentials(HDFSCredentials.java:65)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1546)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1082)
    ... 9 more
Caused by: java.net.UnknownHostException
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:664)
    ... 43 more
{noformat}
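If the retention is happening in Hadoop's FileSystem object cache (our assumption, not something we have confirmed), a possible mitigation would be disabling that cache for the {{hdfs}} scheme via the standard Hadoop property {{fs.hdfs.impl.disable.cache}}, at the cost of creating a new FileSystem instance per request. A hypothetical configuration fragment; whether this avoids the retention described here is untested:

```xml
<!-- Hypothetical fragment: fs.<scheme>.impl.disable.cache is a standard
     Hadoop property; its effectiveness against this issue is an assumption. -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```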
> Oozie service permanently caches workflow-supplied FileSystem connectivity
> configuration properties for obtaining HDFS Credentials until restarted
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: OOZIE-3723
> URL: https://issues.apache.org/jira/browse/OOZIE-3723
> Project: Oozie
> Issue Type: Bug
> Components: workflow
> Reporter: Andrew Olson
> Priority: Major
>
> We recently have encountered two separate issues that both required an Oozie
> service restart to resolve. In both situations it was apparent that incorrect
> workflow-supplied configuration properties related to remote FileSystem
> connectivity to support obtaining HDFS credentials for remote clusters (via
> {{{}mapreduce.job.hdfs-servers{}}}) are being retained permanently within
> some kind of cache in the Oozie service or underlying Hadoop code. These
> cached values are superseding corrected values after the workflow
> configuration is fixed, giving us no known way to fix the problem without
> restarting the Oozie service. We confirmed that the {{hdfs-site.xml}} and
> {{oozie-site.xml}} files where Oozie is running had not been updated since
> the prior restart, so not a basic case of stale configuration.
> We are running Oozie version 5.2.0 in this environment.
> Complete stack traces are provided below.
> Issue 1:
> A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to
> {{*/*@*}} but our system default is*.
> {noformat}
> org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO
> streams: java.lang.IllegalArgumentException: Server has invalid Kerberos
> principal: nn/[email protected], doesn't match the
> pattern: '*/*@*'
> at
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
> at
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
> at
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
> at org.apache.oozie.command.XCommand.call(XCommand.java:290)
> at
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
> at
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: java.io.IOException: Couldn't set up IO streams:
> java.lang.IllegalArgumentException: Server has invalid Kerberos principal:
> nn/[email protected], doesn't match the pattern:
> '*/*@*'
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
> at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
> at org.apache.hadoop.ipc.Client.call(Client.java:1502)
> at org.apache.hadoop.ipc.Client.call(Client.java:1455)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
> at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
> at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
> at
> org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
> at
> org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
> at
> org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
> at
> org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:103)
> at
> org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:100)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
> at
> org.apache.oozie.action.hadoop.HDFSCredentials.obtainTokensForNamenodes(HDFSCredentials.java:99)
> at
> org.apache.oozie.action.hadoop.HDFSCredentials.updateCredentials(HDFSCredentials.java:65)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1546)
> at
> org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1082)
> ... 11 more
> Caused by: java.lang.IllegalArgumentException: Server has invalid Kerberos
> principal: nn/[email protected], doesn't match the
> pattern: '*/*@*'
> at
> org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:319)
> at
> org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
> at
> org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
> at
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
> at
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
> at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:843)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:839)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:839)
> ... 44 more
> {noformat}
> Issue 2:
> A workflow incorrectly got FQDNs mixed up, setting
> {{dfs.namenode.rpc-address.cluster.nn1}} =
> {{hostname.another.domain.com:8020}} instead of
> {{{}hostname.some.domain.com:8020{}}}.
> {noformat}
> org.apache.oozie.action.ActionExecutorException: JA001: Invalid host name: local host is: "hostname.some.domain.com/10.1.2.3"; destination host is: "hostname.another.domain.com":8020; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
> at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
> at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
> at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
> at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
> at org.apache.oozie.command.XCommand.call(XCommand.java:290)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: java.net.UnknownHostException: Invalid host name: local host is: "hostname.some.domain.com/10.1.2.3"; destination host is: "hostname.another.domain.com":8020; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
> at sun.reflect.GeneratedConstructorAccessor185.newInstance(Unknown Source)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:841)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:662)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:833)
> at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
> at org.apache.hadoop.ipc.Client.call(Client.java:1502)
> at org.apache.hadoop.ipc.Client.call(Client.java:1455)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
> at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
> at sun.reflect.GeneratedMethodAccessor59.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
> at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
> at org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
> at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
> at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
> at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
> at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:103)
> at org.apache.oozie.action.hadoop.HDFSCredentials$1.run(HDFSCredentials.java:100)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
> at org.apache.oozie.action.hadoop.HDFSCredentials.obtainTokensForNamenodes(HDFSCredentials.java:99)
> at org.apache.oozie.action.hadoop.HDFSCredentials.updateCredentials(HDFSCredentials.java:65)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1546)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1082)
> ... 9 more
> Caused by: java.net.UnknownHostException
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:664)
> ... 43 more
> {noformat}
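> And the offending Issue 2 property as it would appear in the workflow-supplied configuration (property name and values are from the description above; the XML wrapping is only illustrative):
> {code:xml}
> <!-- Incorrect: FQDN from the wrong cluster, retained in the cache -->
> <property>
>   <name>dfs.namenode.rpc-address.cluster.nn1</name>
>   <value>hostname.another.domain.com:8020</value>
> </property>
> <!-- Intended value, ignored until the Oozie service was restarted -->
> <property>
>   <name>dfs.namenode.rpc-address.cluster.nn1</name>
>   <value>hostname.some.domain.com:8020</value>
> </property>
> {code}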
--
This message was sent by Atlassian Jira
(v8.20.10#820010)