[ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911585#comment-16911585
 ] 

Eric Yang commented on YARN-9755:
---------------------------------

[~Prabhu Joseph] Thank you for patch 003.  MapReduce jobs now run fine, but 
there seems to be another problem.  When I try to submit a YARN service 
application, the request hangs with this exception:

{code}
2019-08-20 17:06:04,996 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for rm/eyang-2.openstacklo...@example.com (auth:KERBEROS)
2019-08-20 17:06:05,003 INFO org.apache.hadoop.ipc.Server: Connection from 172.26.111.18:34295 for protocol org.apache.hadoop.yarn.api.ApplicationClientProtocolPB is unauthorized for user hbase (auth:PROXY) via rm/eyang-2.openstacklo...@example.com (auth:KERBEROS)
2019-08-20 17:06:05,004 INFO org.apache.hadoop.io.retry.RetryInvocationHandler: org.apache.hadoop.security.authorize.AuthorizationException: User: rm/eyang-2.openstacklo...@example.com is not allowed to impersonate hbase, while invoking ApplicationClientProtocolPBClientImpl.getApplications over rm2 after 1 failover attempts. Trying to failover after sleeping for 35311ms.
2019-08-20 17:06:10,568 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
{code}

After a few retries, the ResourceManager logs this error:

{code}
2019-08-20 17:09:21,360 ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Failed to create service c: {}
java.net.ConnectException: Call From eyang-2.openstacklocal/172.26.111.18 to eyang-1.openstacklocal:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.GeneratedConstructorAccessor70.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:757)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1557)
        at org.apache.hadoop.ipc.Client.call(Client.java:1499)
        at org.apache.hadoop.ipc.Client.call(Client.java:1396)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy96.getApplications(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplications(ApplicationClientProtocolPBClientImpl.java:316)
        at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy97.getApplications(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplications(YarnClientImpl.java:629)
        at org.apache.hadoop.yarn.service.client.ServiceClient.verifyNoLiveAppInRM(ServiceClient.java:952)
        at org.apache.hadoop.yarn.service.client.ServiceClient.actionCreate(ServiceClient.java:555)
        at org.apache.hadoop.yarn.service.webapp.ApiServer$2.run(ApiServer.java:142)
        at org.apache.hadoop.yarn.service.webapp.ApiServer$2.run(ApiServer.java:135)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
        at org.apache.hadoop.yarn.service.webapp.ApiServer.createService(ApiServer.java:135)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
        at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
        at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
        at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
        at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
        at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
        at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1780)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
        at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:180)
        at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
        at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
        at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.apache.hadoop.security.http.CrossOriginFilter.doFilter(CrossOriginFilter.java:98)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
        at org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.doFilter(ProxyUserAuthenticationFilter.java:104)
        at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1645)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:539)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
{code}

This feature creates a circular dependency.  The "fs" object used by the RM 
wants to use the HDFS copy of the YARN configuration, but the proxy ACL list is 
in common-site.xml.  The RM cannot perform impersonation because it does not 
have the ACL configuration it needs to perform the HDFS operation that fetches 
the configuration.  The workaround is to ensure that the fs object used by the 
RM uses bootstrapConf instead of the HDFS copy of the YARN configuration.  
However, the workaround may defeat the purpose of this feature unless we are 
very careful that the HDFS copy of the configuration is used only by 
applications that run in the YARN framework, while the YARN framework itself 
uses the bootstrap copy.  Is this a statement that we can confirm?
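
For reference, the "is not allowed to impersonate hbase" check in the first 
log is driven by the hadoop.proxyuser.* properties.  Below is a minimal sketch 
of the kind of ACL entries the RM would need to see in whichever configuration 
it actually loads.  The short name "rm" and the values are assumptions for 
illustration only; the real short name depends on the auth_to_local mapping of 
the RM principal, and a real cluster would likely use a narrower host list.

{code}
<!-- Assumption: the RM principal maps to the short user name "rm". -->
<property><name>hadoop.proxyuser.rm.hosts</name><value>*</value></property>
<!-- Assumption: only hbase needs to be impersonated in this test. -->
<property><name>hadoop.proxyuser.rm.users</name><value>hbase</value></property>
{code}

If entries like these exist only in a copy that the RM cannot read until 
impersonation already works, the RM hits the circular dependency described 
above.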

> RM fails to start with FileSystemBasedConfigurationProvider
> -----------------------------------------------------------
>
>                 Key: YARN-9755
>                 URL: https://issues.apache.org/jira/browse/YARN-9755
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9755-001.patch, YARN-9755-002.patch, YARN-9755-003.patch
>
>
> RM fails to start with the exception below when 
> FileSystemBasedConfigurationProvider is used.
> *Exception:*
> {code}
> 2019-08-16 12:05:33,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: java.io.IOException: Filesystem closed
>         at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
> Caused by: java.io.IOException: java.io.IOException: Filesystem closed
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         ... 14 more
> Caused by: java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
>         at org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
> {code}
> FileSystemBasedConfigurationProvider uses the cached FileSystem, which 
> causes the issue.
> *Configs:*
> {code}
> <property><name>yarn.resourcemanager.configuration.provider-class</name><value>org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider</value></property>
> <property><name>yarn.resourcemanager.configuration.file-system-based-store</name><value>/yarn/conf</value></property>
> [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
> -rw-r--r--   3 yarn supergroup       4138 2019-08-16 13:09 /yarn/conf/capacity-scheduler.xml
> -rw-r--r--   3 yarn supergroup        494 2019-08-16 11:41 /yarn/conf/core-site.xml
> -rw-r--r--   3 yarn supergroup      11392 2019-08-16 11:52 /yarn/conf/hadoop-policy.xml
> -rw-r--r--   3 yarn supergroup      11492 2019-08-16 11:41 /yarn/conf/yarn-site.xml
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
