[jira] [Commented] (YARN-1329) yarn-config.sh overwrites YARN_CONF_DIR indiscriminately
[ https://issues.apache.org/jira/browse/YARN-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534806#comment-14534806 ]

Roman Shaposhnik commented on YARN-1329:
----------------------------------------

[~aw] since you re-wrote the shell logic, I am pretty sure this can be closed as 'no longer applicable'. Objections?

yarn-config.sh overwrites YARN_CONF_DIR indiscriminately
--------------------------------------------------------
Key: YARN-1329
URL: https://issues.apache.org/jira/browse/YARN-1329
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, resourcemanager
Reporter: Aaron Gottlieb
Assignee: haosdent
Labels: BB2015-05-TBR, easyfix
Attachments: YARN-1329.patch

The script yarn-daemons.sh calls {code}${HADOOP_LIBEXEC_DIR}/yarn-config.sh{code}, and yarn-config.sh overwrites any previously set value of the environment variable YARN_CONF_DIR, starting at line 40:

{code:title=yarn-config.sh|borderStyle=solid}
# check to see if the conf dir is given as an optional argument
if [ $# -gt 1 ]
then
  if [ "--config" = "$1" ]
  then
    shift
    confdir=$1
    shift
    YARN_CONF_DIR=$confdir
  fi
fi

# Allow alternate conf dir location.
export YARN_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
{code}

The last line should check for an existing YARN_CONF_DIR first:

{code}
DEFAULT_CONF_DIR=${HADOOP_CONF_DIR:-$YARN_HOME/conf}
export YARN_CONF_DIR=${YARN_CONF_DIR:-$DEFAULT_CONF_DIR}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
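The proposed fix relies on the shell's `${VAR:-default}` parameter expansion, which falls back to the default only when the variable is unset or empty. A minimal standalone sketch of the pattern (the directory names below are made up for illustration):

```shell
#!/bin/sh
# Standalone demo of the ${VAR:-default} pattern the proposed patch uses;
# the conf directory paths are illustrative placeholders.
DEFAULT_CONF_DIR="/opt/hadoop/conf"

# Case 1: YARN_CONF_DIR is unset, so the default wins.
unset YARN_CONF_DIR
export YARN_CONF_DIR="${YARN_CONF_DIR:-$DEFAULT_CONF_DIR}"
echo "$YARN_CONF_DIR"    # /opt/hadoop/conf

# Case 2: a pre-set YARN_CONF_DIR survives instead of being overwritten.
YARN_CONF_DIR="/etc/hadoop/conf.cluster"
export YARN_CONF_DIR="${YARN_CONF_DIR:-$DEFAULT_CONF_DIR}"
echo "$YARN_CONF_DIR"    # /etc/hadoop/conf.cluster
```

This is exactly why the current `export YARN_CONF_DIR=${HADOOP_CONF_DIR:-...}` line clobbers the caller's setting: it expands HADOOP_CONF_DIR, never consulting an existing YARN_CONF_DIR.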
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534832#comment-14534832 ]

Roman Shaposhnik commented on YARN-1050:
----------------------------------------

Looking at low-hanging fruit as part of the bug bash -- let me know if I can help with this one.

Document the Fair Scheduler REST API
------------------------------------
Key: YARN-1050
URL: https://issues.apache.org/jira/browse/YARN-1050
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation, fairscheduler
Reporter: Sandy Ryza
Assignee: Kenji Kikushima
Labels: BB2015-05-TBR
Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050.patch

The documentation should be placed here, alongside the Capacity Scheduler documentation: http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API
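For context, the Cluster Scheduler API that the documentation would cover is served by the ResourceManager web service; a hedged sketch of how one would query it (the host name is a placeholder, and 8088 is only the usual default RM web port -- your cluster may differ):

```shell
#!/bin/sh
# Hedged sketch: compose the Cluster Scheduler REST URL for a hypothetical
# ResourceManager. The host/port are placeholders, not a real cluster.
RM="http://resourcemanager.example.com:8088"
URL="$RM/ws/v1/cluster/scheduler"
echo "$URL"
# Against a live cluster one would then fetch it, e.g.:
#   curl -s -H 'Accept: application/json' "$URL"
```

With the Fair Scheduler enabled, the same endpoint would return the fair-scheduler-specific queue information that this JIRA asks to document.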
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865830#comment-13865830 ]

Roman Shaposhnik commented on YARN-888:
---------------------------------------

+1

clean up POM dependencies
-------------------------
Key: YARN-888
URL: https://issues.apache.org/jira/browse/YARN-888
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Attachments: YARN-888.patch, yarn-888-2.patch

Intermediate 'pom' modules define dependencies that are inherited by the leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules as in common, hdfs, and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependency.
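As a hedged illustration of the normalization being proposed (the artifact names are examples, not a complete list for any real module), each leaf module would declare the dependencies it actually uses, typically leaving versions centralized in the root pom's dependencyManagement:

```xml
<!-- Illustrative sketch only: a leaf module declares its own dependencies
     instead of inheriting them from an intermediate 'pom' module.
     Versions are assumed to be managed in the root pom. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
  </dependency>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
  </dependency>
</dependencies>
```

With this layout an IDE importing a single leaf module sees its full dependency set directly, rather than having to resolve it through intermediate aggregator poms.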
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783250#comment-13783250 ]

Roman Shaposhnik commented on YARN-1253:
----------------------------------------

[~vinodkv] I think changing the name of the JIRA is a fair request. Done.

Changes to LinuxContainerExecutor to use cgroups in unsecure mode
-----------------------------------------------------------------
Key: YARN-1253
URL: https://issues.apache.org/jira/browse/YARN-1253
Project: Hadoop YARN
Issue Type: New Feature
Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Roman Shaposhnik
Priority: Blocker

When using cgroups, YARN requires LCE to be configured in the cluster to start containers, and LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in a non-secure setup this presents a couple of issues:
* LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
* Because users can impersonate other users, any user would have access to any local file of other users

The second issue in particular is undesirable, as a user could get access to the ssh keys of other users on the nodes or, if there are NFS mounts, get to other users' data outside of the cluster.
[jira] [Updated] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1253:
-----------------------------------
Summary: Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode (was: Changes to LinuxContainerExecutor to run containers as a single dedicated user in unsecure mode)
[jira] [Updated] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1253:
-----------------------------------
Summary: Changes to LinuxContainerExecutor to run containers as a single dedicated user in unsecure mode (was: Changes to LinuxContainerExecutor to use cgroups in unsecure mode)
[jira] [Updated] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1253:
-----------------------------------
Attachment: YARN-1253.patch.txt
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783288#comment-13783288 ]

Roman Shaposhnik commented on YARN-1253:
----------------------------------------

Attached a patch, please review. I also deployed a cluster with the patched LCE and tried it out running simple jobs. Everything seems to work as expected.
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783626#comment-13783626 ]

Roman Shaposhnik commented on YARN-1253:
----------------------------------------

Here's a link to a recap of the cgroups decisions at freedesktop.org: http://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups/
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783624#comment-13783624 ]

Roman Shaposhnik commented on YARN-1253:
----------------------------------------

[~acmurthy] on the subject of moving the cgroups management functionality to a standalone module -- it may be of interest to note that the Linux community has recently thrown in the towel on making naked cgroups trees the nexus for resource management. Basically, there is a realization that no matter how much Pax Groupiana they write, this is simply not a reliable approach. Their current approach is to provide resource-management arbitration as a centralized service (under the covers, of course, still using naked cgroups trees) via systemd, and to make all clients talk to systemd via the D-Bus API to request resources, etc.

It may be interesting to exploit this angle as well and have an implementation that bypasses the native part for cgroups altogether, making the YARN Java implementation talk to systemd (or an equivalent).
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782571#comment-13782571 ]

Roman Shaposhnik commented on YARN-1253:
----------------------------------------

I've started doing some preliminary work on this JIRA, so hopefully I can explain some of the things that my patch is about to address:
1. The reason to use LCE in non-secure mode is to take advantage of the cgroups mechanism. Perhaps the cgroups functionality should be independent from the rest of LCE, but re-using the current LCE design is also quite easy -- hence let's assume that for cgroups we need LCE.
2. In a fully secure deployment, LCE works perfectly and makes YARN users correspond 1-1 with the local UNIX users provisioned on each worker node.
3. In a non-secure deployment this 1-1 correspondence feels like a burden that doesn't necessarily have to be there.

Thus, the proposal is really to add a tiny bit of functionality to LCE whereby, in the non-secure case, it would run all tasks under a single designated user (different from the user running the nodemanager). On top of that, the notion of the YARN user (which no longer has to have a corresponding UNIX user) gets preserved in everything else that LCE does (which really boils down to paths in the local filesystem used for localization).
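In configuration terms, the proposal boils down to something like the following yarn-site.xml fragment. This is a hedged sketch: the property name and default value here are hypothetical illustrations of the idea, not necessarily what the final patch ships.

```xml
<!-- Hedged sketch: a hypothetical property illustrating the proposal of a
     single dedicated container user in non-secure mode. Name and default
     are assumptions, not final. -->
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
  <value>nobody</value>
  <description>The single dedicated UNIX user that all containers run as
    when the cluster is non-secure. YARN user names are still used for
    localization paths, so per-user isolation of localized files is
    preserved even though the process owner is shared.</description>
</property>
```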
[jira] [Created] (YARN-1247) test-container-executor has gotten out of sync with the changes to container-executor
Roman Shaposhnik created YARN-1247:
-----------------------------------
Summary: test-container-executor has gotten out of sync with the changes to container-executor
Key: YARN-1247
URL: https://issues.apache.org/jira/browse/YARN-1247
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Fix For: 2.1.2-beta

If run under the super-user account, test-container-executor.c fails in multiple places. It would be nice to fix it so that we have better test coverage of the LCE functionality.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1247) test-container-executor has gotten out of sync with the changes to container-executor
[ https://issues.apache.org/jira/browse/YARN-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1247:
-----------------------------------
Attachment: 0001-YARN-1247.-test-container-executor-has-gotten-out-of.patch

Attaching a patch that refactors the test code somewhat. There's still tons to do, but at least it no longer fails when run as root, and all the checks pass. Also added a small cgroups-specific check.
[jira] [Updated] (YARN-1247) test-container-executor has gotten out of sync with the changes to container-executor
[ https://issues.apache.org/jira/browse/YARN-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1247:
-----------------------------------
Fix Version/s: (was: 2.1.2-beta)
[jira] [Created] (YARN-1194) TestContainerLogsPage test fails on trunk
Roman Shaposhnik created YARN-1194:
-----------------------------------
Summary: TestContainerLogsPage test fails on trunk
Key: YARN-1194
URL: https://issues.apache.org/jira/browse/YARN-1194
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.3.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Priority: Minor

Running TestContainerLogsPage on trunk with Native IO enabled makes it fail.
[jira] [Updated] (YARN-1194) TestContainerLogsPage test fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-1194:
-----------------------------------
Attachment: YARN-1194.patch.txt
[jira] [Commented] (YARN-1194) TestContainerLogsPage fails with native builds
[ https://issues.apache.org/jira/browse/YARN-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766638#comment-13766638 ]

Roman Shaposhnik commented on YARN-1194:
----------------------------------------

[~jlowe] thanks a lot for the quick review/commit!
[jira] [Assigned] (YARN-1162) NM auxiliary service invocations should be try/catch
[ https://issues.apache.org/jira/browse/YARN-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik reassigned YARN-1162:
--------------------------------------
Assignee: Roman Shaposhnik

NM auxiliary service invocations should be try/catch
----------------------------------------------------
Key: YARN-1162
URL: https://issues.apache.org/jira/browse/YARN-1162
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Roman Shaposhnik
Priority: Critical
Fix For: 2.1.1-beta

The {{AuxiliaryServices#handle()}} method should wrap every invocation of an auxiliary service in a try/catch to isolate failures.
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751401#comment-13751401 ]

Roman Shaposhnik commented on YARN-888:
---------------------------------------

[~tucu00] I'm cleaning up a few build issues (Bigtop 0.8.0 related) and I was wondering whether you think it would be a good idea for me to tackle this one as well. Please assign it to me if you do.

[~tstclair] could you please link the JIRAs you've mentioned? As I said, I'm trying to clean up the build-related stuff for ease of integration in Bigtop 0.8.0.
[jira] [Commented] (YARN-509) ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616415#comment-13616415 ]

Roman Shaposhnik commented on YARN-509:
---------------------------------------

I totally agree that it needs to be investigated. That said, if we have to rush 2.0.4-alpha, I'd say the proposed patch might be a reasonable workaround.

ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
-----------------------------------------------------------------------------------------------
Key: YARN-509
URL: https://issues.apache.org/jira/browse/YARN-509
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.1-alpha
Environment: BigTop Kerberized cluster test environment
Reporter: Konstantin Boudnik
Priority: Blocker
Fix For: 3.0.0, 2.0.4-alpha
Attachments: YARN-509.patch.txt

During the BigTop 0.6.0 release test cycle, [~rvs] ran into the following problem:

{noformat}
2013-03-26 15:37:03,573 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:199)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:322)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:359)
Caused by: org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:162)
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
	... 3 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
	at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:61)
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:199)
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:158)
	... 4 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User yarn/ip-10-46-37-244.ec2.internal@BIGTOP (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB, expected client Kerberos principal is yarn/ip-10-46-37-244.ec2.internal@BIGTOP
	at org.apache.hadoop.ipc.Client.call(Client.java:1235)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
	at $Proxy26.registerNodeManager(Unknown Source)
	at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
	... 6 more
{noformat}

The most significant part is {{User yarn/ip-10-46-37-244.ec2.internal@BIGTOP (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB}}, indicating that ResourceTrackerPB hasn't been annotated with {{@KerberosInfo}} nor {{@TokenInfo}}.
[jira] [Updated] (YARN-509) $var shell substitution in properties are not expanded in hadoop-policy.xml
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-509:
----------------------------------
Summary: $var shell substitution in properties are not expanded in hadoop-policy.xml (was: ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters)
[jira] [Commented] (YARN-509) $var shell substitution in properties are not expanded in hadoop-policy.xml
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616685#comment-13616685 ]

Roman Shaposhnik commented on YARN-509:
---
Guys, I've updated the description of the JIRA to better reflect the latest findings. I'm leaving it as a blocker for now, expecting somebody else to chime in and propose whether we apply the patch I provide or RELNOTE this if there's not enough time to get to the bottom of the issue.

$var shell substitution in properties are not expanded in hadoop-policy.xml
---
Key: YARN-509
URL: https://issues.apache.org/jira/browse/YARN-509
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.1-alpha
Environment: BigTop Kerberized cluster test environment
Reporter: Konstantin Boudnik
Priority: Blocker
Fix For: 3.0.0, 2.0.4-alpha
Attachments: YARN-509.patch.txt

During the BigTop 0.6.0 release test cycle, [~rvs] ran into the following problem:
{noformat}
013-03-26 15:37:03,573 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:199)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:322)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:359)
Caused by: org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:162)
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
	... 3 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
	at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:61)
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:199)
	at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:158)
	... 4 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User yarn/ip-10-46-37-244.ec2.internal@BIGTOP (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB, expected client Kerberos principal is yarn/ip-10-46-37-244.ec2.internal@BIGTOP
	at org.apache.hadoop.ipc.Client.call(Client.java:1235)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
	at $Proxy26.registerNodeManager(Unknown Source)
	at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
	... 6 more
{noformat}
The most significant part is {{User yarn/ip-10-46-37-244.ec2.internal@BIGTOP (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB}}, indicating that ResourceTrackerPB hasn't been annotated with {{@KerberosInfo}} or {{@TokenInfo}}.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
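The missing-annotation diagnosis can be pictured with a small, self-contained sketch. The {{KerberosInfo}} annotation below is a simplified stand-in for Hadoop's {{org.apache.hadoop.security.KerberosInfo}} (the real one lives in the Hadoop security package and is read by the RPC authorization layer), and the interface is a hypothetical mirror of ResourceTrackerPB, not the actual class; the point is only that a protocol interface must carry a runtime-retained annotation naming its expected server principal for the authorization check to succeed.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class ProtocolAnnotationSketch {

    // Simplified stand-in for org.apache.hadoop.security.KerberosInfo;
    // the real annotation names the config key that holds the principal.
    @Retention(RetentionPolicy.RUNTIME)
    @interface KerberosInfo {
        String serverPrincipal();
    }

    // Hypothetical protocol interface mirroring ResourceTrackerPB. Without
    // a KerberosInfo annotation the server side has no expected principal
    // to match the caller against, and authorization fails as in the log.
    @KerberosInfo(serverPrincipal = "yarn.resourcemanager.principal")
    interface ResourceTracker {
        void registerNodeManager();
    }

    // Discovers the annotation via reflection, the same way the
    // server-side authorization machinery would find it at runtime.
    public static String describe() {
        KerberosInfo info = ResourceTracker.class.getAnnotation(KerberosInfo.class);
        return "KerberosInfo present: " + (info != null);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```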
[jira] [Updated] (YARN-509) ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-509:
---
Attachment: YARN-509.patch.txt

So it appears that $substitution in hadoop-policy.xml is broken. I propose that we simply change all the $value entries into '*', which will make the entries in hadoop-policy.xml fully consistent and also seems to fix the problem. Patch attached.
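The proposed change can be pictured as a before/after on one hadoop-policy.xml entry. The property name matches the one quoted elsewhere in this thread; the comments reflect the thread's diagnosis, not a verified description of the config parser's behavior.

```xml
<!-- Before: the $-substitution is apparently not expanded in this
     context, so the ACL ends up matching a literal string and the
     yarn principal is rejected -->
<property>
  <name>security.resourcetracker.protocol.acl</name>
  <value>${HADOOP_YARN_USER}</value>
</property>

<!-- After: wildcard ACL; callers must still pass the protocol's
     Kerberos check, the ACL simply stops restricting by user -->
<property>
  <name>security.resourcetracker.protocol.acl</name>
  <value>*</value>
</property>
```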
[jira] [Updated] (YARN-509) ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-509:
---
Attachment: YARN-509.patch.txt
[jira] [Updated] (YARN-509) ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-509:
---
Attachment: (was: YARN-509.patch.txt)
[jira] [Commented] (YARN-509) ResourceTrackerPB misses KerberosInfo annotation which renders YARN unusable on secure clusters
[ https://issues.apache.org/jira/browse/YARN-509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614710#comment-13614710 ]

Roman Shaposhnik commented on YARN-509:
---
This is from Bigtop testing, so I can make the cluster available for you (I'll need your public ssh key -- please send it to me offline, pref. PGP encoded). Now, to answer your questions:

bq. What is security.resourcetracker.protocol.acl set to in your hadoop-policy.xml?
${HADOOP_YARN_USER}, which according to the process environment translates to yarn

bq. What is yarn.nodemanager.principal in yarn-site.xml?
yarn/_HOST@BIGTOP

bq. RMNMSecurityInfoClass.class and the text file org.apache.hadoop.security.SecurityInfo are on the classpath of ResourceManager?
Yes, they are.

Please let me know if you need any more info or if you'd like to get access to the cluster.
[jira] [Updated] (YARN-253) Container launch may fail if no files were localized
[ https://issues.apache.org/jira/browse/YARN-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated YARN-253:
---
Priority: Blocker (was: Major)

Container launch may fail if no files were localized
---
Key: YARN-253
URL: https://issues.apache.org/jira/browse/YARN-253
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.2-alpha
Reporter: Tom White
Assignee: Tom White
Priority: Blocker
Attachments: YARN-253.patch, YARN-253-test.patch

This can be demonstrated with DistributedShell. The containers running the shell have no files to localize (when there is no shell script to copy), so if they run on a different NM from the AM (which does localize files), they will fail because the appcache directory does not exist.
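The failure mode described above (a per-application directory that only gets created as a side effect of localization) suggests a defensive fix of the following shape. This is an illustrative, self-contained sketch: the helper name and the simplified directory layout are hypothetical, not the actual NodeManager code or the attached patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AppCacheDirSketch {

    // Ensure the per-application cache directory exists before a container
    // launches, even when localization created nothing. The layout below
    // is a simplified stand-in for the NM's local-dir structure.
    public static Path ensureAppCacheDir(Path localDir, String appId) throws IOException {
        Path appCache = localDir.resolve("appcache").resolve(appId);
        // createDirectories is idempotent: a no-op if the path already exists,
        // so calling it unconditionally on every launch is safe.
        return Files.createDirectories(appCache);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("nm-local");
        Path dir = ensureAppCacheDir(tmp, "application_0001_0001");
        System.out.println(Files.isDirectory(dir));
    }
}
```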
[jira] [Commented] (YARN-9) Rename YARN_HOME to HADOOP_YARN_HOME
[ https://issues.apache.org/jira/browse/YARN-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454338#comment-13454338 ]

Roman Shaposhnik commented on YARN-9:
---
FYI: Bigtop will have to adapt to this change, but I'm +/-0 on whether YARN_HOME needs to be supported. I suppose some existing users might have it hardcoded.

Rename YARN_HOME to HADOOP_YARN_HOME
---
Key: YARN-9
URL: https://issues.apache.org/jira/browse/YARN-9
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Vinod Kumar Vavilapalli
Attachments: YARN-9-20120912.txt

We should rename YARN_HOME to HADOOP_YARN_HOME to be consistent with the rest of the Hadoop sub-projects.