[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155127#comment-14155127 ]
Vinod Kumar Vavilapalli commented on YARN-1972:
-----------------------------------------------

bq. Remus Rusanu Vinod Kumar Vavilapalli, as on YARN-1063, we can go ahead and address these comments as part of the YARN-2198 effort, it's not necessary to resolve these before these patches are committed.

+1 for tracking the remaining issues at YARN-1063.

This looks good, checking this in.

> Implement secure Windows Container Executor
> -------------------------------------------
>
> Key: YARN-1972
> URL: https://issues.apache.org/jira/browse/YARN-1972
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Remus Rusanu
> Assignee: Remus Rusanu
> Labels: security, windows
> Attachments: YARN-1972.1.patch, YARN-1972.2.patch, YARN-1972.3.patch,
> YARN-1972.delta.4.patch, YARN-1972.delta.5.patch, YARN-1972.trunk.4.patch,
> YARN-1972.trunk.5.patch
>
>
> h1. Windows Secure Container Executor (WCE)
> YARN-1063 adds the necessary infrastructure to launch a process as a domain
> user as a solution for the problem of having a security boundary between
> processes executed in YARN containers and the Hadoop services. The WCE is a
> container executor that leverages the winutils capabilities introduced in
> YARN-1063 and launches containers as an OS process running as the job
> submitter user. A description of the S4U infrastructure used by YARN-1063
> and of the alternatives considered can be read on that JIRA.
> The WCE is based on the DefaultContainerExecutor (DCE). It relies on the DCE
> to drive the flow of execution, but it overrides some methods to the effect
> of:
> * changes the DCE-created user cache directories to be owned by the job user
> and by the nodemanager group.
> * changes the actual container run command to use the 'createAsUser' command
> of winutils task instead of 'create'.
> * runs the localization as a standalone process instead of an in-process Java
> method call. This in turn relies on the winutils createAsUser feature to run
> the localization as the job user.
>
> When compared to LinuxContainerExecutor (LCE), the WCE has some minor
> differences:
> * it does not delegate the creation of the user cache directories to the
> native implementation.
> * it does not require special handling to be able to delete user files.
> The WCE design came from practical trial and error. I had to iron out some
> issues around the Windows script shell limitations (command line length) to
> get it to work, the biggest issue being the huge CLASSPATH that is
> commonplace in Hadoop environment container executions. The job container
> itself is already dealing with this via a so-called 'classpath jar', see
> HADOOP-8899 and YARN-316 for details. For the WCE localizer launch as a
> separate container the same issue had to be resolved and I used the same
> 'classpath jar' approach.
> h2. Deployment Requirements
> To use the WCE one needs to set
> `yarn.nodemanager.container-executor.class` to
> `org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor`
> and set `yarn.nodemanager.windows-secure-container-executor.group` to a
> Windows security group name that the nodemanager service principal is a
> member of (equivalent of LCE
> `yarn.nodemanager.linux-container-executor.group`). Unlike the LCE, the WCE
> does not require any configuration outside of Hadoop's own yarn-site.xml.
> For WCE to work the nodemanager must run as a service principal that is a
> member of the local Administrators group or as LocalSystem. This derives
> from the need to invoke the LoadUserProfile API, whose specification
> mentions these requirements. This is in addition to the SE_TCB privilege
> mentioned in YARN-1063, but this requirement will automatically imply that
> the SE_TCB privilege is held by the nodemanager. For the Linux speakers in
> the audience, the requirement is basically to run NM as root.
> h2. Dedicated high privilege Service
> Due to the high privilege required by the WCE we had discussed the need to
> isolate the high privilege operations into a separate process, an 'executor'
> service that is solely responsible for starting the containers (including
> the localizer). The NM would have to authenticate, authorize and communicate
> with this service via an IPC mechanism and use this service to launch the
> containers. I still believe we'll end up deploying such a service, but the
> effort to onboard such a new platform-specific service on the project is
> not trivial.
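
For reference, the two settings named under Deployment Requirements in the description above might be written in yarn-site.xml roughly as follows. This is only a sketch: the property names and the executor class come from the description, while the group value `hadoop-nm` is a hypothetical placeholder for whatever Windows security group the nodemanager service principal belongs to.

```xml
<!-- yarn-site.xml fragment (sketch). The group value "hadoop-nm" is a
     hypothetical placeholder, not a name taken from the JIRA. -->
<configuration>
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.windows-secure-container-executor.group</name>
    <value>hadoop-nm</value>
  </property>
</configuration>
```

Per the description, no configuration outside yarn-site.xml should be needed, but the nodemanager still has to run as LocalSystem or as a member of the local Administrators group for these settings to take effect.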