[ 
https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046853#comment-14046853
 ] 

Remus Rusanu commented on YARN-2198:
------------------------------------

I got this working using LPC, but there are some complications vis-a-vis 
stdout/stderr. With a helper service the nodemanager no longer gets a free 
lunch of accessing the task stdout/stderr. Solutions exist:
 - read stdout/stderr from the helper and pump them over the LPC interface back 
to NM
 - explicitly set a .out and .err file for the task and use them as 
stdout/stderr for the container launch.
Note that the problem applies to localizer launch too, which does not have a 
stdout/stderr redirect in the launch script.
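
As a rough illustration of the two options above (this is not the winutils or 
NM code), here is a minimal Python sketch using the standard subprocess module. 
The function names and the `forward` callback, which stands in for the LPC 
channel back to the NM, are hypothetical.

```python
import subprocess
import sys


def launch_with_files(cmd, out_path, err_path):
    """Option 2: give the task explicit .out/.err files as stdout/stderr."""
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        # The child writes straight to the files; the launcher never
        # touches the streams and only sees the exit code.
        return subprocess.run(cmd, stdout=out, stderr=err).returncode


def launch_and_pump(cmd, forward):
    """Option 1: the helper reads the streams and pumps them to the caller.

    `forward(stream_name, data)` is a hypothetical stand-in for sending
    the data back over the IPC channel.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes concurrently, avoiding the deadlock
    # that reading them one after the other can cause. It buffers until
    # the child exits; a real pump would forward data incrementally.
    out, err = proc.communicate()
    forward("stdout", out)
    forward("stderr", err)
    return proc.returncode
```

The file-based variant keeps the launcher out of the data path entirely, which 
is why it is the simpler of the two to carry across a process boundary.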

Another complication is the Windows job model of NM/winutils. winutils creates a 
job for the container and joins the job itself, ensuring a controlled lifespan 
for all the processes launched by the task. The service helper cannot join the 
job as it has its own, independent lifespan. I solved this problem by having the 
helper service launch "winutils task createAsUser ..." as an ordinary 
CreateProcess in the LPC server routine, rather than attempt to do the S4U 
impersonation in the helper service process itself. This works fine, and also 
greatly reduces the risks associated with leaking handles as the heavy work 
(=leak risk) occurs in a sub-process, not in the service.
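
The leak-containment argument can be sketched in a platform-neutral way. The 
Python snippet below is only an analogy, assuming a hypothetical worker script 
in place of winutils: the long-lived service runs the handle-heavy work in a 
child process, so anything the work leaks is released when the child exits.

```python
import subprocess
import sys

# Hypothetical worker: opens files and never closes them, standing in for
# the handle-heavy launch work. Its leaks end when the worker exits.
WORKER = """
import sys, tempfile
leaked = [tempfile.TemporaryFile() for _ in range(int(sys.argv[1]))]
print(len(leaked))
"""


def handle_request(handle_count):
    """Serve one request by running the heavy work in a child process."""
    done = subprocess.run(
        [sys.executable, "-c", WORKER, str(handle_count)],
        capture_output=True, text=True, check=True,
    )
    # Only the child's short textual result comes back to the service.
    return int(done.stdout.strip())
```

However many requests the service handles, its own set of open handles stays 
flat, because the files are opened and abandoned in the child only.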

I will have to investigate if there is any known issue vis-a-vis a very long 
LPC call (winutils waits for the spawned processes to finish). If there is, the 
solution would be for the helper service to hand over the spawned task to the 
NM (duplicate the process handle into the NM, yuck) and have the NM JNI (the LPC 
client) do the actual process handle wait (i.e. blocking wait for task to 
finish). This would make the LPC call short (spawn process, duplicate handle, 
return handle to NM) at the risk of some induced complications. Also this would 
make the whole stdout/stderr transfer even more cumbersome if we opt for pipes 
vs. .out/.err files (open by helper process, duplicate it in NM, have the NM 
read the handles...)
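
The difference between the long call and the hand-over variant can be shown 
with a small Python analogy. The function names are hypothetical, and a Popen 
object lives in one process, so this skips the cross-process handle 
duplication that the real design would need.

```python
import subprocess
import sys


def launch_blocking(cmd):
    """Long call: returns only once the task has finished."""
    return subprocess.run(cmd).returncode


def launch_and_hand_over(cmd):
    """Short call: spawn the task and return immediately.

    The returned Popen object plays the role of the duplicated process
    handle; the caller owns the wait.
    """
    return subprocess.Popen(cmd)


# Caller side (the NM's role in the comment above): do the blocking wait.
task = launch_and_hand_over([sys.executable, "-c", "raise SystemExit(7)"])
exit_code = task.wait()
```

In the hand-over variant the call itself is short, at the cost of the caller 
having to hold and eventually wait on the handle.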

> Remove the need to run NodeManager as privileged account for Windows Secure 
> Container Executor
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-2198
>                 URL: https://issues.apache.org/jira/browse/YARN-2198
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>              Labels: security, windows
>
> YARN-1972 introduces a Secure Windows Container Executor. However this 
> executor requires the process launching the container to be LocalSystem or 
> a member of the local Administrators group. Since the process in question 
> is the NodeManager, the requirement translates to the entire NM running as a 
> privileged account, a very large surface area to review and protect.
> This proposal is to move the privileged operations into a dedicated NT 
> service. The NM can run as a low privilege account and communicate with the 
> privileged NT service when it needs to launch a container. This would reduce 
> the surface exposed to the high privileges. 
> There has to exist a secure, authenticated and authorized channel of 
> communication between the NM and the privileged NT service. Possible 
> alternatives are a new TCP endpoint, Java RPC etc. My proposal though would 
> be to use Windows LPC (Local Procedure Calls), which is a Windows platform 
> specific inter-process communication channel that satisfies all requirements 
> and is easy to deploy. The privileged NT service would register and listen on 
> an LPC port (NtCreatePort, NtListenPort). The NM would use JNI to interop 
> with libwinutils which would host the LPC client code. The client would 
> connect to the LPC port (NtConnectPort) and send a message requesting a 
> container launch (NtRequestWaitReplyPort). LPC provides authentication and 
> the privileged NT service can use authorization API (AuthZ) to validate the 
> caller.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
