[ https://issues.apache.org/jira/browse/YARN-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147946#comment-14147946 ]
Remus Rusanu commented on YARN-2198: ------------------------------------ Core test failure is: {code} Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 120.538 sec <<< FAILURE! - in org.apache.hadoop.crypto.random.TestOsSecureRandom testOsSecureRandomSetConf(org.apache.hadoop.crypto.random.TestOsSecureRandom) Time elapsed: 120.011 sec <<< ERROR! java.lang.Exception: test timed out after 120000 milliseconds at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:220) at java.io.BufferedInputStream.read1(BufferedInputStream.java:256) at java.io.BufferedInputStream.read(BufferedInputStream.java:317) at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) at java.io.InputStreamReader.read(InputStreamReader.java:167) at java.io.BufferedReader.fill(BufferedReader.java:136) at java.io.BufferedReader.read1(BufferedReader.java:187) at java.io.BufferedReader.read(BufferedReader.java:261) at org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:727) at org.apache.hadoop.util.Shell.runCommand(Shell.java:524) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:714) at org.apache.hadoop.crypto.random.TestOsSecureRandom.testOsSecureRandomSetConf(TestOsSecureRandom.java:149) {code} > Remove the need to run NodeManager as privileged account for Windows Secure > Container Executor > ---------------------------------------------------------------------------------------------- > > Key: YARN-2198 > URL: https://issues.apache.org/jira/browse/YARN-2198 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Remus Rusanu > Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-2198.1.patch, YARN-2198.2.patch, YARN-2198.3.patch, > YARN-2198.delta.4.patch, YARN-2198.delta.5.patch, YARN-2198.delta.6.patch, > YARN-2198.delta.7.patch, YARN-2198.separation.patch, > YARN-2198.trunk.10.patch, YARN-2198.trunk.4.patch, YARN-2198.trunk.5.patch, > YARN-2198.trunk.6.patch, YARN-2198.trunk.8.patch, YARN-2198.trunk.9.patch > > > YARN-1972 introduces a Secure Windows Container Executor. However this > executor requires a the process launching the container to be LocalSystem or > a member of the a local Administrators group. Since the process in question > is the NodeManager, the requirement translates to the entire NM to run as a > privileged account, a very large surface area to review and protect. > This proposal is to move the privileged operations into a dedicated NT > service. The NM can run as a low privilege account and communicate with the > privileged NT service when it needs to launch a container. This would reduce > the surface exposed to the high privileges. > There has to exist a secure, authenticated and authorized channel of > communication between the NM and the privileged NT service. Possible > alternatives are a new TCP endpoint, Java RPC etc. My proposal though would > be to use Windows LPC (Local Procedure Calls), which is a Windows platform > specific inter-process communication channel that satisfies all requirements > and is easy to deploy. The privileged NT service would register and listen on > an LPC port (NtCreatePort, NtListenPort). The NM would use JNI to interop > with libwinutils which would host the LPC client code. The client would > connect to the LPC port (NtConnectPort) and send a message requesting a > container launch (NtRequestWaitReplyPort). LPC provides authentication and > the privileged NT service can use authorization API (AuthZ) to validate the > caller. -- This message was sent by Atlassian JIRA (v6.3.4#6332)