[ https://issues.apache.org/jira/browse/HDFS-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065106#comment-14065106 ]
Chris Nauroth commented on HDFS-6699:
-------------------------------------

Thanks for filing this, Remus. At a high level, the approach of moving the privileged operations into a separate elevated service with a much smaller attack surface makes sense to me. I'll catch up on YARN-2198 to get more context on this new service.

> Secure Windows DFS read when client co-located on nodes with data (short-circuit reads)
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-6699
>                 URL: https://issues.apache.org/jira/browse/HDFS-6699
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client, performance, security
>            Reporter: Remus Rusanu
>              Labels: windows
>
> HDFS-347 introduced secure short-circuit HDFS reads based on Unix domain
> sockets on Linux. A similar capability can be introduced in a secure Windows
> environment using the
> [DuplicateHandle](http://msdn.microsoft.com/en-us/library/windows/desktop/ms724251(v=vs.85).aspx)
> Win32 API. When short-circuit is allowed, the datanode would open the block
> file, duplicate the handle into the hdfs client process, and return the
> handle value to that process. The hdfs client can then open a Java stream on
> this handle and read the file. This is a secure mechanism: the HDFS ACLs are
> validated by the namenode, and the process gets access to the file only in a
> controlled manner (e.g. read-only). The hdfs client process does not need to
> have OS-level access privilege to the block file.
> A complication arises from the requirement to duplicate the handle into the
> hdfs client process. Ordinary processes (which is how we want the datanode to
> run) do not have the required privilege (SeDebugPrivilege). But with the
> introduction of an elevated service helper for the nodemanager Windows Secure
> Container Executor (YARN-2198), we have at our disposal an elevated executor
> that can do the job of duplicating the handle.
> The datanode would communicate with this process using the same mechanism as
> the nodemanager, i.e. LRPC.
> With my proposed implementation the sequence of actions is as follows:
>  - the hdfs client requests a Windows secure short-circuit read of a block in
> the data transfer protocol. It passes the block, the token and its own
> process ID.
>  - the datanode approves the short-circuit. It opens the block file and
> obtains the handle.
>  - the datanode invokes the elevated privilege service to duplicate the
> handle into the hdfs client process. The datanode invokes the service's LRPC
> interface over JNI (LRPC being the de facto Windows standard for
> interoperating with a service). It passes the handle value, its own process
> ID and the hdfs client process ID.
>  - the elevated service duplicates the handle from the datanode process into
> the hdfs client process. It returns the duplicate handle value to the
> datanode as the output value of the LRPC call.
>  - the same steps are repeated for the CRC file.
>  - the datanode responds to the short-circuit data transfer protocol request
> with a message that contains the duplicate handle value (two handles
> actually, the second one for the CRC file).
>  - the hdfs client creates a Java stream that wraps the handle and reads the
> block from this stream (ditto for the CRC file).
> The datanode needs to take care not to duplicate the same handle to
> different clients (including the CRC handles), because a handle also
> abstracts the file position, and clients would inadvertently move each
> other's file pointer, with chaotic results.
> TBD: a mitigation for process ID reuse (the hdfs client can be terminated
> immediately after the block request, and a new process could reuse the same
> ID). In theory an attacker could use this as a mechanism to obtain a handle
> to a block, by killing the hdfs client at the right moment and spawning new
> processes until it gets one with the desired ID.
> I'm not sure this is a realistic threat, because the attacker must already
> have the privilege to kill the hdfs client process, and with such a privilege
> he could obtain the handle by other means (e.g. debug/inspect the hdfs
> client process).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
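
Editorial illustration of the handle-sharing caveat in the quoted description. The Win32 documentation for DuplicateHandle states that a duplicated file handle refers to the same object as the original, so the file position is shared between them. The sketch below is only a toy model of that behavior in plain Python; it makes no Win32 or LRPC calls, and the names `FileObject`, `Process`, `ElevatedService` and `serve_short_circuit` are hypothetical, not taken from HDFS-6699 or YARN-2198. It assumes the datanode opens the block and CRC files afresh for every request, which is one way to satisfy the "do not duplicate the same handle to different clients" requirement.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FileObject:
    """Stands in for a kernel file object: the data plus ONE read position."""
    data: bytes
    pos: int = 0


@dataclass
class Process:
    """Toy process with its own handle table (handle value -> file object)."""
    pid: int
    handles: dict = field(default_factory=dict)
    _next: count = field(default_factory=lambda: count(4, 4))

    def open_file(self, data: bytes) -> int:
        # A fresh open creates a new file object with its own position.
        handle = next(self._next)
        self.handles[handle] = FileObject(data)
        return handle

    def read(self, handle: int, n: int) -> bytes:
        f = self.handles[handle]
        chunk = f.data[f.pos:f.pos + n]
        f.pos += len(chunk)
        return chunk


class ElevatedService:
    """Hypothetical stand-in for the elevated helper the datanode would call."""

    def __init__(self, processes):
        self.processes = {p.pid: p for p in processes}

    def duplicate_handle(self, src_pid: int, handle: int, dst_pid: int) -> int:
        # The new handle in the target process points at the SAME file object,
        # so both handles share one file position.
        obj = self.processes[src_pid].handles[handle]
        dst = self.processes[dst_pid]
        new_handle = next(dst._next)
        dst.handles[new_handle] = obj
        return new_handle


def serve_short_circuit(datanode, service, client_pid, block, crc):
    """Open block and CRC afresh per request, then duplicate into the client."""
    block_h = datanode.open_file(block)
    crc_h = datanode.open_file(crc)
    return (service.duplicate_handle(datanode.pid, block_h, client_pid),
            service.duplicate_handle(datanode.pid, crc_h, client_pid))


datanode, client_a, client_b = Process(100), Process(200), Process(300)
service = ElevatedService([datanode, client_a, client_b])
BLOCK, CRC = b"0123456789", b"abcd"

# Fresh open per client: each client reads from the start of the block.
a_block, _ = serve_short_circuit(datanode, service, client_a.pid, BLOCK, CRC)
b_block, _ = serve_short_circuit(datanode, service, client_b.pid, BLOCK, CRC)
print(client_a.read(a_block, 4), client_b.read(b_block, 4))  # b'0123' b'0123'

# Same source handle duplicated twice: the clients disturb each other.
shared = datanode.open_file(BLOCK)
a_bad = service.duplicate_handle(datanode.pid, shared, client_a.pid)
b_bad = service.duplicate_handle(datanode.pid, shared, client_b.pid)
print(client_a.read(a_bad, 4), client_b.read(b_bad, 4))  # b'0123' b'4567'
```

In the second case client B starts reading at offset 4 because client A already advanced the shared position, which is the failure mode the description warns about.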