[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wang updated HDFS-4680:
------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0
           Status: Resolved  (was: Patch Available)

Thanks ATM (and everyone who's looked at this), committed to trunk and branch-2.

> Audit logging of delegation tokens for MR tracing
> -------------------------------------------------
>
>                 Key: HDFS-4680
>                 URL: https://issues.apache.org/jira/browse/HDFS-4680
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, security
>    Affects Versions: 2.0.3-alpha
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>             Fix For: 2.3.0
>
>         Attachments: hdfs-4680-1.patch, hdfs-4680-2.patch, hdfs-4680-3.patch,
>                      hdfs-4680-4.patch, hdfs-4680-5.patch
>
>
> HDFS audit logging tracks HDFS operations made by different users, e.g.
> creation and deletion of files. This is useful for after-the-fact root cause
> analysis and security. However, logging only the username is insufficient
> for many use cases. For instance, it is common for a single user to run
> multiple MapReduce jobs (I believe this is the case with Hive). In this
> scenario, given a particular audit log entry, it is difficult to trace it
> back to the MR job or task that generated that entry.
>
> I see a number of potential options for implementing this.
>
> 1. Make an optional "client name" field part of the NN RPC format. We already
>    pass a {{clientName}} as a parameter in many RPC calls, so this would
>    essentially standardize it. MR tasks could then set this field to the
>    job and task ID.
> 2. This could be generalized to a set of optional key-value *tags* in the NN
>    RPC format, which would then be audit logged. This has standalone benefits
>    beyond just recording MR task IDs.
> 3. Neither of the above two options securely verifies that MR clients
>    are who they claim to be. Doing this securely requires the JobTracker to
>    sign MR task attempts, and the NN to then verify this signature.
>    However, this is substantially more work, and could be built on top of
>    idea #2 afterwards.
>
> Thoughts welcomed.
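
Options 2 and 3 in the description can be sketched roughly as follows. This is only an illustration, not the committed patch and not Hadoop code: the function names, the shared secret, the attempt ID, and the audit line layout are all hypothetical, and HMAC-SHA256 is an assumed stand-in for whatever signing scheme the JobTracker and NN would actually agree on.

```python
import hashlib
import hmac

# Hypothetical secret shared by the JobTracker and the NameNode (assumption).
SECRET = b"example-shared-secret"


def sign_task_attempt(attempt_id, secret=SECRET):
    """JobTracker side: return a hex HMAC-SHA256 signature of a task attempt ID."""
    return hmac.new(secret, attempt_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_task_attempt(attempt_id, signature, secret=SECRET):
    """NameNode side: recompute the signature and compare in constant time."""
    expected = sign_task_attempt(attempt_id, secret)
    return hmac.compare_digest(expected, signature)


def format_audit_entry(user, cmd, src, tags):
    """Render an illustrative audit line with key-value tags (option 2).

    The layout is made up for this sketch; it is not the real HDFS audit format.
    Tags are sorted by key so the output is deterministic.
    """
    tag_text = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    return f"ugi={user}\tcmd={cmd}\tsrc={src}\ttags={tag_text}"


if __name__ == "__main__":
    attempt = "attempt_0001_m_000000_0"  # hypothetical task attempt ID
    signature = sign_task_attempt(attempt)
    # Only log the task tag if the signature checks out (option 3).
    if verify_task_attempt(attempt, signature):
        print(format_audit_entry("alice", "create", "/tmp/out", {"task": attempt}))
```

In this sketch the tags alone give traceability back to a job or task, while the signature check is what would stop a client from claiming someone else's task ID.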