[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923733#action_12923733 ]
Carl Steinbach commented on HIVE-78:
------------------------------------

The issue that Todd raised is pretty important and needs to be addressed in the proposal. My personal opinion is that running all queries as a "hive" superuser is the most practical approach. It will also yield behavior that is familiar to users of traditional RDBMSs (who I expect will increasingly define the average Hive user/administrator).

There are some other follow-on issues that need to be decided if we end up settling on this approach:

* This approach to authorization presupposes that users are accessing Hive through a HiveServer process. This follows from the fact that A) you want Hive to execute the query plans as the Hive superuser, and B) a user can circumvent the authorization model if they are given direct access to the MetaStore DB. It would be nice if the proposal explicitly stated this requirement and mentioned some of the follow-on work that it necessitates, e.g. fixing concurrency issues in HiveServer, reducing the memory requirements of HiveServer, etc.
* We need to apply the authorization model to the {{add [archive|file|jar]}} commands as well as {{create temporary function}}. {{add jar}} and {{add file}} both currently allow the user to inject code into MR jobs, and {{add jar}} in conjunction with {{create temporary function}} allows the user to inject and execute arbitrary code within the HiveServer process. We may also want to add a new {{add executable}} command for adding executable scripts that has a different permission model than {{add file}}.
* I think there may also be security issues stemming from external tables, e.g. if I create an external table that points to another user's home directory and then run a query on it, the query executes with Hive's superuser permissions.
* Loading data into the Hive warehouse from an arbitrary HDFS location and exporting data to other locations in HDFS are two issues that need to be considered.
  In each case I think the correct behavior depends on both the Hive process's permissions and those of the user.

> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor, Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: He Yongqiang
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.