[jira] Commented: (HIVE-78) Authorization infrastructure for Hive

Carl Steinbach (JIRA) Thu, 21 Oct 2010 18:44:41 -0700

    [ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923733#action_12923733 ]


Carl Steinbach commented on HIVE-78:
------------------------------------

The issue that Todd raised is pretty important and needs to be addressed in
the proposal. My personal opinion is that running all queries as a "hive"
super-user is the most practical approach and will also yield behavior that
is familiar to users of traditional RDBMS systems (who I expect will
increasingly define the average Hive user/administrator).

There are some other follow-on issues that need to be decided if we end up
settling on this approach:

* This approach to authorization presupposes that users are accessing Hive 
through a HiveServer process. This follows from the fact that A) you want Hive 
to execute the query plans as the Hive superuser, and B) users can 
circumvent the authorization model if they are given direct access to the 
MetaStore DB. It would be nice if the proposal explicitly stated this 
requirement and mentioned some of the follow-on work that it necessitates, 
e.g. fixing concurrency issues in HiveServer, reducing the memory requirements 
of HiveServer, etc.

* We need to apply the authorization model to the {{add [archive|file|jar]}} 
commands as well as {{create temporary function}}. {{add jar}} and {{add file}} 
both currently allow the user to inject code into MR jobs, and {{add jar}} in 
conjunction with {{create temporary function}} allows the user to inject and 
execute arbitrary code within the HiveServer process. We may also want to add a 
new {{add executable}} command for adding executable scripts that has a 
different permission model than {{add file}}.

* I think there also may be security issues stemming from external tables, e.g. 
if I create an external table that points to another user's home directory and 
then run a query on it which executes with Hive's superuser permissions.

* Loading data into the Hive warehouse from an arbitrary HDFS location and 
exporting data to other locations in HDFS are two issues that need to be 
considered. In each case I think the correct behavior depends on both the Hive 
process's permissions and those of the user.
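
To make the last two points concrete, here is a rough sketch of one possible rule: a load or export is allowed only if both the requesting user and the Hive superuser may access the path. This rule is only one reading of "depends on both", not something the proposal specifies. Everything in the sketch is hypothetical: the {{Acl}} class, the {{mayLoad}}/{{mayExport}} helpers, the user names, and the paths. It uses an in-memory map with exact path matches rather than real HDFS permission checks.

```java
import java.util.Map;
import java.util.Set;

public class DualPermissionCheck {
    public enum Action { READ, WRITE }

    /** Hypothetical in-memory ACL: principal -> exact path -> permitted actions. */
    public static final class Acl {
        private final Map<String, Map<String, Set<Action>>> grants;

        public Acl(Map<String, Map<String, Set<Action>>> grants) {
            this.grants = grants;
        }

        /** True only if the principal has an explicit grant for this exact path. */
        public boolean allows(String principal, String path, Action action) {
            return grants.getOrDefault(principal, Map.of())
                         .getOrDefault(path, Set.of())
                         .contains(action);
        }
    }

    /** Assumed rule: loading from a source needs READ for the user AND for Hive. */
    public static boolean mayLoad(Acl acl, String user, String hiveUser, String source) {
        return acl.allows(user, source, Action.READ)
            && acl.allows(hiveUser, source, Action.READ);
    }

    /** Assumed rule: exporting to a target needs WRITE for the user AND for Hive. */
    public static boolean mayExport(Acl acl, String user, String hiveUser, String target) {
        return acl.allows(user, target, Action.WRITE)
            && acl.allows(hiveUser, target, Action.WRITE);
    }

    /** Made-up grants: the "hive" superuser can reach both paths, alice only her own. */
    public static Acl demoAcl() {
        return new Acl(Map.of(
            "hive", Map.of(
                "/user/bob/data", Set.of(Action.READ, Action.WRITE),
                "/user/alice/staging", Set.of(Action.READ, Action.WRITE)),
            "alice", Map.of(
                "/user/alice/staging", Set.of(Action.READ, Action.WRITE))));
    }

    public static void main(String[] args) {
        Acl acl = demoAcl();
        // alice loading from her own staging directory: both principals may read it.
        System.out.println(mayLoad(acl, "alice", "hive", "/user/alice/staging")); // true
        // alice pointing at bob's directory: Hive could read it, but alice cannot.
        System.out.println(mayLoad(acl, "alice", "hive", "/user/bob/data"));      // false
    }
}
```

The second call mirrors the external table concern: if only the superuser's permissions were checked, alice could read bob's data through Hive. Whether requiring both grants is the right policy is exactly the open question.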




> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor, Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: He Yongqiang
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, 
> hive-78-syntax-v1.patch, hive-78.diff
>
>
> Allow Hive to integrate with existing user repositories for authentication 
> and authorization information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
