[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451558#comment-15451558
 ] 

Lili Ma commented on HAWQ-256:
------------------------------

[~thebellhead], quite good questions!

1. In order for tools, syntax checking, etc to work everyone (the HAWQ public 
role) requires access to the catalog and some of the toolkit. Will Ranger-only 
access control apply only to user created tables, views and external tables?
Yes. Since the catalog tables and the toolkit are shared by all users, 
Ranger-only access control applies just to user-defined objects. Those objects 
include not only databases, tables and views, but also functions, languages, 
schemas, tablespaces and protocols. You can find the detailed list of objects 
and privileges in the design doc.
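
A minimal sketch of the scoping rule described above. The names 
(RANGER_MANAGED_TYPES, SHARED_SCHEMAS, is_catalog_object, uses_ranger) are 
hypothetical, and the assumption that shared objects can be recognized by 
their schema (pg_catalog, information_schema, hawq_toolkit) is mine; none of 
this is taken from the HAWQ code or the design doc.

```python
# Object types this comment lists as covered by Ranger-only access control.
RANGER_MANAGED_TYPES = {
    "database", "table", "view", "function",
    "language", "schema", "tablespace", "protocol",
}

# Assumption: shared catalog/toolkit objects live in these schemas.
SHARED_SCHEMAS = {"pg_catalog", "information_schema", "hawq_toolkit"}


def is_catalog_object(schema_name):
    """True if the object belongs to a shared catalog/toolkit schema."""
    return schema_name in SHARED_SCHEMAS


def uses_ranger(object_type, schema_name=None):
    """Decide whether an access check should be routed to Ranger.

    Shared catalog/toolkit objects are skipped; user-defined objects of a
    managed type are checked in Ranger.
    """
    if schema_name is not None and is_catalog_object(schema_name):
        return False
    return object_type in RANGER_MANAGED_TYPES
```

With this sketch, uses_ranger("table", "public") is True while 
uses_ranger("table", "pg_catalog") is False, so tools that read the catalog 
keep working for the public role.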

2. If so - will gpadmin and any other HAWQ-defined roles not have access to the 
data in Ranger managed tables?
As you mentioned, HAWQ uses the gpadmin identity to create files on HDFS. For 
example, when userA creates a table in HAWQ, the HDFS files for the table are 
created by gpadmin instead of userA. Since Ranger lives in the Hadoop 
ecosystem, it usually needs to control both HAWQ and HDFS, so I think we need 
to grant gpadmin full privileges on the HAWQ data file directory on HDFS in 
the Ranger UI beforehand. 

As for your concern that the superuser can see all users' data, I think it's 
similar to the "root" role in an operating system?  If users are concerned 
about the DBA/superuser's unlimited access, I totally agree with you that 
"passing down the user identity" is the way to solve this problem :)

3. How would this be extended for the hcatalog virtual database in HAWQ? Could 
the Ranger permissions for the underlying store (for instance Hive) be read and 
enforced/reported at the HAWQ level?
If HAWQ keeps using gpadmin to operate on HDFS or external storage, I think we 
just need to grant the privilege to the superuser. But if we implement passing 
down the user identity, so that the data files on HDFS for a table created by 
userA are owned by userA instead of gpadmin, then we need to connect to Ranger 
twice, from HAWQ and from HDFS respectively.  I haven't included checking the 
underlying store's privileges on the HAWQ side; that may need multiple code 
changes. I think keeping the privileges in each component is another choice. 
Your thoughts?
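
To make the two options concrete, here is a rough sketch of which identity 
each component would present to Ranger in each mode. The function 
required_checks and its return format are hypothetical, written only to 
illustrate the paragraph above; it is not how HAWQ is implemented.

```python
SUPERUSER = "gpadmin"


def required_checks(user, identity_passdown):
    """Return (component, identity) pairs that Ranger would need to authorize.

    identity_passdown=False: HAWQ checks the end user's privilege on the
    object, but HDFS only ever sees gpadmin, so gpadmin needs the HDFS grant.
    identity_passdown=True: the end user's identity reaches HDFS too, so
    Ranger is consulted twice for the same user.
    """
    hdfs_identity = user if identity_passdown else SUPERUSER
    return [("hawq", user), ("hdfs", hdfs_identity)]
```

For userA, the first mode gives [("hawq", "userA"), ("hdfs", "gpadmin")] and 
the second gives [("hawq", "userA"), ("hdfs", "userA")].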

Thanks
Lili


> Integrate Security with Apache Ranger
> -------------------------------------
>
>                 Key: HAWQ-256
>                 URL: https://issues.apache.org/jira/browse/HAWQ-256
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: PXF, Security
>            Reporter: Michael Andre Pearce (IG)
>            Assignee: Lili Ma
>             Fix For: backlog
>
>         Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
