[ https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451558#comment-15451558 ]
Lili Ma commented on HAWQ-256:
------------------------------

[~thebellhead], quite good questions!

1. In order for tools, syntax checking, etc to work everyone (the HAWQ public role) requires access to the catalog and some of the toolkit. Will Ranger-only access control apply only to user created tables, views and external tables?

Yes. Since the catalog tables and toolkit are shared and used by various users, Ranger-only access control applies only to user-defined objects. Those objects include not only databases, tables and views, but also functions, languages, schemas, tablespaces and protocols. You can find the detailed objects and privileges in the design doc.

2. If so - will gpadmin and any other HAWQ-defined roles not have access to the data in Ranger managed tables?

As you mentioned, HAWQ uses the gpadmin identity to create files on HDFS. For example, when userA creates a table in HAWQ, the HDFS files for that table are created by gpadmin instead of userA. Since Ranger is part of the Hadoop ecosystem, it usually needs to control both HAWQ and HDFS, so I think we need to grant gpadmin full privileges on the HAWQ data file directory on HDFS in the Ranger UI beforehand.

As for your concern that the superuser can see all users' data, I think it is similar to the "root" role in an operating system. If users are concerned about the DBA/superuser's unlimited access, I totally agree with your proposed solution of "passing down user identity" to solve this problem :)

3. How would this be extended for the hcatalog virtual database in HAWQ? Could the Ranger permissions for the underlying store (for instance Hive) be read and enforced/reported at the HAWQ level?

If HAWQ keeps using gpadmin to operate on HDFS or external storage, I think we just need to grant the privilege to the superuser.
But if we implement user-identity pass-down, so that the HDFS data files for a table created by userA are owned by userA instead of gpadmin, then we need to connect to Ranger twice, from HAWQ and from HDFS respectively. I haven't included checks of the underlying store's privileges on the HAWQ side; that may need multiple code changes. I think keeping the privileges in each component is another option. Your thoughts?

Thanks
Lili

> Integrate Security with Apache Ranger
> -------------------------------------
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: PXF, Security
> Reporter: Michael Andre Pearce (IG)
> Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)