Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(9 comments)

Thanks Quanlong for adding the support for row-filtering in Impala! Sorry that 
it took me quite a while to review this patch. I briefly reviewed the added 
end-to-end tests and they look good to me. I only have some very minor comments.

I plan to read/study the parts related to audit logs generation and will try to 
get back to you soon the coming week.

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS1, Line 56:    * Stash and deduplicate the audit events produced by table 
masking (Column-masking /
> Thanks Quanlong! I will review this and try to get back to you later today.
Thanks Quanlong! Maybe we could also add here a statement like "Refer to 
IMPALA-9597 for further details" so that people interested in this method could 
refer to there. I do not have any other suggestion. :-)


http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java:

http://gerrit.cloudera.org:8080/#/c/16976/3/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java@56
PS3, Line 56:   * Stash and deduplicate the audit events produced by table 
masking (Column-masking /
            :    * Row-filtering) which are performed during the analyze phase. 
Called at the end of
            :    * analyzing. These stashed events will be added back after the 
query pass the
            :    * authorization phase. Note that normal events (select, 
insert, drop, etc.) are
            :    * produced in the authorization phase. Stashing table masking 
events avoids exposing
            :    * them when the query fails authorization.
Thanks Quanlong!

Maybe we could also add here a statement like "Refer to IMPALA-9597 for further 
details" so that people interested in this method could refer to there as well. 
I do not have any other suggestion. :-)


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@16
PS3, Line 16: # Column-masking policies of functional.alltypes mask "id" to 
"-id" and redact column
            : # "date_string_col". Row-filtering policy of 
functional.alltypes_view keeps rows with
            : # "id >= -8 and date_string_col = 'nn/nn/nn'"
Thanks Quanlong for adding this test case!

Maybe it would be even clearer to emphasize here that the view 'alltypes_view' 
is based on the table 'alltypes' and thus the column masking policies were 
applied to 'alltypes' before the row-filter policies are applied to 
'alltypes_view'.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@76
PS3, Line 76: support using subquery
Maybe it would be clearer to emphasize that the subquery is on the same table, 
i.e., 'functional', in order to contrast with the next test case.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@129
PS3, Line 129: select * from functional_parquet.complextypestbl.nested_struct.b
Maybe it would be good to briefly mention here that what the query is to fetch 
the desired result set if there is no row-filtering policy involved.


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@136
PS3, Line 136: # The above query should be manually rewritten to this until 
IMPALA-10484 is resolved.
Thanks for adding this test case!


http://gerrit.cloudera.org:8080/#/c/16976/3/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@146
PS3, Line 146: ====
I was wondering if it would be good to add a test case involving 2 tables each 
associated with a row-filtering policy. For example, 'where id % 2 = 0' for the 
table 'functional.alltypestiny' and 'where id % 3 = 0' for the table 
'functional.alltypessmall'


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1095
PS3, Line 1095:     user = getuser()
It may be good to add a comment here to point out that we do not need to 
additionally grant the SELECT privileges to 'getuser()' because 'getuser()' is 
the owner of the databases used in the test queries.

This test also shows that the a row-filtering policy takes precedence over the 
ownership of the resources involved in the policy. A user cannot access some 
rows in a table if there is some row-filtering policy applied to the table even 
though this user is the owner of the table. Is my understanding correct?


http://gerrit.cloudera.org:8080/#/c/16976/3/tests/authorization/test_ranger.py@1104
PS3, Line 1104: concat('0', '')
I am not very familiar with the creation of a row-filtering policy via REST API 
call. Is it true that simply using "string_col = '0'" does not work so that we 
have to use "concat('0', '')"?



--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Mar 2021 00:59:56 +0000
Gerrit-HasComments: Yes

Reply via email to