Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )
Change subject: IMPALA-9234: Support Ranger row filtering policies ...................................................................... Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java: http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@175 PS2, Line 175: // TODO: Column-masking/Row-filtering expressions may have subqueries which may Can you file a JIRA for this? I guess it is a fairly plausible use case for this - I found an example here - https://cwiki.apache.org/confluence/display/RANGER/Row-level+filtering+and+column-masking+using+Apache+Ranger+policies+in+Apache+Hive I think we might also have to be careful about exactly what subqueries are allowed. E.g. if would be weird if you had a correlated references between the row filter subquery and the outer query, or if you could have a relative table reference that referred to different tables depending on which DB it was executed in. edit: found the JIRA IMPALA-10483 http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@473 PS2, Line 473: when row filter is disabled This is inaccurate - this method doesn't check whether row filtering is enabled. http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@476 PS2, Line 476: authorizeRowFilter Rename this method to reflect new behaviour, e.g. throwIfRowFilteringRequired() http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java: http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@139 PS2, Line 139: // TODO: Whether we should use lowercase or uppercase accessType? How would we decide this? What does Hive do? http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java: http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@454 PS2, Line 454: String databaseName = "functional"; Add a brief comment describing the row filter policies added. http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@502 PS2, Line 502: authzOk(events -> { Can you comment why we get the row filter event in one case but not the other (I think it's because of the user it's applied to right). http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test: http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@3 PS2, Line 3: # Row-filtering policy keeps rows with "id % 3 = 0". Is the expected behaviour that row filters are applied before column masks? http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test: http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1 PS2, Line 1: ==== Can you add a test where the row filter references a column not in the table. it should fail with an analysis exception. There's an extra edge case to check here where it isn't in the table, but could be a correlated reference to an outer query. E.g. This is a valid query, but I think test_id = id would be an invalid row filter on alltypes. select * from jointbl where exists(select * from alltypes where test_id = id); -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 12 Feb 2021 00:14:16 +0000 Gerrit-HasComments: Yes