[ https://issues.apache.org/jira/browse/IMPALA-9330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030348#comment-17030348 ]
ASF subversion and git services commented on IMPALA-9330:
---------------------------------------------------------

Commit a17bea916f289599360796c45e8267a6338af6f8 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a17bea9 ]

IMPALA-9330: Fix column masking in nested tables + enable column masking by default

Column masking policies on primitive columns of a table that also contains
nested-type columns (which are themselves not masked) cause query failures.
Specifically, if tableA(id int, int_array array<int>) has a masking policy on
column "id", all queries on "tableA" fail, e.g.

  select id from tableA;
  select t.id, a.item from tableA t, t.int_array a;

Column masking is implemented by wrapping the underlying table/view with a
table masking view. However, since nested types are not supported in the
SelectList, the table masking view can't expose nested columns of the masked
table, so collection refs are not resolved correctly.

This patch fixes the issue in 2 steps:
1) Expose nested columns of the underlying table in the output Type of the
   table masking view (see InlineViewRef#createTupleDescriptor()), so that
   nested Paths in the original query block can be resolved.
2) Resolve such Paths again inside the table masking view, so that they point
   to the underlying table as intended (see Analyzer#resolvePathWithMasking()).

The TupleDescriptor of such a table masking view won't be materialized, since
the view is simple enough that its query plan is just a ScanNode of the
underlying table. The whole query plan can be stitched together as if the
table were not masked. Note that once nested columns are supported in the
SelectList, these 2 hacks may no longer be needed.

This patch also adds some TRACE-level logging to improve debuggability, and
enables column masking by default.

Test changes in TestRanger.test_column_masking:
 - Add a column masking policy on a table containing nested types.
 - Add queries on the masked tables. Some queries are borrowed from existing
   tests for nested types.

Tests:
 - Run CORE tests.

Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Reviewed-on: http://gerrit.cloudera.org:8080/15108
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Support masking queries containing correlated collection references
> -------------------------------------------------------------------
>
>                 Key: IMPALA-9330
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9330
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> Let's say table complextypestbl (id bigint, int_arr array<int>) has a column
> masking policy on id: "id => id * 100". The following query is not rewritten
> correctly:
> {code:sql}
> select t.id, a.pos, a.item from complextypestbl t, t.int_arr a;
> {code}
> Because its AST will be rewritten to the AST of
> {code:sql}
> select t.id, a.pos, a.item from (
>   select cast(id * 100 as BIGINT) id
>   from complextypestbl
> ) t, t.int_arr a;
> {code}
> Currently, the analyzer can't resolve "t.int_arr" correctly.
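
The two-step fix described in the commit message can be illustrated with a small toy model. This is only a sketch in Python, not Impala's Java implementation: the names Table, MaskingView, is_nested and resolve_path are hypothetical and do not correspond to InlineViewRef#createTupleDescriptor() or Analyzer#resolvePathWithMasking(). It assumes a view that exposes all columns of the underlying table in its output type, and a resolver that sends nested columns back to the underlying table while primitive columns go through the masking expression.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Table:
    """Hypothetical base table: column name -> type string."""
    name: str
    columns: Dict[str, str]


@dataclass
class MaskingView:
    """Hypothetical stand-in for a table masking view over a base table."""
    base: Table
    masks: Dict[str, str]  # column name -> masking expression (SQL text)

    def output_columns(self) -> Dict[str, str]:
        # Step 1 (toy version): expose every column of the underlying
        # table, nested ones included, in the view's output type.
        return dict(self.base.columns)


def is_nested(col_type: str) -> bool:
    """Treat array/map/struct types as nested (assumption of this sketch)."""
    return col_type.startswith(("array<", "map<", "struct<"))


def resolve_path(view: MaskingView, column: str) -> Tuple[str, str]:
    """Return (source, expression) for a column referenced through the view."""
    col_type = view.output_columns()[column]
    if is_nested(col_type):
        # Step 2 (toy version): resolve the nested path again inside the
        # view so that it points at the underlying table.
        return (view.base.name, column)
    # Primitive columns are served by the view, masked if a policy exists.
    return ("masking_view", view.masks.get(column, column))


tbl = Table("complextypestbl", {"id": "bigint", "int_arr": "array<int>"})
view = MaskingView(tbl, {"id": "cast(id * 100 as BIGINT)"})
```

With the example table from the issue, resolve_path(view, "id") yields the masking expression, while resolve_path(view, "int_arr") points back at complextypestbl, which mirrors how "t.int_arr" needs to reach the underlying table.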