[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 3: (7 comments) Thanks for working on this. Sending comments based on a first pass. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@491 PS3, Line 491: no user-visible effect Is 'no user-visible effect' accurate, though? The plan in the query profile will show the effects of the table masking, won't it? http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@541 PS3, Line 541: reAnalyzeWithoutPrivChecks(stmtTableCache, authzCtx, origResultTypes, origColLabels); Shouldn't this only be called if reAnalyze = true? I am also curious why this is being called again (line 506 has the previous invocation). http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@860 PS3, Line 860: column %s This says column, but it seems the first parameter is resolvedTableRef's raw path? http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1326 PS3, Line 1326: names nit: this is registering a Column object rather than just the name. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1340 PS3, Line 1340: analyzer = analyzer.getParentAnalyzer(); Since getParentAnalyzer() can return null, we should break out of the loop if that happens.
http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@171 PS3, Line 171: public boolean resolveTableMask(Analyzer analyzer) throws AnalysisException { Since this is not looking into the orderByElements_, I am wondering if it will miss the ORDER BY slot refs for table masking. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@268 PS3, Line 268: public boolean resolveTableMask(Analyzer analyzer) throws AnalysisException { Shouldn't this also consider the groupByClause_, havingClause_, aggregate and analyticInfo ? Are the columns referenced by those getting captured in some other manner ? -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 19 Mar 2021 06:31:52 + Gerrit-HasComments: Yes
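The review comment on Analyzer.java line 1340 (stop walking the analyzer chain once getParentAnalyzer() returns null) can be illustrated with a minimal sketch. SketchAnalyzer and ancestorChain are hypothetical stand-ins invented for this example; only the getParentAnalyzer() name comes from the review, and the real Analyzer class is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an analyzer; only the parent link is modeled.
class SketchAnalyzer {
  private final String name;
  private final SketchAnalyzer parent;  // null for the root analyzer

  SketchAnalyzer(String name, SketchAnalyzer parent) {
    this.name = name;
    this.parent = parent;
  }

  String getName() { return name; }

  // May return null, as noted in the review comment.
  SketchAnalyzer getParentAnalyzer() { return parent; }

  // Walks from 'start' up to the root. The loop condition ends the walk
  // as soon as the current analyzer is null, so a null parent is never
  // dereferenced.
  static List<String> ancestorChain(SketchAnalyzer start) {
    List<String> names = new ArrayList<>();
    for (SketchAnalyzer a = start; a != null; a = a.getParentAnalyzer()) {
      names.add(a.getName());
    }
    return names;
  }
}
```

Putting the null check in the loop condition, rather than adding a separate break, keeps the termination rule in one place.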
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8401/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 19 Mar 2021 03:00:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 3: Added more audit tests. -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 19 Mar 2021 02:41:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17199 to look at the new patch set (#3). Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. IMPALA-9661: Avoid introducing unused columns in table masking view Previously, if a table had column masking policies, we replaced its unanalyzed TableRef with an analyzed InlineViewRef (table masking view) in FromClause.analyze(). However, we can't detect which columns are actually used in the original query at this point. In fact, analyze() for SelectList, WhereClause, GroupByClause and other clauses containing SlotRefs happens after FromClause.analyze(). After the whole query block is analyzed, we can get the exact set of required columns. This patch refactors the code to do table masking after analyze() to avoid introducing unused columns. Referenced columns of a TableRef are registered in analyze(), which helps to figure out which columns are actually needed. With this, we don't need to revert table masking in FromClause.reset(). The doTableMasking flag in the AST is also removed since the table mask is now resolved once after analyze().
Tests: - Run column masking and row filtering tests in test_ranger.py - Run FE audit tests Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 --- M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtNode.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/WithClause.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java 20 files changed, 335 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/3 -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
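The approach described in the IMPALA-9661 commit message above (register referenced columns during analysis, then build the masking view from only those columns) can be sketched as follows. This is a simplified model under stated assumptions: MaskingViewSketch, registerColumnRef and buildMaskingView are hypothetical names, tables and columns are plain strings, and the view is rendered as SQL text rather than as an InlineViewRef. It also assumes at least one column of the table is referenced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model; not Impala's actual classes.
class MaskingViewSketch {
  // Columns referenced per table, recorded while the query block is analyzed.
  private final Map<String, Set<String>> referencedColumns = new LinkedHashMap<>();

  // Called whenever a column reference resolves to a column of 'table'.
  // Repeated references to the same column are recorded once.
  void registerColumnRef(String table, String column) {
    referencedColumns.computeIfAbsent(table, t -> new LinkedHashSet<>()).add(column);
  }

  // Builds the masking view text after analysis has finished, projecting only
  // the referenced columns. 'maskExprs' maps a column name to its masking
  // expression; columns without a policy are passed through unchanged.
  // Assumption: at least one column of 'table' has been registered.
  String buildMaskingView(String table, Map<String, String> maskExprs) {
    List<String> items = new ArrayList<>();
    for (String col : referencedColumns.getOrDefault(table, Collections.emptySet())) {
      String expr = maskExprs.get(col);
      items.add(expr == null ? col : expr + " AS " + col);
    }
    return "SELECT " + String.join(", ", items) + " FROM " + table;
  }
}
```

In this sketch, a masking policy on a column the query never references contributes nothing to the view, which is the effect the commit message is after.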
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8400/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 19 Mar 2021 02:20:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8399/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 19 Mar 2021 02:12:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8398/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 19 Mar 2021 02:03:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 2: (1 comment) Fixed a test failure in AnalyzeStmtsTest. Slightly refactored some code. http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java: http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@359 PS1, Line 359: assertEventEquals("@column", "select", "functional/alltypestiny/date_string_col", > line too long (92 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 19 Mar 2021 02:00:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17199 to look at the new patch set (#2). Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. IMPALA-9661: Avoid introducing unused columns in table masking view Previously, if a table had column masking policies, we replaced its unanalyzed TableRef with an analyzed InlineViewRef (table masking view) in FromClause.analyze(). However, we can't detect which columns are actually used in the original query at this point. In fact, analyze() for SelectList, WhereClause, GroupByClause and other clauses containing SlotRefs happens after FromClause.analyze(). After the whole query block is analyzed, we can get the exact set of required columns. This patch refactors the code to do table masking after analyze() to avoid introducing unused columns. Referenced columns of a TableRef are registered in analyze(), which helps to figure out which columns are actually needed. With this, we don't need to revert table masking in FromClause.reset(). The doTableMasking flag in the AST is also removed since the table mask is now resolved once after analyze().
Tests: - Run column masking and row filtering tests in test_ranger.py - Run FE audit tests Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 --- M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtNode.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/WithClause.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java 20 files changed, 270 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/2 -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Wenzhe Zhou has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. IMPALA-10564: Return error when inserting an invalid decimal value When using CTAS statements or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data expressions for the decimal columns are constants, so that the overflowed decimal values could be detected by the frontend during expression analysis. If there is an alias (variable) in the data expression for the decimal column, only the backend can detect the decimal overflow. This patch adds a query option, use_null_for_decimal_errors. When it is disabled, the backend checks the query status of RuntimeState in the table writer when ScalarExprEvaluator returns NULL for a decimal column. If there is an invalid decimal error, the query fails without inserting NULL for the decimal column. If use_null_for_decimal_errors is enabled, NULL is inserted into the table for invalid decimal values. We did not change the behaviour for decimal_v1: NULL is still inserted into the table for invalid decimal values, with a warning message. Tests: - Manually ran queries with overflowed decimal values by using CTAS and INSERT-SELECT statements. Verified that queries failed without inserting NULL as expected if use_null_for_decimal_errors was set to false, and that NULLs were inserted into the table for overflowed decimals if use_null_for_decimal_errors was set to true. - Manually ran queries with overflowed decimal values and decimal_v2 set to false. The result is the same as before: NULLs were inserted into the table for invalid decimal values, with a warning message.
- Added unit-tests for INSERT-SELECT and CTAS. - Passed core tests. Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 --- M be/src/common/status.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exprs/decimal-operators-ir.cc M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/udf/udf.cc M be/src/udf/udf.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M common/thrift/generate_error_codes.py A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test M tests/query_test/test_decimal_queries.py 16 files changed, 285 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/7 -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
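The option-controlled behavior described in the IMPALA-10564 commit message above can be sketched roughly as follows. This is an illustration only, not the backend implementation: the actual change lives in the C++ table writers and checks the RuntimeState query status, whereas this sketch uses Java's BigDecimal. The class name DecimalInsertSketch, the method valueToWrite, the HALF_UP rounding mode, and the error text are all assumptions made for the example; only the option name use_null_for_decimal_errors comes from the commit message.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch of "fail or write NULL" on decimal overflow.
class DecimalInsertSketch {
  // Returns the value to write into a DECIMAL(precision, scale) column.
  // If the value does not fit:
  //   - returns null when useNullForDecimalErrors is true (NULL is written),
  //   - throws when it is false (the insert fails instead).
  static BigDecimal valueToWrite(BigDecimal value, int precision, int scale,
      boolean useNullForDecimalErrors) {
    // Rounding mode is an assumption of this sketch.
    BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
    // BigDecimal.precision() counts the digits of the unscaled value, so a
    // result with more digits than 'precision' overflows the column type.
    if (scaled.precision() > precision) {
      if (useNullForDecimalErrors) return null;
      throw new ArithmeticException("Invalid decimal value: " + value);
    }
    return scaled;
  }
}
```

For example, with DECIMAL(4, 2) the value 123.45 has five digits after scaling, so the sketch either returns null or throws depending on the flag, while 12.34 is written unchanged.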
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Wenzhe Zhou has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. IMPALA-10564: Return error when inserting an invalid decimal value When using CTAS statements or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data expressions for the decimal columns are constants, so that the overflowed decimal values could be detected by the frontend during expression analysis. If there is an alias (variable) in the data expression for the decimal column, only the backend can detect the decimal overflow. This patch adds a query option, use_null_for_decimal_errors. When it is disabled, the backend checks the query status of RuntimeState in the table writer when ScalarExprEvaluator returns NULL for a decimal column. If there is an invalid decimal error, the query fails without inserting NULL for the decimal column. If use_null_for_decimal_errors is enabled, NULL is inserted into the table for invalid decimal values. We did not change the behaviour for decimal_v1: NULL is still inserted into the table for invalid decimal values, with a warning message. Tests: - Manually ran queries with overflowed decimal values by using CTAS and INSERT-SELECT statements. Verified that queries failed without inserting NULL as expected if use_null_for_decimal_errors was set to false, and that NULLs were inserted into the table for overflowed decimals if use_null_for_decimal_errors was set to true. - Manually ran queries with overflowed decimal values and decimal_v2 set to false. The result is the same as before: NULLs were inserted into the table for invalid decimal values, with a warning message.
- Added unit-tests for INSERT-SELECT and CTAS. - Passed core tests. Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 --- M be/src/common/status.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exprs/decimal-operators-ir.cc M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/udf/udf.cc M be/src/udf/udf.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M common/thrift/generate_error_codes.py A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test M tests/query_test/test_decimal_queries.py 16 files changed, 285 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/6 -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 5: Code-Review+2 LGTM. Let me know if you want anyone else to review. Otherwise, I can start a gerrit-verify run. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 19 Mar 2021 00:56:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8397/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 19 Mar 2021 00:33:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Wenzhe Zhou has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. IMPALA-10564: Return error when inserting an invalid decimal value When using CTAS statements or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data expressions for the decimal columns are constants, so that the overflowed decimal values could be detected by the frontend during expression analysis. If there is an alias (variable) in the data expression for the decimal column, only the backend can detect the decimal overflow. This patch adds a query option, use_null_for_decimal_errors. When it is disabled, the backend checks the query status of RuntimeState in the table writer when ScalarExprEvaluator returns NULL for a decimal column. If there is an invalid decimal error, the query fails without inserting NULL for the decimal column. If use_null_for_decimal_errors is enabled, NULL is inserted into the table for invalid decimal values. We did not change the behaviour for decimal_v1: NULL is still inserted into the table for invalid decimal values, with a warning message. Tests: - Manually ran queries with overflowed decimal values by using CTAS and INSERT-SELECT statements. Verified that queries failed without inserting NULL as expected if use_null_for_decimal_errors was set to false, and that NULLs were inserted into the table for overflowed decimals if use_null_for_decimal_errors was set to true. - Manually ran queries with overflowed decimal values and decimal_v2 set to false. The result is the same as before: NULLs were inserted into the table for invalid decimal values, with a warning message.
- Added unit-tests for INSERT-SELECT and CTAS. - Passed core tests. Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 --- M be/src/common/status.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exprs/decimal-operators-ir.cc M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/udf/udf.cc M be/src/udf/udf.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M common/thrift/generate_error_codes.py A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test M tests/query_test/test_decimal_queries.py 16 files changed, 283 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/5 -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17185 ) Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8396/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 18 Mar 2021 23:26:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 4: Code-Review+1 (7 comments) Thanks for making the changes. A few mostly nits and one comment about the test. http://gerrit.cloudera.org:8080/#/c/17168/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17168/1//COMMIT_MSG@7 PS1, Line 7: lid decima > Done Done http://gerrit.cloudera.org:8080/#/c/17168/1/be/src/exec/hdfs-text-table-writer.cc File be/src/exec/hdfs-text-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/17168/1/be/src/exec/hdfs-text-table-writer.cc@107 PS1, Line 107: invalid de > Done Done http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/hdfs-text-table-writer.cc File be/src/exec/hdfs-text-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/hdfs-text-table-writer.cc@107 PS4, Line 107: // IMPALA-10564: For invalid decimal value, we should return an error nit: the comment indicates this is done unconditionally. Could you update it similar to other places that it is done based on the query option. http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/kudu-table-sink.cc@253 PS4, Line 253: // IMPALA-10564: For invalid decimal value, we should return an error nit: same comment as above. 
http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/parquet/hdfs-parquet-table-writer.cc File be/src/exec/parquet/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/parquet/hdfs-parquet-table-writer.cc@698 PS4, Line 698: // IMPALA-10564: For invalid decimal value, we should return an error nit: same as above http://gerrit.cloudera.org:8080/#/c/17168/4/common/thrift/generate_error_codes.py File common/thrift/generate_error_codes.py: http://gerrit.cloudera.org:8080/#/c/17168/4/common/thrift/generate_error_codes.py@475 PS4, Line 475: type 'value' would be more accurate than 'type' to avoid confusion with decimal type precision and scale. http://gerrit.cloudera.org:8080/#/c/17168/4/testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test File testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test: http://gerrit.cloudera.org:8080/#/c/17168/4/testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test@43 PS4, Line 43: RESULTS This verifies that a row was inserted, but is there a way to check that the inserted value is NULL? I was thinking of something like 'select count(*) from overflowed_decimal_tbl where is null'. Although, since you are using the same table for the insertions, the count will keep growing. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 18 Mar 2021 23:18:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17185 ) Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. Patch Set 3: (1 comment) > Patch Set 3: > > (1 comment) Forgot some changes in last commit. Uploaded them in PS4. http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py@1270 PS3, Line 1270: > flake8: E251 unexpected spaces around keyword / parameter equals Done -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 18 Mar 2021 23:06:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17185 to look at the new patch set (#4). Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. IMPALA-10483: Support subqueries in Ranger masking policies This patch adds support for using subqueries in Ranger masking policies, i.e. column-masking/row-filtering policies. The subquery can reference either the current table or other tables. However, masking policies on these tables won't be applied recursively. This is consistent with Hive. One motivation is to avoid infinite masking if the subquery references the same table. Another motivation, I think, is to simplify the masking behavior: when the admin sets a masking expression, it can be considered as running from the admin's perspective (i.e. no masking). Implementation: Before analyzing the query, the coordinator loads the metadata of all possibly used tables into the query's StmtTableCache. Table masking takes place after the analysis phase. If the subquery filter introduces any new tables, the analyzer will fail to resolve them since their metadata is not loaded in the StmtTableCache. This patch modifies StmtMetadataLoader to also load the tables introduced by masking policies, so that they can be resolved correctly. 
Tests - Add more complex tests in test_row_filtering Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f --- M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test M tests/authorization/test_ranger.py 8 files changed, 295 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/17185/4 -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
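The metadata-loading step described in this commit message can be sketched roughly as follows. This is only an illustration in Python, not Impala's Java code: `ROW_FILTERS`, `tables_in_expr`, `collect_tables_to_load` and the regex-based table extraction are hypothetical stand-ins for the Ranger policy lookup and StmtMetadataLoader.

```python
import re

# Hypothetical policy store: table name -> row-filter expression.
ROW_FILTERS = {
    "db.orders": "customer_id IN (SELECT id FROM db.vip_customers)",
    "db.vip_customers": "region = 'EU'",  # never followed recursively
}

# Crude pattern for table names that follow FROM or JOIN (assumption: no quoting).
_TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_in_expr(expr):
    """Return the table names referenced by subqueries inside expr."""
    return set(_TABLE_RE.findall(expr))


def collect_tables_to_load(query_tables, row_filters):
    """Return every table whose metadata must be loaded before analysis."""
    to_load = set(query_tables)
    for table in query_tables:
        expr = row_filters.get(table)
        if expr is not None:
            # One level only: policies on tables that a policy introduces
            # are not expanded, mirroring the non-recursive masking rule.
            to_load |= tables_in_expr(expr)
    return to_load
```

Calling `collect_tables_to_load(["db.orders"], ROW_FILTERS)` adds `db.vip_customers` to the set, while the filter on `db.vip_customers` itself is not expanded further.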
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8395/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 18 Mar 2021 22:17:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value
Wenzhe Zhou has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17168 ) Change subject: IMPALA-10564: Return error when inserting an invalid decimal value .. IMPALA-10564: Return error when inserting an invalid decimal value When using CTAS or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. The issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data expressions for the decimal columns are constants, so that the overflowed decimal values could be detected by the frontend during expression analysis. If there is an alias (variable) in the data expression for the decimal column, only the backend can detect the decimal overflow. This patch adds a query option, use_null_for_decimal_errors. When it is disabled, the backend checks the query status of RuntimeState in the table writer when ScalarExprEvaluator returns NULL for a decimal column. If there is an invalid decimal error, the query fails without inserting NULL for the decimal column. If use_null_for_decimal_errors is enabled, NULL is inserted into the table for the invalid decimal value. The behaviour for decimal_v1 is unchanged: NULL is inserted into the table for invalid decimal values, with a warning message. Tests: - Manually ran queries with overflowed decimal values using CTAS and INSERT-SELECT statements. Verified that queries failed without inserting NULL, as expected, when use_null_for_decimal_errors was set to false, and that NULLs were inserted into the table for overflowed decimals when use_null_for_decimal_errors was set to true. - Manually ran queries with overflowed decimal values and decimal_v2 set to false. The result is the same as before: NULLs were inserted into the table for invalid decimal values, with a warning message. 
- Added unit-tests for INSERT-SELECT and CTAS. - Passed core tests. Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 --- M be/src/common/status.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exprs/decimal-operators-ir.cc M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/udf/udf.cc M be/src/udf/udf.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M common/thrift/generate_error_codes.py A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test M tests/query_test/test_decimal_queries.py 16 files changed, 272 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/4 -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
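The two behaviours selected by use_null_for_decimal_errors can be illustrated with a small Python sketch. It is an assumption-laden model, not the backend code: `fits`, `write_decimal` and `DecimalOverflowError` are hypothetical names, and Python's default rounding mode is used, which may differ from Impala's.

```python
from decimal import Decimal


class DecimalOverflowError(Exception):
    """Hypothetical error for a value that does not fit DECIMAL(precision, scale)."""


def fits(value, precision, scale):
    """True if value, rounded to `scale` digits, has at most precision - scale integer digits."""
    quantized = value.quantize(Decimal(10) ** -scale)
    return abs(quantized) < Decimal(10) ** (precision - scale)


def write_decimal(value, precision, scale, use_null_for_decimal_errors):
    """Return the value to store, None for NULL, or raise on overflow."""
    if fits(value, precision, scale):
        return value.quantize(Decimal(10) ** -scale)
    if use_null_for_decimal_errors:
        return None  # option enabled: insert NULL for the invalid value
    # Option disabled: fail the query instead of silently inserting NULL.
    raise DecimalOverflowError(
        f"{value} does not fit DECIMAL({precision},{scale})")
```

With the option enabled, `write_decimal(Decimal("12345.6"), 5, 2, True)` yields `None`; with it disabled, the same call raises `DecimalOverflowError`, which models the query failing.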
[Impala-ASF-CR] IMPALA-10552: Support external frontends supplying timeline for profile
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/17183 ) Change subject: IMPALA-10552: Support external frontends supplying timeline for profile .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17183/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/17183/1/be/src/service/impala-server.cc@1212 PS1, Line 1212: (*query_handle)->set_user_profile_access(result.user_has_profile_access); I think this line got duplicated, maybe a merge conflict resolution error? -- To view, visit http://gerrit.cloudera.org:8080/17183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b3692b4118ea23c0f9f8ec4bcc27b0b68bb32ec Gerrit-Change-Number: 17183 Gerrit-PatchSet: 1 Gerrit-Owner: John Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Thu, 18 Mar 2021 21:46:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. IMPALA-9234: Support Ranger row filtering policies Ranger row filtering policies provide customized expressions to filter out rows for specific users when reading from a table. This patch adds support for this feature. A new feature flag, enable_row_filtering, is added so that this experimental feature can be disabled. It defaults to true, so the feature is enabled by default. Enabling row filtering requires --enable_column_masking=true since it depends on the column masking implementation. Note that row filtering policies take effect prior to any column masking policies, because column masking policies apply to the result data. Implementation: The existing table masking view infrastructure can be extended to support row filtering. Currently, when analyzing a table with column masking policies, we replace the TableRef with an InlineViewRef which contains a SelectStmt wrapping the columns with masking expressions. This patch adds the row filtering expressions to the WhereClause of the SelectStmt. Limitations: - Expressions using subqueries are not supported (IMPALA-10483). - Row filtering policies on nested tables will not be applied when nested collection columns are used directly in the FROM clause. This would leak data, so we forbid such queries until IMPALA-10484 is resolved. Tests: - Add FE test for the error message when row filtering is disabled. - Add e2e test with row filtering policies. - Add e2e test where column masking and row filtering policies both take effect. - Verified audits in a CDP cluster with Ranger and Solr set up. 
Change-Id: I580517be241225ca15e45686381b78890178d7cc Reviewed-on: http://gerrit.cloudera.org:8080/16976 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test M tests/authorization/test_ranger.py 23 files changed, 1,005 insertions(+), 113 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, 
visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 13 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
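The table masking view described in the merged change can be pictured as a string rewrite. The following Python sketch is illustrative only: `build_masking_view` and its arguments are hypothetical, and the real implementation works on analyzer objects (TableRef, InlineViewRef, SelectStmt) rather than SQL text.

```python
def build_masking_view(table, columns, column_masks, row_filter, alias):
    """Build the inline view that replaces a reference to `table`.

    column_masks maps a column name to its masking expression;
    row_filter is the row-filtering expression or None.
    """
    select_list = []
    for col in columns:
        mask = column_masks.get(col)
        # Masked columns keep their original name so outer references still resolve.
        select_list.append(f"{mask} AS {col}" if mask else col)
    sql = f"SELECT {', '.join(select_list)} FROM {table}"
    if row_filter:
        # The filter sees the unmasked base columns, so it applies
        # before the column masks affect the result data.
        sql += f" WHERE {row_filter}"
    return f"({sql}) {alias}"
```

For example, `build_masking_view("db.t", ["id", "name"], {"name": "mask(name)"}, "id > 10", "t")` produces `(SELECT id, mask(name) AS name FROM db.t WHERE id > 10) t`.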
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Mar 2021 21:08:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17075 ) Change subject: IMPALA-10494: Making use of the min/max column stats to improve min/max filters .. Patch Set 22: Aman asked whether min/max filters are applied in the context of outer joins. The answer is no. Please refer to TPCDS q49. -- To view, visit http://gerrit.cloudera.org:8080/17075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df Gerrit-Change-Number: 17075 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 20:27:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17075 ) Change subject: IMPALA-10494: Making use of the min/max column stats to improve min/max filters .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8394/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df Gerrit-Change-Number: 17075 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 20:17:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters
Qifan Chen has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/17075 ) Change subject: IMPALA-10494: Making use of the min/max column stats to improve min/max filters .. IMPALA-10494: Making use of the min/max column stats to improve min/max filters

This patch adds the functionality to compute the minimum and maximum values of integer, float or double columns in Parquet tables, and to use the new stats to discard min/max filters, in both hash join builders and Parquet scanners, whose coverage is too close to the actual range defined by the column min and max. The computation and display of the new column min/max stats are done for Parquet tables only and can be controlled by two new Boolean query options (both default to false):
1. compute_column_minmax_stats
2. show_column_minmax_stats

Usage examples.

  set compute_column_minmax_stats=true;
  compute stats tpcds_parquet.store_sales;
  set show_column_minmax_stats=true;
  show column stats tpcds_parquet.store_sales;
  +-----------------------+--------------+-...-----+---------+---------+
  | Column                | Type         | #Falses | Min     | Max     |
  +-----------------------+--------------+-...-----+---------+---------+
  | ss_sold_time_sk       | INT          | -1      | 28800   | 75599   |
  | ss_item_sk            | BIGINT       | -1      | 1       | 18000   |
  | ss_customer_sk        | INT          | -1      | 1       | 10      |
  | ss_cdemo_sk           | INT          | -1      | 15      | 1920797 |
  | ss_hdemo_sk           | INT          | -1      | 1       | 7200    |
  | ss_addr_sk            | INT          | -1      | 1       | 5       |
  | ss_store_sk           | INT          | -1      | 1       | 10      |
  | ss_promo_sk           | INT          | -1      | 1       | 300     |
  | ss_ticket_number      | BIGINT       | -1      | 1       | 24      |
  | ss_quantity           | INT          | -1      | 1       | 100     |
  | ss_wholesale_cost     | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_list_price         | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_sales_price        | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_ext_discount_amt   | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_ext_sales_price    | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_ext_wholesale_cost | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_ext_list_price     | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_ext_tax            | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_coupon_amt         | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_net_paid           | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_net_paid_inc_tax   | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_net_profit         | DECIMAL(7,2) | -1      | -1      | -1      |
  | ss_sold_date_sk       | INT          | -1      | 2450816 | 2452642 |
  +-----------------------+--------------+-...-----+---------+---------+

Only the min/max values for non-partition columns are stored in HMS. The min/max values for partition columns are computed in the coordinator. The min-max filters, in C++ class or protobuf form, are augmented to deal with the always true state better. Once always true is set, the actual min and max values in the filter are no longer populated.

Testing:
- Added new compute/show stats tests for integer, float and double column data types in compute-stats-column-minmax.test;
- Added new tests in overlap_min_max_filters.test to demonstrate the usefulness of column stats to quickly disable useless filters in both hash join builder and Parquet scanner;
- Added tests in min-max-filter-test.cc to demonstrate that the methods Or() and ToProtobuf() and the constructor deal with the always true flag well;
- core tests.

TODO:
1. Test compute stats for timestamp and date columns;
2. Enable the feature for Iceberg tables with Parquet data files.
Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df --- M be/src/exec/catalog-op-executor.cc M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hdfs-scanner.h M be/src/exec/incr-stats-util-test.cc M be/src/exec/incr-stats-util.cc M be/src/exec/incr-stats-util.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/service/hs2-util.cc M be/src/service/hs2-util.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/min-max-filter-test.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/thrift/CatalogObjects.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apac
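The idea of discarding a min/max filter whose coverage is too close to the column's actual range can be sketched as an overlap ratio. This Python fragment is a guess at the shape of the check, not the patch's C++ code: `overlap_ratio`, `should_discard_filter` and the 0.9 threshold are all hypothetical.

```python
def overlap_ratio(filter_min, filter_max, col_min, col_max):
    """Fraction of the column range [col_min, col_max] covered by the filter."""
    if col_max <= col_min:
        return 1.0  # degenerate column range: treat the filter as covering it all
    lo = max(filter_min, col_min)
    hi = min(filter_max, col_max)
    if hi < lo:
        return 0.0  # filter range and column range do not intersect
    return (hi - lo) / (col_max - col_min)


def should_discard_filter(filter_min, filter_max, col_min, col_max, threshold=0.9):
    """A filter covering nearly the whole column range prunes little, so drop it."""
    return overlap_ratio(filter_min, filter_max, col_min, col_max) >= threshold
```

Using the ss_sold_time_sk stats shown above (min 28800, max 75599), a filter spanning the full range gives a ratio of 1.0 and would be discarded, while a narrow filter such as [30000, 31000] would be kept.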
[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1 .. Patch Set 17: (25 comments) http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@7 PS17, Line 7: Part 1 Can you add some info about what is expected in later parts? http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@26 PS17, Line 26: Testing: It would be great to add some unit tests for ParquetBloomFilter, especially if there are paths that are not used in the EE test. http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@28 PS17, Line 28: Parquet Bloom filtering works for the supported types and that we do Please mention that we use a Parquet file generated by some other tool. This info should be also added to https://github.com/apache/impala/blob/aeeff53e884a67ee7f5980654a1d394c6e3e34ac/testdata/data/README http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h@529 PS17, Line 529: buffer_pool_client_ BufferPool::ClientHandle should be only used from a single thread: https://github.com/apache/impala/blob/master/be/src/runtime/bufferpool/buffer-pool.h#L332 I think that can cause problems if there are multiple scanners for a single scan node. http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h@801 PS17, Line 801: static bool IsParquetBloomFilterSupported(parquet::Type::type parquet_type, There may be better places for these functions, e.g. ParquetMetadataUtils, ParquetCommon, or probably a separate file for Parquet bloom filter related stuff. 
http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h@704 PS3, Line 704: /// It could be noted that this is read from metadata_range_. http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h@718 PS3, Line 718: /// Decides how to divide stream_->reservation() between the columns. May increase consistency: EvalDictionaryFilters uses skip_row_group for the same purpose. http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@1108 PS3, Line 1108: nst string& fn_ A few DCHECKs would be nice, e.g. to ensure that metadata_range_ is filled. http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@843 PS17, Line 843: continue; This means that we will skip the row group without raising any counters. I think that we should process the row group if there are issues with the Bloom filter. http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1470 PS17, Line 1470: FindChildSlotRef Is this needed for any supported types? e.g. char(N) in the example shouldn't be supported. http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1684 PS17, Line 1684: const int8_t* const cast_value = reinterpret_cast<const int8_t*>(value); : const int byte_len = ParquetPlainEncoder::Encode(*cast_value, : -1 /* fixed_len_size */, storage->data()); : DCHECK_EQ(byte_len, output_len); Create a template function for this code? 
http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1729 PS17, Line 1729: const int exp_size = ParquetPlainEncoder::ByteSize(*cast_value); : storage->resize(exp_size); if this was moved out, then the same template function could be used as the one mentioned at line 1684 http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1825 PS17, Line 1825: __isset.meta_data We shouldn't need to check this here. http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1834 PS17, Line 1834: &header_size, &bloom_filter_header)); nit: too much indentation http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1842 PS17, Line 1842: return Status(Substitute("Could not allocate buffer of $0 bytes for Parquet " nit: too much indentation http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1857 PS17, Line 1857: data_buffer.buffer() + data_alr
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables ALTER TABLE ADD PARTITION should bump the write id for ACID tables, both for INSERT-only and full ACID tables. For transactional tables we add partitions in an ACID transaction in the following sequence: 1. open transaction 2. allocate write id for table 3. add partitions to HMS table 4. commit transaction However, please note that table metadata modifications are independent of ACID transactions. I.e. if adding the partitions succeeds but we cannot commit the transaction, then the newly added partitions won't get removed. So why are we opening a txn then? We are doing it in order to bump the write id in a best-effort way. This aids table metadata caching: by looking at the table write id we can determine whether the cached table metadata is up-to-date. 
Testing: * added e2e test Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Reviewed-on: http://gerrit.cloudera.org:8080/17081 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test M tests/query_test/test_acid.py 7 files changed, 133 insertions(+), 41 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy
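The four-step sequence in this commit message, and the caveat that a failed commit does not undo the added partitions, can be modelled with a toy metastore. Everything below is a hypothetical Python stand-in (`FakeMetastore`, `add_partitions_acid`); it does not reflect the real HMS client API or CatalogOpExecutor.

```python
class FakeMetastore:
    """In-memory stand-in for HMS, used only to illustrate the ordering."""

    def __init__(self, fail_commit=False):
        self.fail_commit = fail_commit
        self.next_txn_id = 1
        self.write_ids = {}   # table -> latest allocated write id
        self.partitions = {}  # table -> list of partition names
        self.committed = []
        self.aborted = []

    def open_txn(self):
        txn_id = self.next_txn_id
        self.next_txn_id += 1
        return txn_id

    def allocate_write_id(self, txn_id, table):
        self.write_ids[table] = self.write_ids.get(table, 0) + 1
        return self.write_ids[table]

    def add_partitions(self, table, partitions):
        # Metadata change: independent of the transaction, never rolled back.
        self.partitions.setdefault(table, []).extend(partitions)

    def commit_txn(self, txn_id):
        if self.fail_commit:
            raise RuntimeError("commit failed")
        self.committed.append(txn_id)

    def abort_txn(self, txn_id):
        self.aborted.append(txn_id)


def add_partitions_acid(hms, table, partitions):
    """Open txn, allocate write id, add partitions, commit (best effort)."""
    txn_id = hms.open_txn()
    try:
        hms.allocate_write_id(txn_id, table)
        hms.add_partitions(table, partitions)
        hms.commit_txn(txn_id)
    except Exception:
        hms.abort_txn(txn_id)
        raise
```

With `FakeMetastore(fail_commit=True)`, the call raises and the transaction is aborted, yet the partitions stay in `hms.partitions`, which is the behaviour the message warns about.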
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 19:35:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10590: Introduce admission service heartbeat mechanism
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/17194 ) Change subject: IMPALA-10590: Introduce admission service heartbeat mechanism .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-control-service.cc File be/src/scheduling/admission-control-service.cc: http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-control-service.cc@265 PS1, Line 265: AdmissionHeartbeat what happens when a coord dies? does the AC service detect that and remove all queries for that host? if yes, we should also remove the entry for that coord in running_queries_ otherwise we might end up with empty map entries whenever a coord restarts. http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.h@835 PS1, Line 835: Map from host id to maps from nit: Map from host id to a map of query id http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1361 PS1, Line 1361: / In the context of the admission control service, this may happen, eg. if a : // ReleaseQuery rpc is reported as failed to the coordinator but actually ends up : // arriving much later, so only log at WARNING level. can you document the remote client's behavior for failed RPCs in its class comment http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1364 PS1, Line 1364: LOG(WARNING) << "Unable to find resources to release for query " nit: maybe add to the log message that it might have already been released. http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1397 PS1, Line 1397: LOG(DFATAL) should this be a warning now too? 
-- To view, visit http://gerrit.cloudera.org:8080/17194 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia528d92268cea487ada20b476935a81166f5ad34 Gerrit-Change-Number: 17194 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 18 Mar 2021 16:34:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10581: Implement ds_theta_intersect_f() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 ) Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8393/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01 Gerrit-Change-Number: 17186 Gerrit-PatchSet: 2 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 16:12:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10581: Implement ds_theta_intersect_f() function
Fucun Chu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17186 Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function .. IMPALA-10581: Implement ds_theta_intersect_f() function This function receives two strings that are serialized Apache DataSketches Theta sketches. It computes the intersection of the two sketches, which may come from the same column or from different columns, and returns the resulting intersection sketch. Example: select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) from sketch_tbl; +---+ | ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) | +---+ | 5 | +---+ Change-Id: I335eada00730036d5433775cfe673e0e4babaa01 --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test 4 files changed, 119 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/17186/2 -- To view, visit http://gerrit.cloudera.org:8080/17186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01 Gerrit-Change-Number: 17186 Gerrit-PatchSet: 2 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
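To illustrate the idea behind intersecting two theta sketches, here is a toy Python model. It is not the Apache DataSketches implementation and does not use its serialization format; the (theta, hashes) tuple, the SHA-256 based hash and the parameter k are assumptions made only for this sketch:

```python
import hashlib


def hash_fraction(value):
    """Map a value to a pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / float(2 ** 64)


def build_sketch(values, k=4096):
    """Toy sketch: keep the k smallest hash fractions plus the cutoff theta."""
    hashes = sorted({hash_fraction(v) for v in values})
    if len(hashes) <= k:
        return 1.0, set(hashes)
    # Everything strictly below theta is retained.
    return hashes[k], set(hashes[:k])


def intersect(a, b):
    """Intersection keeps hashes present in both sketches, below the smaller theta."""
    theta = min(a[0], b[0])
    return theta, {h for h in a[1] & b[1] if h < theta}


def estimate(sketch):
    """Estimated distinct count: retained hashes scaled by 1 / theta."""
    theta, hashes = sketch
    return len(hashes) / theta
```

For inputs smaller than k, theta stays at 1.0 and the estimate is exact, so intersecting sketches of range(10) and range(5, 15) estimates 5 distinct values.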
[Impala-ASF-CR] IMPALA-10580: Implement ds_theta_union_f() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17179 ) Change subject: IMPALA-10580: Implement ds_theta_union_f() function .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8392/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa Gerrit-Change-Number: 17179 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 15:27:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6982/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Mar 2021 15:25:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10580: Implement ds_theta_union_f() function
Fucun Chu has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17179 ) Change subject: IMPALA-10580: Implement ds_theta_union_f() function .. IMPALA-10580: Implement ds_theta_union_f() function This function receives two strings that are serialized Apache DataSketches Theta sketches. It unions the two sketches and returns the resulting union sketch. Example: select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) from sketch_tbl; +---+ | ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) | +---+ | 15 | +---+ Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test 4 files changed, 111 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17179/3 -- To view, visit http://gerrit.cloudera.org:8080/17179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa Gerrit-Change-Number: 17179 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
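The union of two theta sketches can be modelled in the same toy fashion. This Python sketch is an illustration only, not the Apache DataSketches code; the (theta, hashes) representation, the hash function and the size limit k are assumptions:

```python
import hashlib


def hash_fraction(value):
    """Map a value to a pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / float(2 ** 64)


def union(a, b, k=4096):
    """Union two (theta, hashes) sketches, trimming back to at most k entries."""
    theta = min(a[0], b[0])
    merged = sorted(h for h in a[1] | b[1] if h < theta)
    if len(merged) > k:
        # Lower theta to the (k+1)-th smallest hash and keep only the k below it.
        theta = merged[k]
        merged = merged[:k]
    return theta, set(merged)


def estimate(sketch):
    """Estimated distinct count: retained hashes scaled by 1 / theta."""
    theta, hashes = sketch
    return len(hashes) / theta
```

With two untrimmed sketches over range(10) and range(5, 15), the union holds 15 distinct hashes and the estimate is exactly 15.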
[Impala-ASF-CR] IMPALA-10593: Skip runtime filter for outer joins...
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17200 ) Change subject: IMPALA-10593: Skip runtime filter for outer joins... .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8391/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3 Gerrit-Change-Number: 17200 Gerrit-PatchSet: 1 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 14:43:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10593: Skip runtime filter for outer joins...
Steve Carlin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17200 Change subject: IMPALA-10593: Skip runtime filter for outer joins... .. IMPALA-10593: Skip runtime filter for outer joins... ...when Expr not constant after null substitution. Currently there is code that asserts that an Expr is constant after substituting SlotRefs with constant nulls. A third party tool needs this restriction to be weakened. When an Expr is checked and is still not constant after substituting nulls, no runtime filter is generated for that Expr. Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java 2 files changed, 8 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/17200/1 -- To view, visit http://gerrit.cloudera.org:8080/17200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3 Gerrit-Change-Number: 17200 Gerrit-PatchSet: 1 Gerrit-Owner: Steve Carlin
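The change described above turns a hard assertion into a soft check. A rough Python sketch of that pattern follows, using a tiny made-up expression tree; the classes, the deterministic flag and the function names are all hypothetical and do not mirror Impala's Expr classes. What the planner does after the check passes is outside the sketch:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SlotRef:
    name: str


@dataclass
class NullLiteral:
    pass


@dataclass
class FunctionCall:
    name: str
    children: List[object] = field(default_factory=list)
    deterministic: bool = True  # assumption: non-deterministic calls are never constant


def substitute_nulls(expr):
    """Replace every SlotRef in the tree with a constant NULL."""
    if isinstance(expr, SlotRef):
        return NullLiteral()
    if isinstance(expr, FunctionCall):
        children = [substitute_nulls(c) for c in expr.children]
        return FunctionCall(expr.name, children, expr.deterministic)
    return expr


def is_constant(expr):
    if isinstance(expr, SlotRef):
        return False
    if isinstance(expr, FunctionCall):
        return expr.deterministic and all(is_constant(c) for c in expr.children)
    return True


def passes_null_substitution_check(expr):
    """Return False (skip the runtime filter) instead of asserting constness."""
    return is_constant(substitute_nulls(expr))
```

A caller would simply skip runtime filter generation for an expression when this check returns False, rather than failing the query.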
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8390/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 14:10:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17185 ) Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8389/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 18 Mar 2021 14:01:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 13:52:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6981/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 13:52:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 6: Code-Review+2 We use ADD PARTITION when creating 'alltypestiny', and we check the write id in full-acid-rowid.test. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Mar 2021 13:51:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving
Quanlong Huang has abandoned this change. ( http://gerrit.cloudera.org:8080/17184 ) Change subject: IMPALA-10483(part-1): Refactor table mask resolving .. Abandoned Abandon since we have a better solution: https://gerrit.cloudera.org/c/17199 -- To view, visit http://gerrit.cloudera.org:8080/17184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f Gerrit-Change-Number: 17184 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17185 ) Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. Patch Set 3: (3 comments) Rebased the patch onto https://gerrit.cloudera.org/c/17199 http://gerrit.cloudera.org:8080/#/c/17185/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17185/2//COMMIT_MSG@7 PS2, Line 7: IMPALA-10483: Support subqueries in Ranger masking policies > I think the code changes in the patch are straightforward. Regarding testi I think COMPUTE STATS should be blocked since it requires the ALTER privilege (same as the issue in IMPALA-10554). The target user can only SELECT from the table. Let's deal with such issues in IMPALA-10554 together. http://gerrit.cloudera.org:8080/#/c/17185/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test: http://gerrit.cloudera.org:8080/#/c/17185/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@167 PS2, Line 167: INT,BOOLEAN,STRING > A few questions/comments: The row filter can contain any expressions as long as they are syntactically and semantically correct. Would you want more complex row filters in tests? http://gerrit.cloudera.org:8080/#/c/17185/2/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/17185/2/tests/authorization/test_ranger.py@1232 PS2, Line 1232: admin_client.execute("grant select on database tpch to user %s" % user) > In this row filter would a correlation condition contained entirely within > the row filter be ok ? e.g ..select n_nationkey from nation n1 where n_name in (select n_name from nation n2 where n1.n_regionkey = n2.n_regionkey). Yeah, it should work. But this filter doesn't contain 'current_user()', so the policy will have the same effect for all users. Let me try to add a similar test. 
> could we also add a negative test where the correlation is to a table in the > parent query, not in the row filter itself. That one is expected to fail. > (Maybe you already have this test .. if so, feel free to ignore). Yeah, I think the tests in ranger_row_filtering.test about 'test_id' satisfy these. -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 18 Mar 2021 13:50:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17185 ) Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py@1270 PS3, Line 1270: flake8: E251 unexpected spaces around keyword / parameter equals -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 13:50:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Hello Vihang Karajgaonkar, Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17081 to look at the new patch set (#6). Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables ALTER TABLE ADD PARTITION should bump the write id for ACID tables, both INSERT-only and full ACID. For transactional tables we add partitions in an ACID transaction in the following sequence: 1. open transaction 2. allocate write id for table 3. add partitions to HMS table 4. commit transaction However, please note that table metadata modifications are independent of ACID transactions. I.e. if adding the partitions succeeds but we cannot commit the transaction, then the newly added partitions won't get removed. So why are we opening a txn then? We are doing it in order to bump the write id in a best-effort way. This aids table metadata caching: by looking at the table write id we can determine whether the cached table metadata is up-to-date. 
Testing: * added e2e test Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test M tests/query_test/test_acid.py 7 files changed, 133 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17081/6 -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy
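The four-step sequence in the commit message above can be sketched in Python as follows. FakeMetastore is a hypothetical in-memory stand-in, not a real HMS client API, and returning False on a failed commit is an assumption made for illustration; the actual patch may surface the error differently:

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for HMS; not a real client API."""

    def __init__(self, fail_commit=False):
        self.partitions = []
        self.write_id = 0
        self.fail_commit = fail_commit

    def open_txn(self):
        return 1

    def allocate_write_id(self, txn_id):
        return self.write_id + 1

    def add_partitions(self, parts):
        # Metadata changes take effect immediately, independent of the txn.
        self.partitions.extend(parts)

    def commit_txn(self, txn_id, write_id):
        if self.fail_commit:
            raise RuntimeError("commit failed")
        self.write_id = write_id


def add_partitions_acid(hms, parts):
    """Add partitions and bump the table write id in a best-effort way."""
    txn_id = hms.open_txn()                    # 1. open transaction
    write_id = hms.allocate_write_id(txn_id)   # 2. allocate write id for table
    hms.add_partitions(parts)                  # 3. add partitions to HMS table
    try:
        hms.commit_txn(txn_id, write_id)       # 4. commit transaction
        return True
    except RuntimeError:
        # The partitions stay in place; only the write id bump is lost.
        return False
```

The failure path shows the point made in the message: a failed commit does not undo the added partitions, it only means the write id was not bumped.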
[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17185 to look at the new patch set (#3). Change subject: IMPALA-10483: Support subqueries in Ranger masking policies .. IMPALA-10483: Support subqueries in Ranger masking policies This patch adds support for using subqueries in Ranger masking policies, i.e. column-masking/row-filtering policies. The subquery can reference either the current table or other tables. However, masking policies on these tables won't be applied recursively. This is consistent with Hive. One motivation is to avoid infinite masking if the subquery references the same table. Another motivation, I think, is to simplify the masking behavior, so when the admin is setting a masking expression, it can be considered as running from the admin's perspective (i.e. no masking). Implementation Before analyzing the query, the coordinator loads the metadata of all possibly used tables into the query's StmtTableCache. Table masking takes place after the analysis phase. If the subquery filter introduces any new tables, the analyzer will fail to resolve them since their metadata is not loaded in the StmtTableCache. This patch modifies the StmtMetadataLoader to also load the tables introduced by masking policies, so that they can be resolved correctly. 
Tests - Add more complex tests in test_row_filtering Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f --- M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test M tests/authorization/test_ranger.py 8 files changed, 298 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/17185/3 -- To view, visit http://gerrit.cloudera.org:8080/17185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f Gerrit-Change-Number: 17185 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
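As a rough illustration of the loading rule described above, this Python sketch computes which tables need metadata when masking policies may reference other tables. The function name and the policy_tables mapping are hypothetical and are not the StmtMetadataLoader API; the one-level expansion is an assumption that follows from the statement that policies are not applied recursively:

```python
def tables_to_load(query_tables, policy_tables):
    """Return the set of tables whose metadata must be loaded.

    query_tables: tables named in the statement itself.
    policy_tables: table -> tables referenced by its masking policy subqueries.
    """
    to_load = set(query_tables)
    for table in query_tables:
        # One level only: policies on policy-referenced tables are not expanded,
        # because masking is not applied recursively.
        to_load.update(policy_tables.get(table, ()))
    return to_load
```

A policy that references its own table adds nothing new, which matches the goal of avoiding infinite masking.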
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8388/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 13:46:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 ) Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java File fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java: http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@359 PS1, Line 359: assertEventEquals("@column", "select", "functional/alltypestiny/date_string_col", 1, line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Mar 2021 13:27:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17199 Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view .. IMPALA-9661: Avoid introducing unused columns in table masking view Previously, if a table has column masking policies, we replace its unanalyzed TableRef with an analyzed InlineViewRef (table masking view) in FromClause.analyze(). However, we can't detect which columns are actually used in the original query at this point. In fact, analyze() for SelectList, WhereClause, GroupByClause and other clauses containing SlotRefs happens after FromClause.analyze(). After the whole query block is analyzed, we can get the exact set of required columns. This patch refactors the code to do table masking after analyze() to avoid introducing unused columns. Referenced columns of a TableRef are registered in analyze(), which helps to figure out which columns are actually needed. Tests: - Run column masking and row filtering tests in test_ranger.py - Run FE audit tests Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 --- M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M 
fe/src/main/java/org/apache/impala/analysis/StmtNode.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/WithClause.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java 20 files changed, 267 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/1 -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang
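The effect of masking after analysis can be pictured with a small Python sketch that builds the masking view's SELECT text from only the columns registered as referenced. The function name, its arguments and the mask(...) expression are hypothetical illustrations, not Impala's TableMask or InlineViewRef API:

```python
def build_masking_view(table, all_columns, masks, referenced):
    """Return SELECT text for a masking view limited to referenced columns.

    masks: column -> masking expression (columns without a policy pass through).
    referenced: columns registered as used while analyzing the query block.
    """
    items = []
    for col in all_columns:
        if col not in referenced:
            continue  # unused columns are not introduced into the view
        expr = masks.get(col)
        items.append(f"{expr} AS {col}" if expr else col)
    return f"SELECT {', '.join(items)} FROM {table}"
```

A query that references no columns at all, such as a bare count(*), would need separate handling that this sketch ignores.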
[Impala-ASF-CR] Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts"
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17191 ) Change subject: Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts" .. Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts" This reverts commit c60a626ac66cc7cf24080b7ea84166c70bad9b22. Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199 Reviewed-on: http://gerrit.cloudera.org:8080/17191 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/test/resources/hive-site.xml.py M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test 2 files changed, 137 insertions(+), 138 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199 Gerrit-Change-Number: 17191 Gerrit-PatchSet: 3 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17191 ) Change subject: Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts" .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199 Gerrit-Change-Number: 17191 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 18 Mar 2021 07:24:30 + Gerrit-HasComments: No