[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 3:

(7 comments)

Thanks for working on this. Sending comments based on first pass.

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@491
PS3, Line 491: no user-visible effect
Is 'no user-visible effect' accurate, though? The plan in the query profile 
will show the effects of the table masking, won't it?


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@541
PS3, Line 541: reAnalyzeWithoutPrivChecks(stmtTableCache, authzCtx, 
origResultTypes, origColLabels);
Shouldn't this only be called if reAnalyze = true? I am also curious why this 
is being called again (line 506 has the previous invocation).


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@860
PS3, Line 860: column %s
This says 'column', but it seems the first parameter is resolvedTableRef's raw 
path?


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1326
PS3, Line 1326: names
nit: this is registering a Column object rather than just the name.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1340
PS3, Line 1340:   analyzer = analyzer.getParentAnalyzer();
Since getParentAnalyzer() can return null, we should break out of the loop if 
that happens.
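
A minimal sketch of the null check suggested above. The Analyzer class here is
a hypothetical stand-in that models only the parent link; it is not Impala's
actual class, and the real loop at line 1340 may need a different exit
condition.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentAnalyzerWalk {
  // Hypothetical stand-in for an analyzer; only the parent link is modeled.
  static final class Analyzer {
    private final String name;
    private final Analyzer parent;

    Analyzer(String name, Analyzer parent) {
      this.name = name;
      this.parent = parent;
    }

    // Returns null for the root analyzer.
    Analyzer getParentAnalyzer() { return parent; }

    String getName() { return name; }
  }

  // Walks from 'start' up to the root. The loop condition doubles as the
  // null check, so the walk stops once getParentAnalyzer() returns null.
  static List<String> ancestorsOf(Analyzer start) {
    List<String> visited = new ArrayList<>();
    Analyzer analyzer = start;
    while (analyzer != null) {
      visited.add(analyzer.getName());
      analyzer = analyzer.getParentAnalyzer();
    }
    return visited;
  }

  public static void main(String[] args) {
    Analyzer root = new Analyzer("root", null);
    Analyzer child = new Analyzer("child", root);
    Analyzer grandchild = new Analyzer("grandchild", child);
    System.out.println(ancestorsOf(grandchild));  // prints [grandchild, child, root]
  }
}
```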


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@171
PS3, Line 171:   public boolean resolveTableMask(Analyzer analyzer) throws 
AnalysisException {
Since this is not looking into the orderByElements_, I am wondering if it will 
miss the ORDER BY slot refs for table masking.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@268
PS3, Line 268:   public boolean resolveTableMask(Analyzer analyzer) throws 
AnalysisException {
Shouldn't this also consider the groupByClause_, havingClause_, aggregate and 
analyticInfo? Are the columns referenced by those getting captured in some 
other manner?



--
To view, visit http://gerrit.cloudera.org:8080/17199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Mar 2021 06:31:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8401/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Mar 2021 03:00:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 3:

Added more audit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Mar 2021 02:41:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Quanlong Huang (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17199

to look at the new patch set (#3).

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..

IMPALA-9661: Avoid introducing unused columns in table masking view

Previously, if a table had column masking policies, we replaced its
unanalyzed TableRef with an analyzed InlineViewRef (table masking view)
in FromClause.analyze(). However, we can't detect which columns are
actually used in the original query at that point, because analyze()
for SelectList, WhereClause, GroupByClause and other clauses containing
SlotRefs happens after FromClause.analyze(). Only after the whole query
block is analyzed can we get the exact set of required columns.

This patch refactors the code to do table masking after analyze() to
avoid introducing unused columns. Referenced columns of a TableRef are
registered in analyze(), which helps to figure out which columns are
actually needed.

With this, we don't need to revert table masking in FromClause.reset().
The doTableMasking flag in the AST is also removed, since the table mask
is now resolved once after analyze().
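
The idea described above can be sketched roughly as follows. This is an
illustration only, not the patch's code: the class MaskingViewSketch, the
methods registerColumnRef() and buildMaskingView(), and the mask(...)
expressions are all hypothetical names chosen for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

public class MaskingViewSketch {
  // Hypothetical registry: table name -> columns referenced during analysis.
  private final Map<String, Set<String>> referencedCols = new LinkedHashMap<>();

  // Called whenever a column reference is resolved, in whichever clause it
  // appears (select list, WHERE, GROUP BY, ...). Duplicates are ignored.
  void registerColumnRef(String table, String column) {
    referencedCols.computeIfAbsent(table, t -> new LinkedHashSet<>()).add(column);
  }

  // Builds the masking view's SELECT from the registered columns only.
  // 'masks' maps a column to its masking expression; columns without a
  // policy pass through unchanged. The case where no column is referenced
  // (e.g. count(*)) is deliberately not handled in this sketch.
  String buildMaskingView(String table, Map<String, String> masks) {
    Set<String> cols =
        referencedCols.getOrDefault(table, Collections.<String>emptySet());
    StringJoiner items = new StringJoiner(", ");
    for (String col : cols) {
      String maskExpr = masks.get(col);
      items.add(maskExpr == null ? col : maskExpr + " AS " + col);
    }
    return "SELECT " + items + " FROM " + table;
  }

  public static void main(String[] args) {
    MaskingViewSketch sketch = new MaskingViewSketch();
    String tbl = "functional.alltypestiny";
    sketch.registerColumnRef(tbl, "id");          // from the select list
    sketch.registerColumnRef(tbl, "string_col");  // from the select list
    sketch.registerColumnRef(tbl, "string_col");  // again, from WHERE

    Map<String, String> masks = new HashMap<>();
    masks.put("string_col", "mask(string_col)");            // placeholder policy
    masks.put("date_string_col", "mask(date_string_col)");  // unused column

    // prints: SELECT id, mask(string_col) AS string_col FROM functional.alltypestiny
    System.out.println(sketch.buildMaskingView(tbl, masks));
  }
}
```

Because date_string_col is never registered, its masking expression is not
added to the view even though a policy exists for it.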

Tests:
 - Run column masking and row filtering tests in test_ranger.py
 - Run FE audit tests

Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtNode.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
20 files changed, 335 insertions(+), 226 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8400/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Mar 2021 02:20:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8399/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 19 Mar 2021 02:12:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8398/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 19 Mar 2021 02:03:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 2:

(1 comment)

Fixed a test failure in AnalyzeStmtsTest and slightly refactored some code.

http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@359
PS1, Line 359: assertEventEquals("@column", "select", 
"functional/alltypestiny/date_string_col",
> line too long (92 > 90)
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 19 Mar 2021 02:00:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Quanlong Huang (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17199

to look at the new patch set (#2).

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..

IMPALA-9661: Avoid introducing unused columns in table masking view

Previously, if a table had column masking policies, we replaced its
unanalyzed TableRef with an analyzed InlineViewRef (table masking view)
in FromClause.analyze(). However, we can't detect which columns are
actually used in the original query at that point, because analyze()
for SelectList, WhereClause, GroupByClause and other clauses containing
SlotRefs happens after FromClause.analyze(). Only after the whole query
block is analyzed can we get the exact set of required columns.

This patch refactors the code to do table masking after analyze() to
avoid introducing unused columns. Referenced columns of a TableRef are
registered in analyze(), which helps to figure out which columns are
actually needed.

With this, we don't need to revert table masking in FromClause.reset().
The doTableMasking flag in the AST is also removed, since the table mask
is now resolved once after analyze().

Tests:
 - Run column masking and row filtering tests in test_ranger.py
 - Run FE audit tests

Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtNode.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
20 files changed, 270 insertions(+), 226 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..

IMPALA-10564: Return error when inserting an invalid decimal value

When using CTAS or INSERT-SELECT statements to insert rows into a
table with decimal columns, Impala inserts NULL for overflowed decimal
values instead of returning an error. This happens when the data
expression for the decimal column in the SELECT sub-query contains at
least one alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expressions for the decimal columns are constants,
so that the overflowed decimal values could be detected by the frontend
during expression analysis. If there is an alias (variable) in the data
expression for the decimal column, only the backend can detect the
decimal overflow.

This patch adds a query option use_null_for_decimal_errors. When it
is disabled, the backend checks the query status of RuntimeState in
the table writer when ScalarExprEvaluator returns NULL for a decimal
column. If there is an invalid decimal error, the query fails without
inserting NULL for the decimal column. If use_null_for_decimal_errors
is enabled, NULL is inserted into the table for an invalid decimal
value. The behaviour for decimal_v1 is unchanged: NULL is inserted
into the table for invalid decimal values, with a warning message.
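
The two behaviours of the option can be illustrated with a small sketch. This
is not the backend code (which is C++): the class DecimalInsertSketch and the
method valueToInsert() are hypothetical, java.math.BigDecimal stands in for
Impala's decimal type, and HALF_UP rounding is an assumption made for the
example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalInsertSketch {
  // Returns the value to write into a DECIMAL(precision, scale) column, or
  // null to represent SQL NULL. If the value overflows the column type:
  //  - useNullForDecimalErrors == true:  return null (NULL is inserted)
  //  - useNullForDecimalErrors == false: throw, so the statement fails
  static BigDecimal valueToInsert(BigDecimal value, int precision, int scale,
      boolean useNullForDecimalErrors) {
    // Assumed rounding mode; the real rounding rules may differ.
    BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
    boolean overflow = scaled.precision() > precision;
    if (!overflow) return scaled;
    if (useNullForDecimalErrors) return null;
    throw new ArithmeticException("Invalid decimal value: " + value
        + " does not fit DECIMAL(" + precision + "," + scale + ")");
  }

  public static void main(String[] args) {
    // Fits DECIMAL(5,2) after rounding.
    System.out.println(valueToInsert(new BigDecimal("123.456"), 5, 2, false));
    // Overflows DECIMAL(5,2); option enabled, so NULL is written.
    System.out.println(valueToInsert(new BigDecimal("12345.678"), 5, 2, true));
    // Overflows DECIMAL(5,2); option disabled, so the insert fails.
    try {
      valueToInsert(new BigDecimal("12345.678"), 5, 2, false);
    } catch (ArithmeticException e) {
      System.out.println("error: " + e.getMessage());
    }
  }
}
```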

Tests:
 - Manually ran queries with overflowed decimal values using CTAS
   and INSERT-SELECT statements. Verified that queries failed without
   inserting NULL, as expected, if use_null_for_decimal_errors was set
   to false, and that NULLs were inserted into the table for overflowed
   decimals if use_null_for_decimal_errors was set to true.
 - Manually ran queries with overflowed decimal values and decimal_v2
   set to false. The result is the same as before: NULLs were inserted
   into the table for invalid decimal values, with a warning message.
 - Added unit-tests for INSERT-SELECT and CTAS.
 - Passed core tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
---
M be/src/common/status.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M common/thrift/generate_error_codes.py
A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test
M tests/query_test/test_decimal_queries.py
16 files changed, 285 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/7
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..

IMPALA-10564: Return error when inserting an invalid decimal value

When using CTAS or INSERT-SELECT statements to insert rows into a
table with decimal columns, Impala inserts NULL for overflowed decimal
values instead of returning an error. This happens when the data
expression for the decimal column in the SELECT sub-query contains at
least one alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expressions for the decimal columns are constants,
so that the overflowed decimal values could be detected by the frontend
during expression analysis. If there is an alias (variable) in the data
expression for the decimal column, only the backend can detect the
decimal overflow.

This patch adds a query option use_null_for_decimal_errors. When it
is disabled, the backend checks the query status of RuntimeState in
the table writer when ScalarExprEvaluator returns NULL for a decimal
column. If there is an invalid decimal error, the query fails without
inserting NULL for the decimal column. If use_null_for_decimal_errors
is enabled, NULL is inserted into the table for an invalid decimal
value. The behaviour for decimal_v1 is unchanged: NULL is inserted
into the table for invalid decimal values, with a warning message.

Tests:
 - Manually ran queries with overflowed decimal values using CTAS
   and INSERT-SELECT statements. Verified that queries failed without
   inserting NULL, as expected, if use_null_for_decimal_errors was set
   to false, and that NULLs were inserted into the table for overflowed
   decimals if use_null_for_decimal_errors was set to true.
 - Manually ran queries with overflowed decimal values and decimal_v2
   set to false. The result is the same as before: NULLs were inserted
   into the table for invalid decimal values, with a warning message.
 - Added unit-tests for INSERT-SELECT and CTAS.
 - Passed core tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
---
M be/src/common/status.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M common/thrift/generate_error_codes.py
A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test
M tests/query_test/test_decimal_queries.py
16 files changed, 285 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/6
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 5: Code-Review+2

LGTM. Let me know if you want anyone else to review.  Otherwise, I can start a 
gerrit-verify run.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 19 Mar 2021 00:56:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8397/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 19 Mar 2021 00:33:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..

IMPALA-10564: Return error when inserting an invalid decimal value

When using CTAS or INSERT-SELECT statements to insert rows into a
table with decimal columns, Impala inserts NULL for overflowed decimal
values instead of returning an error. This happens when the data
expression for the decimal column in the SELECT sub-query contains at
least one alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expressions for the decimal columns are constants,
so that the overflowed decimal values could be detected by the frontend
during expression analysis. If there is an alias (variable) in the data
expression for the decimal column, only the backend can detect the
decimal overflow.

This patch adds a query option use_null_for_decimal_errors. When it
is disabled, the backend checks the query status of RuntimeState in
the table writer when ScalarExprEvaluator returns NULL for a decimal
column. If there is an invalid decimal error, the query fails without
inserting NULL for the decimal column. If use_null_for_decimal_errors
is enabled, NULL is inserted into the table for an invalid decimal
value. The behaviour for decimal_v1 is unchanged: NULL is inserted
into the table for invalid decimal values, with a warning message.

Tests:
 - Manually ran queries with overflowed decimal values using CTAS
   and INSERT-SELECT statements. Verified that queries failed without
   inserting NULL, as expected, if use_null_for_decimal_errors was set
   to false, and that NULLs were inserted into the table for overflowed
   decimals if use_null_for_decimal_errors was set to true.
 - Manually ran queries with overflowed decimal values and decimal_v2
   set to false. The result is the same as before: NULLs were inserted
   into the table for invalid decimal values, with a warning message.
 - Added unit-tests for INSERT-SELECT and CTAS.
 - Passed core tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
---
M be/src/common/status.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M common/thrift/generate_error_codes.py
A testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test
M tests/query_test/test_decimal_queries.py
16 files changed, 283 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/5
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17185 )

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8396/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Mar 2021 23:26:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 4: Code-Review+1

(7 comments)

Thanks for making the changes. Mostly nits, plus one comment about the test.

http://gerrit.cloudera.org:8080/#/c/17168/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17168/1//COMMIT_MSG@7
PS1, Line 7: lid decima
> Done
Done


http://gerrit.cloudera.org:8080/#/c/17168/1/be/src/exec/hdfs-text-table-writer.cc
File be/src/exec/hdfs-text-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/17168/1/be/src/exec/hdfs-text-table-writer.cc@107
PS1, Line 107: invalid de
> Done
Done


http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/hdfs-text-table-writer.cc
File be/src/exec/hdfs-text-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/hdfs-text-table-writer.cc@107
PS4, Line 107:   // IMPALA-10564: For invalid decimal value, we should 
return an error
nit: the comment suggests this is done unconditionally. Could you update it, 
as in the other places, to say that it depends on the query option?


http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/kudu-table-sink.cc@253
PS4, Line 253: // IMPALA-10564: For invalid decimal value, we should 
return an error
nit: same comment as above.


http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/17168/4/be/src/exec/parquet/hdfs-parquet-table-writer.cc@698
PS4, Line 698:   // IMPALA-10564: For invalid decimal value, we should 
return an error
nit: same as above


http://gerrit.cloudera.org:8080/#/c/17168/4/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

http://gerrit.cloudera.org:8080/#/c/17168/4/common/thrift/generate_error_codes.py@475
PS4, Line 475: type
'value' would be more accurate than 'type' to avoid confusion with decimal type 
precision and scale.


http://gerrit.cloudera.org:8080/#/c/17168/4/testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test
File 
testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test:

http://gerrit.cloudera.org:8080/#/c/17168/4/testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test@43
PS4, Line 43:  RESULTS
This verifies that a row was inserted, but is there a way to check that the 
inserted value is NULL? I was thinking of something like 'select count(*) from 
overflowed_decimal_tbl where  is null'. Although, since you are using 
the same table for the insertions, the count will keep growing.



--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 18 Mar 2021 23:18:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17185 )

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..


Patch Set 3:

(1 comment)

> Patch Set 3:
>
> (1 comment)

Forgot some changes in last commit. Uploaded them in PS4.

http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py@1270
PS3, Line 1270:
> flake8: E251 unexpected spaces around keyword / parameter equals
Done



--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Mar 2021 23:06:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Quanlong Huang (Code Review)
Hello Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17185

to look at the new patch set (#4).

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..

IMPALA-10483: Support subqueries in Ranger masking policies

This patch adds support for using subqueries in Ranger masking policies,
i.e. column-masking/row-filtering policies. The subquery can reference
either the current table or other tables. However, masking policies on
these tables won't be applied recursively. This is consistent with Hive.
One motivation is to avoid infinite masking when the subquery references
the same table. Another motivation, I think, is to simplify the masking
behavior: when the admin sets a masking expression, it can be considered
as running from the admin's perspective (i.e. no masking).

Implementation
Before analyzing the query, the coordinator loads the metadata of all
possibly used tables into the query's StmtTableCache. Table masking
takes place after the analyzing phase. If the subquery filter introduces
any new tables, the analyzer will fail to resolve them since their
metadata is not loaded in the StmtTableCache. This patch modifies the
StmtMetadataLoader to also load the tables introduced by masking
policies, so that they can be resolved correctly.
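
The loading rule described above can be sketched roughly as follows. This is
only an illustration, not the StmtMetadataLoader code: the function name and
the policy_tables map (table name to the tables its masking policies
reference) are hypothetical, and it assumes that only one level of policies
needs to be followed because masking is not applied recursively.

```python
from typing import Dict, List, Set


def tables_to_load(query_tables: List[str],
                   policy_tables: Dict[str, List[str]]) -> Set[str]:
    """Return the tables whose metadata should be loaded before analysis.

    query_tables: tables referenced directly by the query.
    policy_tables: hypothetical map from a table to the tables referenced
        by subqueries in its masking policies.
    """
    required = set(query_tables)
    for table in query_tables:
        # Only one level is expanded: policies on the tables that a policy
        # references are not applied, so they are not followed here.
        required.update(policy_tables.get(table, []))
    return required


policies = {
    "sales": ["sales", "regions"],  # filter references itself and another table
    "regions": ["countries"],       # not followed: no recursive masking
}
print(sorted(tables_to_load(["sales"], policies)))
```

In this sketch "countries" is never loaded, because the policy on "regions"
is not applied when "regions" is only reached through another policy.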

Tests
 - Add more complex tests in test_row_filtering

Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
---
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
8 files changed, 295 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/17185/4
--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8395/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 18 Mar 2021 22:17:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10564: Return error when inserting an invalid decimal value

2021-03-18 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
..

IMPALA-10564: Return error when inserting an invalid decimal value

When CTAS or INSERT-SELECT statements insert rows into a table with
decimal columns, Impala inserts NULL for overflowed decimal values
instead of returning an error. This happens when the data expression
for the decimal column in the SELECT sub-query contains at least one
alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expression for the decimal column is a constant,
so that the frontend can detect the overflow during expression
analysis. If the data expression contains an alias (variable), only
the backend can detect the decimal overflow.

This patch adds a query option, use_null_for_decimal_errors. When it
is disabled, the backend checks the query status of RuntimeState in
the table writer when ScalarExprEvaluator returns NULL for a decimal
column. If there is an invalid decimal error, the query fails without
inserting NULL for the decimal column. If use_null_for_decimal_errors
is enabled, NULL is inserted into the table for invalid decimal values.
The behaviour for decimal_v1 is unchanged: NULL is inserted into the
table for invalid decimal values, with a warning message.
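
The resulting behaviour can be summarised as a small decision function. This
is a sketch of the rules stated above, not the backend code; the Action enum
and the function name are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    INSERT_NULL = "insert_null"
    FAIL_QUERY = "fail_query"


def on_invalid_decimal(decimal_v2: bool,
                       use_null_for_decimal_errors: bool) -> Action:
    """Decide what a table writer does with an invalid decimal value."""
    if not decimal_v2:
        # decimal_v1 is unchanged: NULL is inserted (with a warning message).
        return Action.INSERT_NULL
    if use_null_for_decimal_errors:
        # Option enabled: NULL is inserted for the invalid value.
        return Action.INSERT_NULL
    # Option disabled: the query fails instead of inserting NULL.
    return Action.FAIL_QUERY


print(on_invalid_decimal(decimal_v2=True, use_null_for_decimal_errors=False))
```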

Tests:
 - Manually ran queries with overflowed decimal values by using CTAS
   and INSERT-SELECT statements. Verified that queries failed without
   inserting NULL as expected if use_null_for_decimal_errors was set
   as false, and NULLs were inserted into the table for overflowed
   decimal if use_null_for_decimal_errors was set as true.
 - Manually ran queries with overflowed decimal values and decimal_v2
   set as false. The result is same as before - NULLs were inserted
   to table for invalid decimal values with warning message.
 - Added unit-tests for INSERT-SELECT and CTAS.
 - Passed core tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
---
M be/src/common/status.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M common/thrift/generate_error_codes.py
A 
testdata/workloads/functional-query/queries/QueryTest/decimal-insert-overflow-exprs.test
M tests/query_test/test_decimal_queries.py
16 files changed, 272 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17168/4
--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10552: Support external frontends supplying timeline for profile

2021-03-18 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17183 )

Change subject: IMPALA-10552: Support external frontends supplying timeline for 
profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17183/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/17183/1/be/src/service/impala-server.cc@1212
PS1, Line 1212: 
(*query_handle)->set_user_profile_access(result.user_has_profile_access);
I think this line got duplicated, maybe a merge conflict resolution error?



-- 
To view, visit http://gerrit.cloudera.org:8080/17183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b3692b4118ea23c0f9f8ec4bcc27b0b68bb32ec
Gerrit-Change-Number: 17183
Gerrit-PatchSet: 1
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 18 Mar 2021 21:46:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added so that this experimental feature can be disabled. It defaults to
true, so the feature is enabled by default. Enabling row filtering
requires --enable_column_masking=true since it depends on the column
masking implementation.

Note that row filtering policies take effect before any column masking
policies, because column masking policies apply to result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.
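
As a rough illustration of the rewrite described above, the sketch below
builds the SQL text of such a masking view. It is not the InlineViewRef
implementation: build_masking_view, its parameters, and the mask(name)
expression are assumptions chosen for the example, and the real code works
on analyzed statement objects rather than strings.

```python
from typing import Dict, List, Optional


def build_masking_view(table: str, columns: List[str],
                       column_masks: Dict[str, str],
                       row_filter: Optional[str] = None) -> str:
    """Build the SELECT that would stand in for a direct reference to `table`."""
    select_items = []
    for col in columns:
        mask_expr = column_masks.get(col)
        # Masked columns are wrapped by their masking expression;
        # unmasked columns pass through unchanged.
        select_items.append(f"{mask_expr} AS {col}" if mask_expr else col)
    sql = f"SELECT {', '.join(select_items)} FROM {table}"
    if row_filter:
        # Row filtering expressions go into the WHERE clause of the same SELECT.
        sql += f" WHERE {row_filter}"
    return sql


view_sql = build_masking_view(
    "db.users", ["id", "name"], {"name": "mask(name)"}, row_filter="id > 10")
print(f"SELECT * FROM ({view_sql}) users")
```

Because the WHERE clause is evaluated against the underlying table columns,
the row filter in this sketch sees unmasked values, which matches the note
that row filtering takes effect before column masking.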

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test where both column masking and row filtering policies
   take effect.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
Reviewed-on: http://gerrit.cloudera.org:8080/16976
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 1,005 insertions(+), 113 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Mar 2021 21:08:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters

2021-03-18 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17075 )

Change subject: IMPALA-10494: Making use of the min/max column stats to improve 
min/max filters
..


Patch Set 22:

Aman asked whether min/max filters are applied in the context of outer joins. 
The answer is no. Please refer to TPCDS q49.


--
To view, visit http://gerrit.cloudera.org:8080/17075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
Gerrit-Change-Number: 17075
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 20:27:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17075 )

Change subject: IMPALA-10494: Making use of the min/max column stats to improve 
min/max filters
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8394/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
Gerrit-Change-Number: 17075
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 20:17:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters

2021-03-18 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/17075 )

Change subject: IMPALA-10494: Making use of the min/max column stats to improve 
min/max filters
..

IMPALA-10494: Making use of the min/max column stats to improve min/max filters

This patch adds the functionality to compute the minimum and maximum
values of integer, float and double columns in Parquet tables, and to
use the new stats to discard min/max filters, in both hash join
builders and Parquet scanners, whose coverage is too close to the
actual range defined by the column min and max.
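
One way to picture the "coverage too close to the actual range" check is an
overlap ratio between the filter range and the column range. The sketch below
is an assumption about the general idea, not the patch's code: the function
names and the 0.9 threshold are hypothetical.

```python
def overlap_ratio(filter_min, filter_max, col_min, col_max):
    """Fraction of the column's [col_min, col_max] range covered by the filter."""
    lo = max(filter_min, col_min)
    hi = min(filter_max, col_max)
    if hi < lo:
        return 0.0  # filter and column ranges do not intersect
    if col_max == col_min:
        return 1.0  # single-valued column that the filter includes
    return (hi - lo) / (col_max - col_min)


def should_discard(filter_min, filter_max, col_min, col_max, threshold=0.9):
    """True when the filter covers so much of the column range that it is
    unlikely to prune anything."""
    return overlap_ratio(filter_min, filter_max, col_min, col_max) >= threshold


# Column range 28800..75599 is the ss_sold_time_sk min/max shown in this
# commit message's example output.
print(should_discard(28800, 75599, 28800, 75599))  # filter spans the whole column
print(should_discard(30000, 31000, 28800, 75599))  # narrow filter
```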

The computation and display of the new column min/max stats are done
for Parquet tables only and are controlled by two new Boolean query
options (both default to false):
  1. compute_column_minmax_stats
  2. show_column_minmax_stats

Usage examples.

  set compute_column_minmax_stats=true;
  compute stats tpcds_parquet.store_sales;

  set show_column_minmax_stats=true;
  show column stats tpcds_parquet.store_sales;

+-----------------------+--------------+-...-------+---------+---------+
| Column                | Type         |   #Falses | Min     | Max     |
+-----------------------+--------------+-...-------+---------+---------+
| ss_sold_time_sk       | INT          |   -1      | 28800   | 75599   |
| ss_item_sk            | BIGINT       |   -1      | 1       | 18000   |
| ss_customer_sk        | INT          |   -1      | 1       | 10      |
| ss_cdemo_sk           | INT          |   -1      | 15      | 1920797 |
| ss_hdemo_sk           | INT          |   -1      | 1       | 7200    |
| ss_addr_sk            | INT          |   -1      | 1       | 5       |
| ss_store_sk           | INT          |   -1      | 1       | 10      |
| ss_promo_sk           | INT          |   -1      | 1       | 300     |
| ss_ticket_number      | BIGINT       |   -1      | 1       | 24      |
| ss_quantity           | INT          |   -1      | 1       | 100     |
| ss_wholesale_cost     | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_list_price         | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_sales_price        | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_ext_discount_amt   | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_ext_sales_price    | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_ext_wholesale_cost | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_ext_list_price     | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_ext_tax            | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_coupon_amt         | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_net_paid           | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_net_paid_inc_tax   | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_net_profit         | DECIMAL(7,2) |   -1      | -1      | -1      |
| ss_sold_date_sk       | INT          |   -1      | 2450816 | 2452642 |
+-----------------------+--------------+-...-------+---------+---------+

Only the min/max values for non-partition columns are stored in HMS.
The min/max values for partition columns are computed in coordinator.

The min-max filters, in C++ class or protobuf form, are augmented to
deal with the always true state better. Once always true is set, the
actual min and max values in the filter are no longer populated.
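
The always true handling can be sketched as below. This is a simplified
Python model, not the C++ MinMaxFilter class: the field names, or_() and
to_dict() (standing in for Or() and ToProtobuf()) are illustrative
assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MinMaxFilter:
    min_val: Optional[int] = None
    max_val: Optional[int] = None
    always_true: bool = False

    def set_always_true(self) -> None:
        # Once always true is set, the bounds are dropped and never refilled.
        self.always_true = True
        self.min_val = None
        self.max_val = None

    def insert(self, value: int) -> None:
        if self.always_true:
            return  # bounds are no longer populated
        self.min_val = value if self.min_val is None else min(self.min_val, value)
        self.max_val = value if self.max_val is None else max(self.max_val, value)

    def or_(self, other: "MinMaxFilter") -> None:
        """Merge another filter into this one."""
        if self.always_true or other.always_true:
            self.set_always_true()
            return
        if other.min_val is not None:
            self.insert(other.min_val)
            self.insert(other.max_val)

    def to_dict(self) -> dict:
        """Serialized form; min and max are omitted when always true."""
        if self.always_true:
            return {"always_true": True}
        return {"always_true": False, "min": self.min_val, "max": self.max_val}


f = MinMaxFilter()
f.insert(5)
f.insert(1)
g = MinMaxFilter()
g.set_always_true()
f.or_(g)
print(f.to_dict())
```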

Testing:
 - Added new compute/show stats tests for integers, float and double
   column data types in compute-stats-column-minmax.test;
 - Added new tests in overlap_min_max_filters.test to demonstrate the
   usefulness of column stats to quickly disable useless filters in
   both hash join builder and Parquet scanner;
 - Added tests in min-max-filter-test.cc to demonstrate that the Or()
   and ToProtobuf() methods and the constructor handle the always true
   flag;
 - core tests.

TODO:
 1. Test compute stats for timestamp and date columns;
 2. Enable the feature for Iceberg tables with Parquet data files.

Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
---
M be/src/exec/catalog-op-executor.cc
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hdfs-scanner.h
M be/src/exec/incr-stats-util-test.cc
M be/src/exec/incr-stats-util.cc
M be/src/exec/incr-stats-util.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/service/hs2-util.cc
M be/src/service/hs2-util.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/CatalogObjects.thrift
M common/thrift/Frontend.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apac

[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1

2021-03-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..


Patch Set 17:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@7
PS17, Line 7: Part 1
Can you add some info about what is expected in later parts?


http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@26
PS17, Line 26: Testing:
It would be great to add some unit tests for ParquetBloomFilter, especially if 
there are paths that are not used in the EE test.


http://gerrit.cloudera.org:8080/#/c/17026/17//COMMIT_MSG@28
PS17, Line 28: Parquet Bloom filtering works for the supported types and 
that we do
Please mention that we use a Parquet file generated by some other tool. This 
info should be also added to 
https://github.com/apache/impala/blob/aeeff53e884a67ee7f5980654a1d394c6e3e34ac/testdata/data/README


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h
File be/src/exec/parquet/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h@529
PS17, Line 529: buffer_pool_client_
BufferPool::ClientHandle should be only used from a single thread:
https://github.com/apache/impala/blob/master/be/src/runtime/bufferpool/buffer-pool.h#L332

I think that can cause problems if there are multiple scanners for a single 
scan node.


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.h@801
PS17, Line 801:   static bool IsParquetBloomFilterSupported(parquet::Type::type 
parquet_type,
There may be better places for these functions, e.g. ParquetMetadataUtils, 
ParquetCommon, or probably a separate file for Parquet bloom filter related 
stuff.


http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h
File be/src/exec/parquet/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h@704
PS3, Line 704:   ///
It could be noted that this is read from metadata_range_.


http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.h@718
PS3, Line 718:   /// Decides how to divide stream_->reservation() between the 
columns. May increase
consistency: EvalDictionaryFilters uses skip_row_group for the same purpose.


http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17026/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@1108
PS3, Line 1108: nst string& fn_
A few DCHECKs would be nice, e.g. to ensure that metadata_range_ is filled.


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@843
PS17, Line 843:   continue;
This means that we will skip the row group without raising any counters. I 
think that we should process the row group if there are issues with the 
Bloom filter.


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1470
PS17, Line 1470: FindChildSlotRef
Is this needed for any supported types? e.g. char(N) in the example shouldn't 
be supported.


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1684
PS17, Line 1684:   const int8_t* const cast_value = reinterpret_cast<const int8_t*>(value);
   :   const int byte_len = 
ParquetPlainEncoder::Encode(*cast_value,
   :   -1 /* fixed_len_size */, storage->data());
   :   DCHECK_EQ(byte_len, output_len);
Create a template function for this code?


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1729
PS17, Line 1729: const int exp_size = 
ParquetPlainEncoder::ByteSize(*cast_value);
   : storage->resize(exp_size);
if this was moved out, then the same template function could be used as the one 
mentioned at line 1684


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1825
PS17, Line 1825: __isset.meta_data
We shouldn't need to check this here.


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1834
PS17, Line 1834: &header_size, &bloom_filter_header));
nit: too much indentation


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1842
PS17, Line 1842:   return Status(Substitute("Could not allocate buffer 
of $0 bytes for Parquet "
nit: too much indentation


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@1857
PS17, Line 1857: data_buffer.buffer() + data_alr

[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..

IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

ALTER TABLE ADD PARTITION should bump the write id for ACID tables,
both INSERT-only and full ACID tables.

For transactional tables we are adding partitions in an ACID
transaction in the following sequence:

1. open transaction
2. allocate write id for table
3. add partitions to HMS table
4. commit transaction

However, please note that table metadata modifications are
independent of ACID transactions. I.e. if adding the partitions
succeeds but we cannot commit the transaction, then the newly added
partitions won't get removed.

So why are we opening a txn then? We are doing it in order to bump
the write id in a best-effort way. This aids table metadata caching,
so by looking at the table write id we can determine if the cached
table metadata is up-to-date.
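
The sequence above, and the point that a failed commit does not undo the
added partitions, can be sketched with an in-memory stand-in. FakeMetastore
and add_partitions_acid are hypothetical names for illustration only; they
are not the HMS client API or the CatalogOpExecutor code.

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for a metastore client."""

    def __init__(self, fail_commit=False):
        self.fail_commit = fail_commit
        self.partitions = []
        self.last_write_id = 0
        self.calls = []

    def open_txn(self):
        self.calls.append("open_txn")
        return 1

    def allocate_write_id(self, txn_id, table):
        self.calls.append("allocate_write_id")
        self.last_write_id += 1
        return self.last_write_id

    def add_partitions(self, table, partitions):
        self.calls.append("add_partitions")
        self.partitions.extend(partitions)

    def commit_txn(self, txn_id):
        self.calls.append("commit_txn")
        if self.fail_commit:
            raise RuntimeError("commit failed")


def add_partitions_acid(client, table, partitions):
    txn_id = client.open_txn()                # 1. open transaction
    client.allocate_write_id(txn_id, table)   # 2. allocate write id for table
    client.add_partitions(table, partitions)  # 3. add partitions to HMS table
    client.commit_txn(txn_id)                 # 4. commit transaction


ms = FakeMetastore(fail_commit=True)
try:
    add_partitions_acid(ms, "db.t", ["p=1"])
except RuntimeError:
    pass
# The partition is still there: the metadata change is not rolled back.
print(ms.partitions)
```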

Testing:
 * added e2e test

Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Reviewed-on: http://gerrit.cloudera.org:8080/17081
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M tests/query_test/test_acid.py
7 files changed, 133 insertions(+), 41 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 19:35:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10590: Introduce admission service heartbeat mechanism

2021-03-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17194 )

Change subject: IMPALA-10590: Introduce admission service heartbeat mechanism
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-control-service.cc
File be/src/scheduling/admission-control-service.cc:

http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-control-service.cc@265
PS1, Line 265: AdmissionHeartbeat
What happens when a coord dies? Does the AC service detect that and remove all 
queries for that host?
If yes, we should also remove the entry for that coord in running_queries_; 
otherwise we might end up with empty map entries whenever a coord restarts.
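
A rough sketch of the cleanup being asked for, assuming running_queries_ is
a map from host id to a map keyed by query id. The class and method names
below are hypothetical and are not taken from the patch; the point is only
that dropping the whole host entry avoids leaving an empty inner map behind.

```python
class RunningQueries:
    """Hypothetical stand-in for running_queries_: host id -> {query id -> info}."""

    def __init__(self):
        self._by_host = {}

    def admit(self, host_id, query_id, info):
        # Create the per-host map lazily on first admission.
        self._by_host.setdefault(host_id, {})[query_id] = info

    def release(self, host_id, query_id):
        """Return False if there is nothing to release (e.g. already released)."""
        queries = self._by_host.get(host_id)
        if queries is None or query_id not in queries:
            return False
        del queries[query_id]
        return True

    def remove_host(self, host_id):
        """Drop the whole host entry once the coordinator is known to be gone.

        Returns the query ids that were still registered for that host.
        """
        return sorted(self._by_host.pop(host_id, {}))

    def hosts(self):
        return sorted(self._by_host)
```

With this shape, a restart of a coord leaves no empty entry behind as long
as remove_host() is called when the failure is detected.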


http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.h@835
PS1, Line 835: Map from host id to maps from
nit: Map from host id to a map of query id


http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1361
PS1, Line 1361: / In the context of the admission control service, this may happen, eg. if a
  :   // ReleaseQuery rpc is reported as failed to the coordinator but actually ends up
  :   // arriving much later, so only log at WARNING level.
Can you document the remote client's behavior for failed RPCs in its class 
comment?


http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1364
PS1, Line 1364:   LOG(WARNING) << "Unable to find resources to release for 
query "
nit: maybe add to the log message that it might have already been released.


http://gerrit.cloudera.org:8080/#/c/17194/1/be/src/scheduling/admission-controller.cc@1397
PS1, Line 1397: LOG(DFATAL)
should this be a warning now too?



--
To view, visit http://gerrit.cloudera.org:8080/17194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia528d92268cea487ada20b476935a81166f5ad34
Gerrit-Change-Number: 17194
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 18 Mar 2021 16:34:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10581: Implement ds_theta_intersect_f() function

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8393/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 16:12:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10581: Implement ds_theta_intersect_f() function

2021-03-18 Thread Fucun Chu (Code Review)
Fucun Chu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17186


Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
..

IMPALA-10581: Implement ds_theta_intersect_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. It computes the intersection of the two
sketches, which may come from the same or different columns, and
returns the resulting intersection sketch.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
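+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+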
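
To illustrate what intersecting two Theta sketches means, here is a toy
Python sketch. It is not the Apache DataSketches implementation and not the
code in this patch: the hash, the value of K and all function names are
assumptions made for the example. Each sketch is a pair (theta, retained
hashes); intersection keeps the hashes present in both sketches below the
smaller theta.

```python
import hashlib

K = 4096  # nominal number of retained hashes (assumed value for this toy)


def _hash01(value):
    """Map a value to a pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2.0**64


def build_sketch(values, k=K):
    """Return (theta, hashes), keeping at most the k smallest distinct hashes."""
    hashes = sorted({_hash01(v) for v in values})
    if len(hashes) <= k:
        return 1.0, set(hashes)        # exact mode: nothing was discarded
    return hashes[k], set(hashes[:k])  # theta is the (k+1)-th smallest hash


def intersect(a, b):
    """Keep hashes seen by both sketches that lie below the smaller theta."""
    theta = min(a[0], b[0])
    return theta, {h for h in a[1] & b[1] if h < theta}


def estimate(sketch):
    """Estimated distinct count: retained hashes scaled up by 1 / theta."""
    theta, hashes = sketch
    return len(hashes) / theta
```

For two small inputs that share five values, e.g. build_sketch(range(10))
and build_sketch(range(5, 15)), both sketches stay in exact mode and the
estimate of the intersection is 5.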

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 119 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/17186/2
--
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10580: Implement ds_theta_union_f() function

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17179 )

Change subject: IMPALA-10580: Implement ds_theta_union_f() function
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8392/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
Gerrit-Change-Number: 17179
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:27:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6982/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:26:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Mar 2021 15:25:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10580: Implement ds_theta_union_f() function

2021-03-18 Thread Fucun Chu (Code Review)
Fucun Chu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17179 )

Change subject: IMPALA-10580: Implement ds_theta_union_f() function
..

IMPALA-10580: Implement ds_theta_union_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. It unions the two sketches and returns
the resulting union sketch.

Example:
select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
from sketch_tbl;
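+-------------------------------------------------------+
| ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
+-------------------------------------------------------+
| 15                                                    |
+-------------------------------------------------------+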
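
As a toy illustration of a Theta sketch union (again, not the Apache
DataSketches code and not the code in this patch; the hash, K and the
function names are assumptions), a sketch can be modelled as a pair
(theta, retained hashes). The union merges the retained hashes below the
smaller theta and trims back to at most k entries.

```python
import hashlib

K = 4096  # nominal number of retained hashes (assumed value for this toy)


def _hash01(value):
    """Map a value to a pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2.0**64


def build_sketch(values, k=K):
    """Return (theta, hashes), keeping at most the k smallest distinct hashes."""
    hashes = sorted({_hash01(v) for v in values})
    if len(hashes) <= k:
        return 1.0, set(hashes)
    return hashes[k], set(hashes[:k])


def union(a, b, k=K):
    """Merge both sketches below the smaller theta, then trim to k hashes."""
    theta = min(a[0], b[0])
    hashes = sorted(h for h in a[1] | b[1] if h < theta)
    if len(hashes) > k:
        theta, hashes = hashes[k], hashes[:k]
    return theta, set(hashes)


def estimate(sketch):
    """Estimated distinct count: retained hashes scaled up by 1 / theta."""
    theta, hashes = sketch
    return len(hashes) / theta
```

For build_sketch(range(10)) and build_sketch(range(5, 15)) the union holds
15 distinct hashes in exact mode, so the estimate is 15.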

Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 111 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17179/3
--
To view, visit http://gerrit.cloudera.org:8080/17179
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
Gerrit-Change-Number: 17179
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10593: Skip runtime filter for outer joins...

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17200 )

Change subject: IMPALA-10593: Skip runtime filter for outer joins...
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8391/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3
Gerrit-Change-Number: 17200
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 14:43:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10593: Skip runtime filter for outer joins...

2021-03-18 Thread Steve Carlin (Code Review)
Steve Carlin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17200


Change subject: IMPALA-10593: Skip runtime filter for outer joins...
..

IMPALA-10593: Skip runtime filter for outer joins...

...when Expr not constant after null substitution.

Currently there is code that asserts that an Expr is constant after
substituting SlotRefs with constant nulls.

A third-party tool needs this restriction to be weakened.  In a case where
an Expr is checked and the Expr is not constant even after substituting
nulls, the result will be to not generate a runtime filter for that Expr.
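
The behavior change can be pictured with a small Python model. This is only
a sketch under assumptions: the Literal/SlotRef/Call classes and the
maybe_generate_filter() helper are hypothetical and do not mirror the Java
classes touched by the patch. It models just the new "skip instead of
assert" path.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class SlotRef:
    name: str


@dataclass(frozen=True)
class Call:
    fn: str
    args: Tuple[object, ...]
    deterministic: bool = True  # hypothetical flag: False means never constant


def substitute_nulls(expr):
    """Replace every SlotRef with a constant NULL literal."""
    if isinstance(expr, SlotRef):
        return Literal(None)
    if isinstance(expr, Call):
        args = tuple(substitute_nulls(a) for a in expr.args)
        return Call(expr.fn, args, expr.deterministic)
    return expr


def is_constant(expr):
    if isinstance(expr, Literal):
        return True
    if isinstance(expr, SlotRef):
        return False
    return expr.deterministic and all(is_constant(a) for a in expr.args)


def maybe_generate_filter(expr):
    """Return a filter description, or None when the filter is skipped."""
    if not is_constant(substitute_nulls(expr)):
        return None  # previously this case would have tripped an assertion
    return "runtime filter on " + expr.fn
```

A caller then treats None as "no runtime filter for this Expr" and carries
on with planning.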

Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
2 files changed, 8 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/17200/1
--
To view, visit http://gerrit.cloudera.org:8080/17200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I46462e2030731d97c4c88e364148c0093c025ab3
Gerrit-Change-Number: 17200
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8390/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 14:10:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17185 )

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8389/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Mar 2021 14:01:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:52:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6981/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:52:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 6: Code-Review+2

We use ADD PARTITION when creating 'alltypestiny', and we check the write id in 
full-acid-rowid.test.

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:51:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10483(part-1): Refactor table mask resolving

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has abandoned this change. ( 
http://gerrit.cloudera.org:8080/17184 )

Change subject: IMPALA-10483(part-1): Refactor table mask resolving
..


Abandoned

Abandon since we have a better solution: https://gerrit.cloudera.org/c/17199
--
To view, visit http://gerrit.cloudera.org:8080/17184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia191928fb179b0b0632235c1fff4c18647e5802f
Gerrit-Change-Number: 17184
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17185 )

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..


Patch Set 3:

(3 comments)

Rebased the patch to base on https://gerrit.cloudera.org/c/17199

http://gerrit.cloudera.org:8080/#/c/17185/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17185/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-10483: Support subqueries in Ranger masking policies
> I think the code changes in the patch are straightforward.  Regarding testi
I think COMPUTE STATS should be blocked since it requires ALTER privilege (same 
as the issue in IMPALA-10554). The target user can only SELECT the table. Let's 
deal with such issues in IMPALA-10554 together.


http://gerrit.cloudera.org:8080/#/c/17185/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/17185/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@167
PS2, Line 167: INT,BOOLEAN,STRING
> A few questions/comments:
The row filter can have any expressions as long as they are syntactically and 
semantically correct. Would you want more complex row filters in tests?


http://gerrit.cloudera.org:8080/#/c/17185/2/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/17185/2/tests/authorization/test_ranger.py@1232
PS2, Line 1232:   admin_client.execute("grant select on database tpch to 
user %s" % user)
> In this row filter would a correlation condition contained entirely within 
> the row filter be ok ?
 e.g  ..select n_nationkey from nation n1 where n_name in (select n_name from 
nation n2 where n1.n_regionkey = n2.n_regionkey).

Yeah, it should work. But this filter doesn't contain 'current_user()' so the 
policy will have the same effect for all users. Let me try to add a similar 
test.

> could we also add a negative test where the correlation is to a table in the 
> parent query, not in the row filter itself.  That one is expected to fail.  
> (Maybe you already have this test .. if so, feel free to ignore).

Yeah, I think the tests in ranger_row_filtering.test about 'test_id' satisfy 
these.



--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:50:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17185 )

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/17185/3/tests/authorization/test_ranger.py@1270
PS3, Line 1270:
flake8: E251 unexpected spaces around keyword / parameter equals



--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:50:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-18 Thread Zoltan Borok-Nagy (Code Review)
Hello Vihang Karajgaonkar, Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17081

to look at the new patch set (#6).

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..

IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

ALTER TABLE ADD PARTITION should bump the write id for ACID tables,
both INSERT-only and full ACID.

For transactional tables we are adding partitions in an ACID
transaction in the following sequence:

1. open transaction
2. allocate write id for table
3. add partitions to HMS table
4. commit transaction

However, please note that table metadata modifications are
independent of ACID transactions. I.e. if adding the partitions
succeeds but we cannot commit the transaction, then the newly added
partitions won't get removed.

So why are we opening a txn then? We are doing it in order to bump
the write id in a best-effort way. This aids table metadata caching,
so by looking at the table write id we can determine if the cached
table metadata is up-to-date.
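
The sequence and its best-effort nature can be sketched in Python as
follows. FakeMetastore is a hypothetical in-memory stand-in, not the HMS
client API, and the method names are invented for the example.

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for HMS (not the real client API)."""

    def __init__(self):
        self.partitions = []
        self.write_id = 0         # highest committed write id of the table
        self.fail_commit = False  # set to True to simulate a failed commit
        self._next_txn = 1
        self._allocated = {}      # open txn id -> allocated write id

    def open_txn(self):
        txn = self._next_txn
        self._next_txn += 1
        self._allocated[txn] = None
        return txn

    def allocate_write_id(self, txn):
        self._allocated[txn] = self.write_id + 1

    def add_partitions(self, parts):
        # Metadata change: takes effect immediately, outside the transaction.
        self.partitions.extend(parts)

    def commit_txn(self, txn):
        allocated = self._allocated.pop(txn)
        if self.fail_commit:
            raise RuntimeError("commit failed")
        self.write_id = allocated


def add_partitions_in_txn(hms, parts):
    txn = hms.open_txn()           # 1. open transaction
    hms.allocate_write_id(txn)     # 2. allocate write id for table
    hms.add_partitions(parts)      # 3. add partitions to HMS table
    try:
        hms.commit_txn(txn)        # 4. commit transaction
        return True
    except RuntimeError:
        return False               # partitions stay; the bump was best-effort


def cache_is_current(cached_write_id, hms):
    """A cached table is considered fresh if its write id still matches."""
    return cached_write_id == hms.write_id
```

When the commit fails, the partitions are still there but the write id is
unchanged, which is why the bump can only be described as best-effort.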

Testing:
 * added e2e test

Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M tests/query_test/test_acid.py
7 files changed, 133 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17081/6
--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10483: Support subqueries in Ranger masking policies

2021-03-18 Thread Quanlong Huang (Code Review)
Hello Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17185

to look at the new patch set (#3).

Change subject: IMPALA-10483: Support subqueries in Ranger masking policies
..

IMPALA-10483: Support subqueries in Ranger masking policies

This patch adds support for using subqueries in Ranger masking policies,
i.e. column-masking/row-filtering policies. The subquery can reference
either the current table or other tables. However, masking policies on
these tables won't be applied recursively. This is consistent with Hive.
One motivation is to avoid infinite masking if the subquery references
the same table. Another motivation, I think, is to simplify the masking
behavior, so when the admin is setting a masking expression, it can be
considered as running from the admin's perspective (i.e. no masking).

Implementation
Before analyzing the query, the coordinator loads the metadata of all
possibly used tables into the query's StmtTableCache. Table masking
takes place after the analyzing phase. If the subquery filter introduces
any new tables, the analyzer will fail to resolve them since their
metadata is not loaded in the StmtTableCache. This patch modifies the
StmtMetadataLoader to also load the tables introduced by masking
policies so that they can be resolved correctly.
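
A minimal sketch of the loading rule described above, written in Python for
brevity. The tables_to_load() helper and the mask_tables mapping are
hypothetical; the real logic lives in the Java StmtMetadataLoader. Because
masking policies are not applied recursively, only one level of extra
tables is collected.

```python
def tables_to_load(query_tables, mask_tables):
    """Return the tables whose metadata must be in the table cache.

    query_tables: tables referenced directly by the statement, in order.
    mask_tables:  hypothetical mapping from a table to the tables that the
                  subqueries of its masking policies reference.
    """
    needed = list(query_tables)
    seen = set(needed)
    for table in query_tables:
        # One level only: policies on the extra tables are not applied,
        # so the tables those policies mention are not needed.
        for extra in mask_tables.get(table, ()):
            if extra not in seen:
                seen.add(extra)
                needed.append(extra)
    return needed
```

For example, if the policy on "orders" references "users" and the policy on
"users" references "audit", a query on "orders" alone needs "orders" and
"users" but not "audit".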

Tests
 - Add more complex tests in test_row_filtering

Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
---
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
8 files changed, 298 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/17185/3
--
To view, visit http://gerrit.cloudera.org:8080/17185
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Gerrit-Change-Number: 17185
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8388/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:46:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/17199/1/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@359
PS1, Line 359: assertEventEquals("@column", "select", 
"functional/alltypestiny/date_string_col", 1,
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Mar 2021 13:27:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9661: Avoid introducing unused columns in table masking view

2021-03-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17199


Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
..

IMPALA-9661: Avoid introducing unused columns in table masking view

Previously, if a table has column masking policies, we replace its
unanalyzed TableRef with an analyzed InlineViewRef (table masking view)
in FromClause.analyze(). However, we can't detect which columns are
actually used in the original query at this point. In fact, analyze()
for SelectList, WhereClause, GroupByClause and other clauses containing
SlotRefs happens after FromClause.analyze(). After the whole query block
is analyzed, we can get the exact set of required columns.

This patch refactors the code to do table masking after analyze() to
avoid introducing unused columns. Referenced columns of a TableRef are
registered in analyze(), which helps to figure out what columns are
actually needed.
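
The idea can be sketched in Python as below. The masking_view_sql() helper,
its parameters and the mask() expressions are illustrative assumptions, not
the code in InlineViewRef or TableMask. Once the set of referenced columns
is known, the masking view selects only those columns and applies a mask
where one is defined. The sketch assumes at least one column is referenced.

```python
def masking_view_sql(table, columns, masks, referenced):
    """Build the SELECT of a table masking view from the referenced columns.

    table:      fully qualified table name.
    columns:    all table columns, in table order.
    masks:      column name -> masking expression (hypothetical policy text).
    referenced: set of column names registered during analysis.
    """
    items = []
    for col in columns:
        if col not in referenced:
            continue  # unused column: keep it out of the masking view
        expr = masks.get(col)
        items.append(f"{expr} AS {col}" if expr else col)
    return f"SELECT {', '.join(items)} FROM {table}"
```

With masks on string_col and date_string_col and a query that only touches
id and string_col, date_string_col no longer shows up in the view.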

Tests:
 - Run column masking and row filtering tests in test_ranger.py
 - Run FE audit tests

Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtNode.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
20 files changed, 267 insertions(+), 226 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17199/1
--
To view, visit http://gerrit.cloudera.org:8080/17199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts"

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17191 )

Change subject: Revert "IMPALA-10503: testdata load hits hive memory limit 
errors during hive inserts"
..

Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive 
inserts"

This reverts commit c60a626ac66cc7cf24080b7ea84166c70bad9b22.

Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199
Reviewed-on: http://gerrit.cloudera.org:8080/17191
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/test/resources/hive-site.xml.py
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
2 files changed, 137 insertions(+), 138 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199
Gerrit-Change-Number: 17191
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] Revert "IMPALA-10503: testdata load hits hive memory limit errors during hive inserts"

2021-03-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17191 )

Change subject: Revert "IMPALA-10503: testdata load hits hive memory limit 
errors during hive inserts"
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I896c7b2457d537fa1bfe8dc29063da0b7b3df199
Gerrit-Change-Number: 17191
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 18 Mar 2021 07:24:30 +
Gerrit-HasComments: No