[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5151/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 22
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 07 Feb 2020 05:27:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15087 )

Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly 
handle narrowed integer types
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5150/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1
Gerrit-Change-Number: 15087
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 07 Feb 2020 05:13:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5297/


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Feb 2020 05:01:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-06 Thread Yanjia Gary Li (Code Review)
Yanjia Gary Li has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..

IMPALA-8778: Support Apache Hudi Read Optimized Table

Hudi Read Optimized Table contains multiple versions of parquet files,
in order to load the table correctly, Impala needs to recognize Hudi Read
Optimized Table as a HdfsTable and load the latest version of the file
using HoodieROTablePathFilter.

Tests
 - Unit test for Hudi in FileMetadataLoader
 - Create table tests in functional_schema_template.sql
 - Query tests in hudi-parquet.test

Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
---
M bin/impala-config.sh
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/util/HudiUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M impala-parent/pom.xml
M testdata/bin/generate-schema-statements.py
M testdata/data/README
A testdata/data/hudi_parquet/.hoodie/20200112194517.clean
A testdata/data/hudi_parquet/.hoodie/20200112194517.clean.inflight
A testdata/data/hudi_parquet/.hoodie/20200112194517.clean.requested
A testdata/data/hudi_parquet/.hoodie/20200112194517.commit
A testdata/data/hudi_parquet/.hoodie/20200112194517.commit.requested
A testdata/data/hudi_parquet/.hoodie/20200112194517.inflight
A testdata/data/hudi_parquet/.hoodie/20200112194529.clean
A testdata/data/hudi_parquet/.hoodie/20200112194529.clean.inflight
A testdata/data/hudi_parquet/.hoodie/20200112194529.clean.requested
A testdata/data/hudi_parquet/.hoodie/20200112194529.commit
A testdata/data/hudi_parquet/.hoodie/20200112194529.commit.requested
A testdata/data/hudi_parquet/.hoodie/20200112194529.inflight
A testdata/data/hudi_parquet/.hoodie/hoodie.properties
A 
testdata/data/hudi_parquet/year=2015/month=03/day=16/.hoodie_partition_metadata
A 
testdata/data/hudi_parquet/year=2015/month=03/day=16/ca51fa17-681b-4497-85b7-4f68e7a63ee7-0_1-38-282_20200112194529.parquet
A 
testdata/data/hudi_parquet/year=2015/month=03/day=16/ca51fa17-681b-4497-85b7-4f68e7a63ee7-0_1-5-10_20200112194517.parquet
A 
testdata/data/hudi_parquet/year=2015/month=03/day=17/.hoodie_partition_metadata
A 
testdata/data/hudi_parquet/year=2015/month=03/day=17/45c9fa97-e514-41e8-91d2-6098e5995cdb-0_0-38-281_20200112194529.parquet
A 
testdata/data/hudi_parquet/year=2015/month=03/day=17/45c9fa97-e514-41e8-91d2-6098e5995cdb-0_0-5-9_20200112194517.parquet
A 
testdata/data/hudi_parquet/year=2016/month=03/day=15/.hoodie_partition_metadata
A 
testdata/data/hudi_parquet/year=2016/month=03/day=15/17dda230-e48a-4110-8c29-c613a3ac0b70-0_2-38-283_20200112194529.parquet
A 
testdata/data/hudi_parquet/year=2016/month=03/day=15/17dda230-e48a-4110-8c29-c613a3ac0b70-0_2-5-11_20200112194517.parquet
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test
M tests/query_test/test_scanners.py
42 files changed, 624 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/14711/22
--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 22
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-06 Thread Yanjia Gary Li (Code Review)
Yanjia Gary Li has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 21:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14711/21/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14711/21/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1561
PS21, Line 1561: (format == HdfsFileFormat.PARQUET) || (format == 
HdfsFileFormat.HUDI_PARQUET)
> can be replaced by isParquetBased(format)?
Done


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/data/README@487
PS21, Line 487: file footer.
> file name?
Done


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2763
PS21, Line 2763: S
> You need to place a ';' at the end of all your statements, otherwise the ge
Done


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2776
PS21, Line 2776: LOCATION '/test-warehouse/hudi_parquet'
> Add ';'
Done


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2786
PS21, Line 2786: LOCATION '/test-warehouse/hudi_parquet'
> Add ';'
Done


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test@71
PS21, Line 71: USE functional_parquet;
> nit: either put it at the beginning of the file and you don't need to spell
Done



--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 21
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 07 Feb 2020 04:35:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types

2020-02-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/15087 )

Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly 
handle narrowed integer types
..

IMPALA-8110: Fix the Parquet stats filtering issue to correctly
handle narrowed integer types

This patch adds validation for the paired stats values of tinyint
and smallint column data type when reading min/max column stats
value from Parquet file.

Testing:
 - Did Manual tests: create table with column as int type, intert
   some values, then alter table to change the column data type as
   tinyint (int8), insert more values, verify that the query return
   correct number of rows when PARQUET_READ_STATISTICS is set as 1.
   Did similar tests to change column data type from int to smallint,
   and from smallint to tinyint.
 - Added automatic test cases in parquet-stats.test for column data
   type been changed from int to tinyint, from smallint to tinyint
   and from int to smallint.
 - Passed EE tests.
 - Passed all core tests.

Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
4 files changed, 124 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/15087/7
--
To view, visit http://gerrit.cloudera.org:8080/15087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1
Gerrit-Change-Number: 15087
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.

2020-02-06 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15123 )

Change subject: IMPALA-7002: Throw AuthorizationException when user accessing 
non-existent table/database in CTE without any privilege.
..


Patch Set 6:

(1 comment)

Hi Bikram and Wenzhe, I left some comments for WithClause.java. Let me know if 
you have any thoughts. Thanks!

http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java
File fe/src/main/java/org/apache/impala/analysis/WithClause.java:

http://gerrit.cloudera.org:8080/#/c/15123/6/fe/src/main/java/org/apache/impala/analysis/WithClause.java@96
PS6, Line 96: finally {
:   // Register all privilege requests made from the root 
analyzer to the input
:   // analyzer so that caller could do authorization for all 
the requests collected
:   // during analysis and report an authorization error over 
an analysis error.
:   // We should not accidentally reveal the non-existence of a 
database/table if
:   // the user is not authorized.
:   for (PrivilegeRequest req : 
withClauseAnalyzer.getPrivilegeReqs()) {
: analyzer.registerPrivReq(req);
:   }
: }
After taking a closer look at this patch, I have a question about what we 
should do in this finally block. Specifically, currently I do not have a 
concrete answer to whether or not we should register all local views as well as 
record audit events in that finally block.

Recall that after analyze(stmtTableCache) at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L415,
 authzChecker.authorize() will be invoked at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L431.

After the fix in this patch, when a user is trying to access a non-existing 
table of which the corresponding privileges were not granted, 
authzChecker.authorize() would throw an AuthorizationException, which in turn 
will then be caught at
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L432.

Since AuthorizationExceptions take precedence over AnalysisExceptions according 
to 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java#L440-L441,
 it seems okay that we do not register all local views with 'analyzer' and not 
record audit events, because in the end the user will only see an 
AuthorizationException.

Maybe I am overthinking, but what is not that clear to me now is when there is 
a policy granting the user the SELECT privilege of that non-exisiting table. 
(Note that when the authorization provider is Ranger, it is possible to create 
such a policy via Ranger's web UI instead of executing a GRANT statement.) In 
this case, authzChecker.authorize() may not throw an AuthorizationException. If 
so, then I do not know whether or not we should register all local views and 
record audit events. I guess it depends on how we want Impala to behave in this 
very special case. Specifically, we may want to determine whether or not we 
would like Impala to store the information about local views and audit events 
in this special case.



--
To view, visit http://gerrit.cloudera.org:8080/15123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577
Gerrit-Change-Number: 15123
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 07 Feb 2020 03:34:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15175 )

Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..

IMPALA-9345: Skip TestOrc::test_type_conversions on S3

The test was flaky due to timeouts in Hive queries.

Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Reviewed-on: http://gerrit.cloudera.org:8080/15175
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_scanners.py
1 file changed, 9 insertions(+), 0 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15175 )

Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Feb 2020 03:17:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8795 : Enable event polling by default in dockerized tests.

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14272 )

Change subject: IMPALA-8795 : Enable event polling by default in dockerized 
tests.
..


Patch Set 12: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5295/


--
To view, visit http://gerrit.cloudera.org:8080/14272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I222b64236060b3c4c2d554e2f10e129984ebe972
Gerrit-Change-Number: 14272
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 01:56:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 37:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5149/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Gerrit-Change-Number: 14859
Gerrit-PatchSet: 37
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 07 Feb 2020 01:47:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 36:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5148/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Gerrit-Change-Number: 14859
Gerrit-PatchSet: 36
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 07 Feb 2020 01:45:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-06 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14859

to look at the new patch set (#37).

Change subject: IMPALA-4224: execute separate join builds fragments
..

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
at the end of each build phase.
  - At end of each probe phase, reservation needs to be handed back
to builder (or released).

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.h
A be/src/exec/join-builder.cc
A be/src/exec/join-builder.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/initial-reservations.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/spillable-row-batch-queue.h
M be/src/util/summary-util.cc
M bin/run-all-tests.sh
M common/thrift/DataSinks.thrift
M 

[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 35:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/14859/35//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14859/35//COMMIT_MSG@38
PS35, Line 38:accordingly.
> Something lost between edits.
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h@63
PS35, Line 63: separate build fragment
> nit: maybe "separate but co-located build fragment"? I feel it's worth to e
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.h@243
PS35, Line 243: a
> nit: the integrated join builder
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/blocking-join-node.cc@196
PS35, Line 196:   if (!open_called_) {
> nit: maybe an early return would make it a bit more readable
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h
File be/src/exec/join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h@54
PS35, Line 54: class JoinBuilder : public DataSink {
> Maybe you could add a timeline, or sequence diagram-like thingy about the i
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.h@138
PS35, Line 138: HandoffToProbe
> nit: how about HandoffToProbeAndWait?
Done (combined with Zoltan's suggestion)


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.cc
File be/src/exec/join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/join-builder.cc@90
PS35, Line 90: HandoffToProbe
> nit: HandoffToProbes (plural)?
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@61
PS35, Line 61: embedded
> might be worth adding a DCHECK for all the embedded specific methods
I did a pass over the methods and I think I already had DCHECKs in all the 
places that made sense - there are various methods specific to the separate 
build. These constructors/factory methods imply whether it's embedded or not.

Maybe I missed something thouhg?


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@278
PS35, Line 278: as
> nit: is
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.h@637
PS35, Line 637: was
> nit: ways
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14859/35/be/src/exec/partitioned-hash-join-builder.cc@611
PS35, Line 611:   if (buffer_pool_client_ != probe_client) {
> nit: how about we use is_separate_build_ as the if condition and then DCHEC
Good point, I like the suggestion. I updated the class comment to explain the 
invariant then added in DCHECKs.


http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@305
PS35, Line 305: Data sinks
> nit: Build sinks?
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java@115
PS35, Line 115: createDistributedPlan
> Yeah the naming is a bit confusing. I think there are two sets of concepts
Done


http://gerrit.cloudera.org:8080/#/c/14859/35/fe/src/main/java/org/apache/impala/planner/Planner.java@128
PS35, Line 128: // MT_DOP > 0 is not supported by default for plans with 
base table joins or table
  : // sinks: we only allow MT_DOP > 0 with such plans if 
--unlock_mt_dop=true is
  : // specified. We allow single node plans with mt_dop since 
there is no actual
  : // parallelism.
> nit: maybe the comment should placed in useParallelPlan() since it refers t
I think parts of this comment apply here - I simplified and shortened it.


http://gerrit.cloudera.org:8080/#/c/14859/35/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/14859/35/tests/query_test/test_join_queries.py@35
PS35, Line 35:
> Would it add coverage if we keep 1 as well?
I think it's worthwhile, but I was somewhat 

[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-06 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14859

to look at the new patch set (#36).

Change subject: IMPALA-4224: execute separate join builds fragments
..

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
at the end of each build phase.
  - At end of each probe phase, reservation needs to be handed back
to builder (or released).

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.h
A be/src/exec/join-builder.cc
A be/src/exec/join-builder.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/initial-reservations.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/spillable-row-batch-queue.h
M be/src/util/summary-util.cc
M bin/run-all-tests.sh
M common/thrift/DataSinks.thrift
M 

[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15123 )

Change subject: IMPALA-7002: Throw AuthorizationException when user accessing 
non-existent table/database in CTE without any privilege.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5147/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577
Gerrit-Change-Number: 15123
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 07 Feb 2020 00:18:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7:

This hit IMPALA-9152


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Feb 2020 00:13:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5297/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Feb 2020 00:14:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9345: skip test type conversions on S3

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15177 )

Change subject: IMPALA-9345: skip test_type_conversions on S3
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5145/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6c24f0a3654a4805c7ab2019830c27443f2056
Gerrit-Change-Number: 15177
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 23:38:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15178 )

Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5146/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47
Gerrit-Change-Number: 15178
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Feb 2020 23:38:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui

2020-02-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15178 )

Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui
..


Patch Set 1:

I posted an example screenshot on the JIRA


--
To view, visit http://gerrit.cloudera.org:8080/15178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47
Gerrit-Change-Number: 15178
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Feb 2020 23:34:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.

2020-02-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/15123 )

Change subject: IMPALA-7002: Throw AuthorizationException when user accessing 
non-existent table/database in CTE without any privilege.
..

IMPALA-7002: Throw AuthorizationException when user accessing
non-existent table/database in CTE without any privilege.

Traced the issue and found that privilege requests collected during
analysis were lost in WithClause::analyze function when analysis
function throw AnalysisException. This caused authorization been
skipped and returned analysis error, instead of authorization error.
This patch register the privilege requests made from root analyzer
to the input analyzer in WithClause::analyze function regardless of
analysis exception.

Testing:
 - Manual test:
   - ran CTE with non-existent table/database in impala-shell
 without privilege, verified that it results in
 AuthorizationException.
   - ran CTE with non-existent table/database in impala-shell
 with the whole server privilege, verified that it results
 in AnalysisException.
 - Added CTE test cases for non-existent table/database in
   AuthorizationStmtTest.
 - Passed all FE tests.
 - Passed all core tests.

Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577
---
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
2 files changed, 32 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15123/6
--
To view, visit http://gerrit.cloudera.org:8080/15123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577
Gerrit-Change-Number: 15123
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5144/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 7
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 23:26:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9243: Add info about blacklisting decisions to the webui

2020-02-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15178


Change subject: IMPALA-9243: Add info about blacklisting decisions to the webui
..

IMPALA-9243: Add info about blacklisting decisions to the webui

This patch adds information about blacklisting decisions to the
/backends webui endpoint.

For the JSON, it adds an 'is_blacklisted' field to all backends, and
for and backends where 'is_blacklisted' is true it adds a
'blacklist_cause' field indicating the error status that led to the
backend getting blacklisted and an 'blacklist_time_remaining' field
indiciating how much longer the backend will remain on the blacklist.
It also adds counts for the number of blacklisted and quiescing
backends, if any, and the number of active (i.e. all other) backends.

For display, in order to prevent the table of backend information from
having too many columns (prior to this patch it already had 12), it
separates blacklisted, quiescing, and active backends into three
separate table, with the blacklisted and quiescing tables only getting
displayed if there are any such backends.

Additionally, tooltips are added next to the headers for the
blacklisted and quiescing tables that provide a brief explanation of
what it means for a backend to appear on there lists.

Using separate tables also facilitates having state-specific columns -
the blacklisted table displays columns for the blacklist cause and
time remaining. Future work could consider adding columns to the
quiescing table, such as time until the grace period and deadline
expires.

Testing:
- Manually ran various quiescing/blacklisting scenarios and confirmed
  the /backends page displays as expected.
- Added cases to test_web_pages (to verify the new fields when nothing
  is blacklisted) and test_blacklist.

Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47
---
M be/src/runtime/coordinator.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/scheduling/executor-blacklist.cc
M be/src/scheduling/executor-blacklist.h
M be/src/service/impala-http-handler.cc
M bin/rat_exclude_files.txt
M tests/custom_cluster/test_blacklist.py
M tests/webserver/test_web_pages.py
M www/backends.tmpl
A www/blacklisted_tooltip.txt
A www/quiescing_tooltip.txt
12 files changed, 200 insertions(+), 22 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15178/1
--
To view, visit http://gerrit.cloudera.org:8080/15178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia0c309315b142a50be102dcb516b36ec6cb3cf47
Gerrit-Change-Number: 15178
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15175 )

Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 22:54:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15175 )

Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5296/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 22:55:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9345: skip test type conversions on S3

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( 
http://gerrit.cloudera.org:8080/15177 )

Change subject: IMPALA-9345: skip test_type_conversions on S3
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/15177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I0b6c24f0a3654a4805c7ab2019830c27443f2056
Gerrit-Change-Number: 15177
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9345: skip test type conversions on S3

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15177


Change subject: IMPALA-9345: skip test_type_conversions on S3
..

IMPALA-9345: skip test_type_conversions on S3

The storage layer is orthogonal to the purpose of this test,
so let's disable it to unblock builds until we can understand
the problem and fix it properly.

Change-Id: I0b6c24f0a3654a4805c7ab2019830c27443f2056
---
M tests/common/skip.py
M tests/query_test/test_scanners.py
2 files changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/15177/1
--
To view, visit http://gerrit.cloudera.org:8080/15177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0b6c24f0a3654a4805c7ab2019830c27443f2056
Gerrit-Change-Number: 15177
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-06 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..


Patch Set 7:

I'm considering this patch ready for final review.

A couple of the shell tests won't totally work in a fully py2/py3 cross 
compatible way, and they were noted in the test files with comments.
These are:

- TestImpalaShellInteractive.test_fix_infinite_loop
- TestImpalaShell.test_international_characters_prettyprint_tabs

The next major shell patch should be to upgrade sqlparse. I filed IMPALA-9362
to track that work.


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 7
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 22:47:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15132/7/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/15132/7/tests/shell/test_shell_commandline.py@1012
PS7, Line 1012:
flake8: W391 blank line at end of file



--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 7
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 22:40:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343: Make impala-shell compatible with python 3.

2020-02-06 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343: Make impala-shell compatible with python 3.
..

IMPALA-3343: Make impala-shell compatible with python 3.

This is patch makes the impala-shell code cross-compatible with python 2 and
python 3. The goal is wind up with a version of the shell that will pass
python e2e tests repsective of the version of python used to launch the shell,
under the assumption that the test framework itself will continue to run with
python 2.7.x, irrespective of the shell version being tested.

There are a few isolated tests that weren't able to pass under both versions,
and the reasons have been documented in comments in the test themselves.

Notable changes for reviewers to consider:

- With regard to validating the patch, my assumption is that simply passing
  the existing set of e2e shell tests is sufficient to confirm that the shell
  is functioning properly. No new tests were added.

- Many of the simpler changes derive from the fact that a few built-in functions
  and/or types have either been removed or have else changed in python 3.x,
  E.g., xrange and basestring are both gone, dict.iteritems() has been removed,
  dict.items() behaves differently, the unicode() function and the method
  str.decode() have both been removed, etc.

  Also, catching exceptions using "Exception, e" no longer works, and (as most
  know), using print() as a function is required now.

- A new pytest command line option was added in conftest.py to enable a user
  to specify a path to an alternate impala-shell executable to test. It's
  possible to use this to point to an instance of the impala-shell that was
  installed as a standalone python package in a separate virtualenv.

  Example usage:
  USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py

  The target virtualenv may be based on either python3 or python2. However,
  this has no effect on the version of python used to run the test framework,
  which remains tied to python 2.7.x for the foreseeable future.

- $IMPALA_HOME/bin/set-pythonpath.sh was updated to properly use the thrift-11
  gen-py files if USE_THRIFT11_GEN_PY is set to "true". This is required for
  testing a version of the impala-shell in a python3-based virtualenv.

- The wording of the header changed a bit to include the python version
  used to run the shell.

Starting Impala Shell on Python 3.7.5 with no authentication
Opened TCP connection to localhost:21000
...

OR

Starting Impala Shell on Python 2.7.12 with no authentication
Opened TCP connection to localhost:21000
...

- By far, the biggest hassle has been juggling str versus unicode versus
  bytes data types. Python 2.x was fairly loose and inconsistent in
  how it dealt with strings. As a quick demo of what I mean:

  Python 2.7.12 (default, Nov 12 2018, 14:36:49)
  [GCC 5.4.0 20160609] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> d = 'like a duck'
  >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == 
d.decode('utf-8')
  True

  ...and yet there are weird unexpected gotchas.

  >>> d.decode('utf-8') == d.encode('utf-8')
  True
  >>> d.encode('utf-8') == bytearray(d, 'utf-8')
  True
  >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
  False

  As a result of this, the way we handled strings in the impala-shell code had
  become equally loose and inconsistent -- mainly in the form of frequent and
  liberal use of str.encode() and str.decode() -- but things still just worked.

  In python3, there's a much clearer distinction between strings and bytes, and
  as such, much tighter type consistency is expected by standard libs like
  subprocess, re, sqlparse, prettytable, etc., which are used throughout the
  shell. Even simple calls that worked in python 2.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  ['foo']

  ...can throw exceptions in python 3.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  Traceback (most recent call last):
File "", line 1, in 
File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
  return _compile(pattern, flags).findall(string)
  TypeError: cannot use a string pattern on a bytes-like object

  Exceptions like this resulted in a many, if not most shell tests failing
  under python 3.

  At first, I tried to go one-by-one to the site of each failure, and correct
  by checking instance type and re-encoding as necessary, but this only led to
  even more str.encode() calls littering the code, which just seemed like a
  code-smell. (Wiki "code smell" if you don't know the term.)

  What ultimately seemed like a better approach was to try to weed out as many
  existing spurious str.encode() and str.decode() calls as I could. I'm not sure
  I got 

[Impala-ASF-CR] IMPALA-8795 : Enable event polling by default in dockerized tests.

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14272 )

Change subject: IMPALA-8795 : Enable event polling by default in dockerized 
tests.
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5143/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I222b64236060b3c4c2d554e2f10e129984ebe972
Gerrit-Change-Number: 14272
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 21:53:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8795 : Enable event polling by default in dockerized tests.

2020-02-06 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/14272 )

Change subject: IMPALA-8795 : Enable event polling by default in dockerized 
tests.
..

IMPALA-8795 : Enable event polling by default in dockerized tests.

Most of the fixes for event processing are committed. This change
enables the feature by default for all the tests so that eventually we
can turn it on out of the box.

Testing [WIP]:
1. Ran core tests with USE_CDP_HIVE=true
2. Running exhaustive tests with USE_CDP_HIVE=false

Change-Id: I222b64236060b3c4c2d554e2f10e129984ebe972
---
M docker/catalogd/Dockerfile
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/environ.py
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_event_processing.py
M tests/custom_cluster/test_hive_parquet_codec_interop.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_metadata_query_statements.py
M tests/metadata/test_refresh_partition.py
M tests/util/event_processor_utils.py
12 files changed, 204 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/14272/12
--
To view, visit http://gerrit.cloudera.org:8080/14272
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I222b64236060b3c4c2d554e2f10e129984ebe972
Gerrit-Change-Number: 14272
Gerrit-PatchSet: 12
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9242: Filter privileges before returning them to Sentry

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15068 )

Change subject: IMPALA-9242: Filter privileges before returning them to Sentry
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5142/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecd4281368d1c9fe88cfe850ea725cd68895712e
Gerrit-Change-Number: 15068
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 20:39:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15175 )

Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5141/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 20:30:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 21: Code-Review+1

(1 comment)

One minor comment, otherwise LGTM

http://gerrit.cloudera.org:8080/#/c/14711/21/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14711/21/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1561
PS21, Line 1561: (format == HdfsFileFormat.PARQUET) || (format == 
HdfsFileFormat.HUDI_PARQUET)
can be replaced by isParquetBased(format)?



--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 21
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Feb 2020 20:14:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8110: Fix the Parquet stats filtering issue to correctly handle narrowed integer types

2020-02-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15087 )

Change subject: IMPALA-8110: Fix the Parquet stats filtering issue to correctly 
handle narrowed integer types
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15087/5/be/src/exec/parquet/parquet-column-stats.cc
File be/src/exec/parquet/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/15087/5/be/src/exec/parquet/parquet-column-stats.cc@116
PS5, Line 116: ic_cast
> Will change code to return true only when both the stats value and its pair
will remove boolean variable paired_stats_val_is_set, and return false earlier 
if ret is false, or paired_stats_value is null.



--
To view, visit http://gerrit.cloudera.org:8080/15087
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8bdaf4c4b2d0c6ea26d6e9bf013afca647e53a1
Gerrit-Change-Number: 15087
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 06 Feb 2020 19:59:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9345: Skip TestOrc::test type conversions on S3

2020-02-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15175


Change subject: IMPALA-9345: Skip TestOrc::test_type_conversions on S3
..

IMPALA-9345: Skip TestOrc::test_type_conversions on S3

The test was flaky due to timeouts in Hive queries.

Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
---
M tests/query_test/test_scanners.py
1 file changed, 9 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/15175/1
--
To view, visit http://gerrit.cloudera.org:8080/15175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaeb3a3c4d1a733c8a788a78a4b7163dde7599204
Gerrit-Change-Number: 15175
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5293/


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 19:43:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9242: Filter privileges before returning them to Sentry

2020-02-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15068 )

Change subject: IMPALA-9242: Filter privileges before returning them to Sentry
..


Patch Set 8:

I wanted to upload the unit tests first and then fix/answer the other. Then I 
realized that I messed the update logic up, so I need some more work before a 
proper patch.

The failed jenkins code review check seems unrelated.


--
To view, visit http://gerrit.cloudera.org:8080/15068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecd4281368d1c9fe88cfe850ea725cd68895712e
Gerrit-Change-Number: 15068
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 19:09:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8852: Skipping short-circuit config check for coordinator only

2020-02-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15173 )

Change subject: IMPALA-8852: Skipping short-circuit config check for 
coordinator only
..


Patch Set 2:

(5 comments)

Thanks for the change. This looks good.
In the Jira Lars says "As a permanent solution we should only emit a warning 
when the socket cannot be found and -is_executor=false". This solution is 
different in that it just avoids the check. That is probably OK but can you 
explain why you're not doing what Lars said.

http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@10
PS2, Line 10: DataNode is not avaiable on the hosts. This change adds a 
condition to
Nit: typo: available


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@723
PS2, Line 723: if (!BackendConfig.INSTANCE.getBackendCfg().is_executor) {
Should this check move after the test for DFS_CLIENT_READ_SHORTCIRCUIT_KEY?
If short-circuit reads are not enabled then this message might be confusing?


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@724
PS2, Line 724:   LOG.info("Coordinator only instance will not read local 
data via " +
Nit "Coordinator-only" is clearer I think


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@725
PS2, Line 725:   "short-circuit reads.");
We use clang-format to format Java code (as well as C++)
https://cwiki.apache.org/confluence/display/IMPALA/Impala+Style+Guide
You can get a plugin for IntelliJ which makes it easy to do.
It finds a few small nits in your code layout.


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@68
PS2, Line 68: // GIVEN
I wasn't familiar with this given-when-then style but I googled it and learned 
something. As I'm not used to it I found it a little jarring as this is the 
only use of it I have seen in the Impala codebase



--
To view, visit http://gerrit.cloudera.org:8080/15173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
Gerrit-Change-Number: 15173
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 18:55:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8605: clean up HS2/beeswax session management

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13452 )

Change subject: IMPALA-8605: clean up HS2/beeswax session management
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5140/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32927d803df7c27947960e9133e7e458ccc39bf
Gerrit-Change-Number: 13452
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 17:47:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8852: Skipping short-circuit config check for coordinator only

2020-02-06 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15173 )

Change subject: IMPALA-8852: Skipping short-circuit config check for 
coordinator only
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@66
PS2, Line 66: testCheckShortCircuitReadShouldReturnErrorForInvalidConfig()
Would it be possible to consolidate this and the next into a single test? Say 
something like "testCheckShortCircuitConfigs()".



--
To view, visit http://gerrit.cloudera.org:8080/15173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
Gerrit-Change-Number: 15173
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 17:42:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8852: Skipping short-circuit config check for coordinator only

2020-02-06 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15173 )

Change subject: IMPALA-8852: Skipping short-circuit config check for 
coordinator only
..


Patch Set 2:

(5 comments)

Thanks for the change. Just a few nits.

http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@7
PS2, Line 7: Skipping
Nit: Change to Skip?


http://gerrit.cloudera.org:8080/#/c/15173/2//COMMIT_MSG@13
PS2, Line 13:
Usually, our commit messages also have a section "Testing" to mention the tests 
added by this commit. This will help reviewers understand the testing done for 
the patch. See this for example. https://gerrit.cloudera.org/#/c/15157/

You could add something like:
Added tests in JNIFrontendTest.java to verify the short-circuit check is 
skipped if it is coordinator-only mode.


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@719
PS2, Line 719: ,
Could you also mention here that this will skip the check if it is 
coordinator-only mode?


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@806
PS2, Line 806:
Nit: Extra space.


http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
File fe/src/test/java/org/apache/impala/service/JniFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/15173/2/fe/src/test/java/org/apache/impala/service/JniFrontendTest.java@90
PS2, Line 90: conf.setStrings("dfs.domain.socket.path", "");
Don't we need to set dfs.client.read.shortcircuit = true here? Is the default 
true?



--
To view, visit http://gerrit.cloudera.org:8080/15173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
Gerrit-Change-Number: 15173
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 17:38:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8605: clean up HS2/beeswax session management

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13452


Change subject: IMPALA-8605: clean up HS2/beeswax session management
..

IMPALA-8605: clean up HS2/beeswax session management

Session/operation secrets are part of the HS2 handle
but we haven't made use of them up until now. This
patch checks the value and treats it as part of the
session key, as originally intended. I.e. if the
secret is missing, the session lookup fails.

The operation secret is the same as the session secret
to save having to generate and store extra secrets
(there's no real benefit).

A secret is added to each Beeswax session. This secret is
internal to the server and not exposed. Adds validation
that client requests accessed via Beeswax belong to the
same user as the session.

We switch uuid_generator_ to use boost::random_device, which
uses /dev/urandom as its source of randomness. We
rely on the IDs being unique and the non-cryptographic
generator might not be strong enough.

For requests - GetRuntimeProfile() and GetExecSummary()
that provide both a session and query ID, the code already
checks that the session's user matches the query.

An exception to the validation mechanisms above is added
for Close() and Cancel() beeswax operations, because impala-shell
and some administrative tools allow cancellation of
queries on different threads and from different tools.

We skip validating the session secret when cancelling queries
from the web UI, since web UI users don't have the secret.

Testing:
* Add tests for all HS2 RPCs that provide invalid session secrets.
* Add tests for HS2 RPCs that provide both session and query
  ID to ensure that query belongs to the session.
* Add basic test for beeswax testing accessing a query from
  different connections.

Change-Id: Ib32927d803df7c27947960e9133e7e458ccc39bf
---
M CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M tests/beeswax/impala_beeswax.py
M tests/common/impala_connection.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_fetch.py
M tests/hs2/test_hs2.py
A tests/query_test/test_beeswax.py
13 files changed, 688 insertions(+), 133 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13452/9
--
To view, visit http://gerrit.cloudera.org:8080/13452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib32927d803df7c27947960e9133e7e458ccc39bf
Gerrit-Change-Number: 13452
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8605: clean up HS2/beeswax session management

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( 
http://gerrit.cloudera.org:8080/13452 )

Change subject: IMPALA-8605: clean up HS2/beeswax session management
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/13452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ib32927d803df7c27947960e9133e7e458ccc39bf
Gerrit-Change-Number: 13452
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-06 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 21: Code-Review+1

(5 comments)

Found some minor issues, but other than that looks good do me.

Once the issues are fixed, and the change looks good to Sahil or Tim as well, 
I'm ready to give it a +2.

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/data/README@487
PS21, Line 487: file footer.
file name?


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2763
PS21, Line 2763: S
You need to place a ';' at the end of all your statements, otherwise the 
generated SQL file will be syntactically incorrect when you want to load more 
tables, or the whole workload, e.g.:

 ./bin/load-data.py -w functional-query  --table_formats=parquet/none/none


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2776
PS21, Line 2776: LOCATION '/test-warehouse/hudi_parquet'
Add ';'


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/datasets/functional/functional_schema_template.sql@2786
PS21, Line 2786: LOCATION '/test-warehouse/hudi_parquet'
Add ';'


http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test:

http://gerrit.cloudera.org:8080/#/c/14711/21/testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test@71
PS21, Line 71: USE functional_parquet;
nit: either put it at the beginning of the file and you don't need to spell 
'functional_parquet.' in each query, or use the fully qualified table names in 
this query as well



--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 21
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 06 Feb 2020 15:19:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 14:54:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9361: manually configured kerberized minicluster

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15159 )

Change subject: IMPALA-9361: manually configured kerberized minicluster
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5293/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib34101d132e9c9d59da14537edf7d096f25e9bee
Gerrit-Change-Number: 15159
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Feb 2020 14:54:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9242: Filter privileges before returning them to Sentry

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15068 )

Change subject: IMPALA-9242: Filter privileges before returning them to Sentry
..


Patch Set 8:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5139/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iecd4281368d1c9fe88cfe850ea725cd68895712e
Gerrit-Change-Number: 15068
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 14:34:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8852: Skipping short-circuit config check for coordinator only

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15173 )

Change subject: IMPALA-8852: Skipping short-circuit config check for 
coordinator only
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5138/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
Gerrit-Change-Number: 15173
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Feb 2020 14:16:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9242: Filter privileges before returning them to Sentry

2020-02-06 Thread Csaba Ringhofer (Code Review)
Hello Anonymous Coward (498), Vihang Karajgaonkar, Daniel Becker, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15068

to look at the new patch set (#8).

Change subject: IMPALA-9242: Filter privileges before returning them to Sentry
..

IMPALA-9242: Filter privileges before returning them to Sentry

This change implements the new FilteredPrivilegeCache, which adds
functions for filtering privileges based on the authorizable and
for returning Privileges directly instead of their String form.

The filtering is based on server + db + table (or just server in
case of URI privileges) to filter out the bulk of unrelated privileges.
Efficient filtering is done by a new class PrincipalPrivilegeTree.
It was tempting to reuse Sentry's TreePrivilegeCache, which has a very
similar role, but it lacks a "remove" function that is needed to keep
this index in sync with the CatalogObjectCache in Principal. I am also
a bit concerned about the possible side effect of Sentry's interning
of names in privileges - we try to avoid using String.intern() on
massive amount of names in Impala.

Testing:
- added unit tests based on Sentry / TestTreePrivilegeCache

Change-Id: Iecd4281368d1c9fe88cfe850ea725cd68895712e
---
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
A fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilegeTree.java
A fe/src/test/java/org/apache/impala/catalog/PrincipalPrivilegeTreeTest.java
4 files changed, 467 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/15068/8
--
To view, visit http://gerrit.cloudera.org:8080/15068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iecd4281368d1c9fe88cfe850ea725cd68895712e
Gerrit-Change-Number: 15068
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Anonymous Coward (498)
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8852: Skipping short-circuit config check for coordinator only

2020-02-06 Thread Tamas Mate (Code Review)
Tamas Mate has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15173


Change subject: IMPALA-8852: Skipping short-circuit config check for 
coordinator only
..

IMPALA-8852: Skipping short-circuit config check for coordinator only

ImpalaD should not abort when running in Coordinator only mode and
DataNode is not avaiable on the hosts. This change adds a condition to
skip the short-circuit configuration validations when ImpalaD is started
with 'is_executor=false' flag.

Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
---
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
2 files changed, 50 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/15173/2
--
To view, visit http://gerrit.cloudera.org:8080/15173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I373d4037f4cee203322a398b77b75810ba708bb5
Gerrit-Change-Number: 15173
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 


[Impala-ASF-CR] IMPALA-9287: Add support for embedded HMS in CDP builds

2020-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Add support for embedded HMS in CDP builds
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5137/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 06 Feb 2020 09:38:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9287: Add support for embedded HMS in CDP builds

2020-02-06 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Add support for embedded HMS in CDP builds
..


Patch Set 12:

> (1 comment)

The error is: cannot found class PartitionExpressionForMetastore. I add class 
one by one to keep pom files minimal, all including four classed.


--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 06 Feb 2020 08:56:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9287: Add support for embedded HMS in CDP builds

2020-02-06 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Add support for embedded HMS in CDP builds
..


Patch Set 12:

(4 comments)

Thanks for your review. And I've already fixed this as possible.

http://gerrit.cloudera.org:8080/#/c/15057/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15057/10//COMMIT_MSG@7
PS10, Line 7: IMPALA-9287: Add support for embedded HMS in CDP builds
> I suggest modify the title of the JIRA which is more appropriate. Something
Done


http://gerrit.cloudera.org:8080/#/c/15057/10/fe/pom.xml
File fe/pom.xml:

http://gerrit.cloudera.org:8080/#/c/15057/10/fe/pom.xml@1210
PS10, Line 1210: 
> Since we are removing the test scope this comment can be edited/removed.
Done


http://gerrit.cloudera.org:8080/#/c/15057/10/fe/pom.xml@1214
PS10, Line 1214:   3.2.0-m3
> please add a runtime here, along with the comment pointing t
Done


http://gerrit.cloudera.org:8080/#/c/15057/10/fe/pom.xml@1233
PS10, Line 1233:   org.apache.hive
> can you add runtime here? Also, please add a comment as to w
Done



--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 06 Feb 2020 08:54:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9287: Add support for embedded HMS in CDP builds

2020-02-06 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Add support for embedded HMS in CDP builds
..


Patch Set 12:

> (1 comment)
 >
 > I ran a test with USE_CDP_HIVE=true and it passed:
 > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch-cdp-hive/60/
 >
 > I think we just need to make sure the addition in pom files is
 > minimal.

Thanks for ran test, Quanlong. I've already reduce some unnecessary classes to 
keep pom files minimal.


--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 06 Feb 2020 08:52:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9287: Add support for embedded HMS in CDP builds

2020-02-06 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Add support for embedded HMS in CDP builds
..

IMPALA-9287: Add support for embedded HMS in CDP builds

In some situations, an embedded HMS is enough for catalogd server.
And we've already implemented this in IMPALA-8974. But after
setting USE_CDP_HIVE=true and rebuilt impala, the custom cluster
test case test_kudu_table_create_without_hms would failed due to
lacking of related jars. The solution is to add related maven
dependency in $IMPALA_HOME/fe/pom.xml and
$IMPALA_HOME/shaded-deps/pom.xml.

Tests:
  * Ran test_kudu_table_create_without_hms.py by setting
  USE_CDP_HIVE=true locally

Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
---
M bin/bootstrap_system.sh
M fe/pom.xml
M shaded-deps/pom.xml
M tests/custom_cluster/test_kudu_table_create_without_hms.py
4 files changed, 37 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15057/12
--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 12
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng