[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.
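
A minimal sketch of the catalog selection described above, assuming
hypothetical names (CatalogKind, resolveCatalog, hmsUpdatedByIcebergApi) that
are not taken from the patch. The 'hive.catalog' string appears in the review
comments; 'hadoop.catalog' is an assumed counterpart for HadoopCatalog:

```java
import java.util.Locale;

final class IcebergCatalogSketch {
  // Hypothetical stand-in for the catalog types discussed in this change.
  enum CatalogKind { HIVE_CATALOG, HADOOP_CATALOG }

  // The commit makes HiveCatalog the default Iceberg catalog.
  static final CatalogKind DEFAULT_CATALOG = CatalogKind.HIVE_CATALOG;

  // Maps a table property value to a catalog kind; a missing value means "use the default".
  // The accepted strings are assumptions for illustration only.
  static CatalogKind resolveCatalog(String value) {
    if (value == null) return DEFAULT_CATALOG;
    switch (value.toLowerCase(Locale.ROOT)) {
      case "hive.catalog": return CatalogKind.HIVE_CATALOG;
      case "hadoop.catalog": return CatalogKind.HADOOP_CATALOG;
      default: throw new IllegalArgumentException("Unknown Iceberg catalog: " + value);
    }
  }

  // For HiveCatalog tables the Iceberg API call also modifies the HMS table, so a
  // caller would not issue a separate HMS update (as with integrated Kudu tables).
  static boolean hmsUpdatedByIcebergApi(CatalogKind kind) {
    return kind == CatalogKind.HIVE_CATALOG;
  }
}
```

In this sketch, a caller such as CatalogOpExecutor would check
hmsUpdatedByIcebergApi(...) before deciding whether to touch HMS itself.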

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Reviewed-on: http://gerrit.cloudera.org:8080/16721
Reviewed-by: wangsheng 
Tested-by: Impala Public Jenkins 
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
16 files changed, 577 insertions(+), 92 deletions(-)

Approvals:
  wangsheng: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16721
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 20:39:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7:

Thanks for the review! After +2 we can run the verify job with DRY_RUN=false, 
so on success the job submits the patch.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 15:18:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6686/ 
DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 15:10:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7: Code-Review+2

Thanks for this new feature, Zoltan, LGTM!



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 15:10:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7698/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 13:40:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 7:

(2 comments)

Thanks for the quick review!

http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@84
PS6, Line 84:   // Internal Iceberg table property that specifies the absolute path of the current
            :   // table metadata.
> We could add an explanation here: this property is only valid for 'hive.catalog'
Done


http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test:

http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@127
PS6, Line 127: CREATE TABLE iceberg_hadoop_cat_with_metadata_locacti
> Shall we add a test for HadoopCatalog here?
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 13:18:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#7).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
16 files changed, 577 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 6:

(2 comments)

Thanks for the quick turnaround, just two nits.

http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@84
PS6, Line 84:   // Internal Iceberg table property that specifies the absolute path of the current
            :   // table metadata.
We could add an explanation here: this property is only valid for 'hive.catalog' or
'HiveCatalog'.


http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test:

http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@127
PS6, Line 127: CREATE TABLE iceberg_hive_cat_with_metadata_locaction
Shall we add a test for HadoopCatalog here?




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 11:22:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7695/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 10:05:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7694/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 10:04:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 6:

(3 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@419
PS4, Line 419: cebergTable.METAD
> This table property is generated by HiveCatalog, I'm curious about:
Good point! 'metadata_location' is internal to Iceberg, so we shouldn't allow
users to modify it.

Updated the code accordingly.
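
A minimal sketch of the kind of check discussed here, not the code from the
patch. The class and method names are hypothetical, and
IllegalArgumentException stands in for whatever analysis error the frontend
would actually raise:

```java
import java.util.Map;

final class TblPropertiesCheckSketch {
  // Property key named in the review; the constant name itself is an assumption.
  static final String METADATA_LOCATION = "metadata_location";

  // Rejects user-supplied table properties that try to set the Iceberg-internal
  // metadata location, e.g. in CREATE TABLE or ALTER TABLE SET TBLPROPERTIES.
  static void checkNoInternalProperties(Map<String, String> tblProperties) {
    if (tblProperties.containsKey(METADATA_LOCATION)) {
      throw new IllegalArgumentException("Table property '" + METADATA_LOCATION
          + "' is internal to Iceberg and cannot be set manually.");
    }
  }
}
```

The same helper could be called from both the CREATE and the ALTER analysis
paths, which covers the first two questions raised in the review.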


http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65
PS4, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location, properties);
            :   }
> nit: This fits on one line; two lines are unnecessary.
Done


http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@242
PS4, Line 242:* Get TIcebergFileFormat from a string, usually from table properties.
             :*
> Maybe we need to add a comment here, since we changed the default value to 'PAR
Updated the code a bit. This method now returns PARQUET when 'format' is null,
which can happen when the table was created by another engine, and returns
null when the format string is invalid.
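
The behavior described above can be sketched as follows. This is an
illustration only: FileFormat is a hypothetical stand-in for the
Thrift-generated TIcebergFileFormat, and the case-insensitive matching is an
assumption:

```java
import java.util.Locale;

final class IcebergFileFormatSketch {
  // Hypothetical stand-in for the Thrift-generated TIcebergFileFormat enum.
  enum FileFormat { PARQUET, ORC }

  // Returns PARQUET when no format is recorded (e.g. the table was created by
  // another engine), the matching value for a recognized string, and null for
  // an unrecognized string so the caller can report an error.
  static FileFormat parseFileFormat(String format) {
    if (format == null) return FileFormat.PARQUET;
    switch (format.trim().toUpperCase(Locale.ROOT)) {
      case "PARQUET": return FileFormat.PARQUET;
      case "ORC": return FileFormat.ORC;
      default: return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseFileFormat(null));   // prints "PARQUET"
    System.out.println(parseFileFormat("orc"));  // prints "ORC"
    System.out.println(parseFileFormat("csv"));  // prints "null"
  }
}
```

Returning null for an invalid string keeps "missing" and "invalid" apart, so
only the missing case falls back to the default.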




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 09:45:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#6).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
16 files changed, 568 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#5).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
16 files changed, 568 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-20 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@419
PS4, Line 419: metadata_location
This table property is generated by HiveCatalog. I'm curious about:
1. Can we set this table property when creating a table? If not, I think we
need to add a check in the code.
2. Can we alter a table to set this table property? If not, we also need to
add a check in the code.
3. Maybe we should define a static variable in IcebergTable.java, just like
ICEBERG_CATALOG, and reference it here.

As far as I know, 'metadata_location' refers to a metadata file's absolute
path, and maybe we cannot modify this property manually.


http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65
PS4, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location,
            : properties);
nit: This fits on one line; two lines are unnecessary.


http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@242
PS4, Line 242:* Get TIcebergFileFormat from a string, usually from table properties
             :*/
Maybe we need to add a comment here, since we changed the default value from
'null' to 'PARQUET'.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 20 Nov 2020 07:59:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012
PS2, Line 2012: ncompleteTable && isSynchronizedIceber
> We still need to invoke HMS dropTable for synchronized tables that don't ha
I am also unsure about this scenario, but I preferred not to change the 
original handling.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 19 Nov 2020 15:10:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7688/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 19 Nov 2020 11:58:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7687/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 19 Nov 2020 11:48:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 4:

PS4 is only a rebase.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 19 Nov 2020 11:42:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#4).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
14 files changed, 524 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@29
PS1, Line 29: e2e
> e2e
Done


http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012
PS2, Line 2012:
> This will "double drop" Kudu tables where  existingTbl instanceof Incomplet
We still need to invoke HMS dropTable for synchronized tables that don't have 
HMS integration enabled.

So the "double drop" can only happen when

 existingTbl instanceof IncompleteTable &&
 msTbl table could be retrieved &&
 isHmsIntegrationAutomatic(msTbl)

I'm not sure we can hit such a scenario in normal usage, but I restricted 
this condition to Iceberg tables anyway.
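
The three-part condition listed in this reply can be written out as a small predicate. This is only a sketch: the class and method names are hypothetical, and the three booleans stand in for `existingTbl instanceof IncompleteTable`, a successful msTbl lookup, and `isHmsIntegrationAutomatic(msTbl)`. It is not the code in CatalogOpExecutor.

```java
/** Hypothetical sketch; not the actual CatalogOpExecutor logic. */
final class DoubleDropSketch {
  /** Returns true only when all three conditions listed in the reply hold. */
  static boolean isDoubleDropRisk(boolean isIncompleteTable,
      boolean msTblRetrieved, boolean hmsIntegrationAutomatic) {
    return isIncompleteTable && msTblRetrieved && hmsIntegrationAutomatic;
  }

  public static void main(String[] args) {
    // Enumerate all eight input combinations and print the risky ones.
    boolean[] values = {false, true};
    for (boolean incomplete : values) {
      for (boolean retrieved : values) {
        for (boolean integrated : values) {
          if (isDoubleDropRisk(incomplete, retrieved, integrated)) {
            System.out.println("double drop possible: incomplete=" + incomplete
                + " msTblRetrieved=" + retrieved
                + " hmsIntegration=" + integrated);
          }
        }
      }
    }
  }
}
```

Only one of the eight combinations is flagged, which matches the remark that the scenario is unlikely in normal usage.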


http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2015
PS2, Line 2015: !isHmsIntegrationA
> it calls dropTable, so needsHmsDropTable would be clearer
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 19 Nov 2020 11:29:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-19 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#3).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2e tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
14 files changed, 524 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012
PS2, Line 2012: existingTbl instanceof IncompleteTable
This will "double drop" Kudu tables where existingTbl instanceof 
IncompleteTable, but msTbl could be retrieved and indicates a 
synchronized Kudu table, as we already dropped them in line 1998. My guess is 
that this will result in an exception from HMS dropTable, which would leave 
the table in catalogd.


http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2015
PS2, Line 2015: needsHmsAlterTable
It calls dropTable, so needsHmsDropTable would be clearer.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 18 Nov 2020 16:48:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6665/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 18 Nov 2020 15:57:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7671/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 18 Nov 2020 10:40:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6665/ 
DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Wed, 18 Nov 2020 10:32:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 2:

(16 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@11
PS1, Line 11: when the table data is stored in object stores like S3.
> Just curious - is this related to eventual consistency? If yes, then I think
Iceberg requires that the underlying filesystem supports atomic renames. I'm 
not sure if S3Guard solves that.


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@606
PS1, Line 606: TIcebergCatalog catalog;
> Can you add a comment about HIVE_CATALOG being the default here?
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65
PS1, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location,
> remove comment
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@73
PS1, Line 73: TableIdentifier tableId = IcebergUtil.getIcebergTableIdentifier(feTable);
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@81
PS1, Line 81: try {
> Can we check for tableLocation==null too?
No, in Iceberg util we pass both tableId and tableLocation to make the code 
simpler.


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@83
PS1, Line 83: } catch (Exception e) {
> I am not 100% sure, but I think it would be better to catch all exceptions
I wrapped them into TableLoadingException.
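
A minimal sketch of the wrapping pattern discussed here, under stated assumptions: `TableLoadingExceptionSketch` is a local stand-in for Impala's TableLoadingException, and `LoadCall` and `loadOrWrap` are hypothetical names. The real patch calls `hiveCatalog_.loadTable(tableId)` inside the try block; here the call is abstracted so the example compiles on its own.

```java
/** Hypothetical stand-in for Impala's TableLoadingException. */
final class TableLoadingExceptionSketch extends Exception {
  TableLoadingExceptionSketch(String msg, Throwable cause) {
    super(msg, cause);
  }
}

/** Hypothetical abstraction over an Iceberg API call that may throw. */
interface LoadCall<T> {
  T call() throws Exception;
}

final class WrapExceptionsSketch {
  /** Runs the call and converts any exception into a single checked type. */
  static <T> T loadOrWrap(String tableName, LoadCall<T> call)
      throws TableLoadingExceptionSketch {
    try {
      return call.call();
    } catch (Exception e) {
      // Keep the original exception as the cause so the stack trace survives.
      throw new TableLoadingExceptionSketch(
          "Failed to load Iceberg table: " + tableName, e);
    }
  }
}
```

Callers then only need to handle one exception type, while the underlying failure stays reachable through `getCause()`.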


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@93
PS1, Line 93: TableIdentifier tableId = IcebergUtil.getIcebergTableIdentifier(feTable);
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1870
PS1, Line 1870: Iceberg'
> Iceberg
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1960
PS1, Line 1960:
> now this is needed for Iceberg tables too
Done


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1976
PS1, Line 1976:   throw new CatalogException(errorMsg);
  :   }
  :
  :   // Retrieve the HMS table to determine if this is a Kudu or Iceberg table.
  :   org.apache.hadoop.hive.metastore.api.Table msTbl = existingTbl.getMetaStoreTable();
  :   if (msTbl == null) {
  :     Preconditions.checkState(existingTbl instanceof IncompleteTable);
  :     Stopwatch hmsLoadSW = Stopwatch.createStarted();
  :     long hmsLoadTime;
> These code blocks seem similar; can we extract them into a method?
I don't think I can do that without some additional refactoring. Even if I 
moved isSynchronizedTable() from KuduTable and IcebergTable to Table, I would 
still need to branch on isKuduTable()/isIcebergTable(), because 
KuduCatalogOpExecutor and IcebergCatalogOpExecutor don't have a common base 
class. I don't want to do too much refactoring in the context of this patch, 
so I might just leave it as it is.


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985
PS1, Line 1985: y (MetaStoreClient msClient = catalog_.g
> This case (existingTbl instanceof IncompleteTable && isSynchronizedIcebergT
A synchronized table isn't necessarily stored in HiveCatalog. "Synchronized" 
means that the 'external.table.purge' property is true. The Iceberg table 
might still be stored in HadoopTables or HadoopCatalog.

An Iceberg table is incomplete if we couldn't load it via the Iceberg API, 
therefore we cannot execute Iceberg DROP TABLE.
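
To illustrate that the two notions are independent, here is a simplified sketch. It assumes the synchronized check looks only at the 'external.table.purge' table property, as described above, and that the comparison is case-insensitive; the class, the enum, and the use of a plain `Map` in place of the HMS table object are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not Impala's IcebergTable.isSynchronizedTable(). */
final class SynchronizedVsCatalogSketch {
  /** The catalog kinds mentioned in this thread. */
  enum CatalogKind { HADOOP_TABLES, HADOOP_CATALOG, HIVE_CATALOG }

  /** Simplified: only inspects the 'external.table.purge' property. */
  static boolean isSynchronized(Map<String, String> tblProps) {
    return "true".equalsIgnoreCase(tblProps.get("external.table.purge"));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("external.table.purge", "true");
    // The result does not depend on which catalog stores the table.
    for (CatalogKind kind : CatalogKind.values()) {
      System.out.println(kind + " synchronized=" + isSynchronized(props));
    }
  }
}
```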

existingTbl instanceof IncompleteTable && isSynchronizedIcebergTable == true is 
quite an edge case, but it can happen when the underlying directory is 
deleted outside of 

[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-18 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16721

to look at the new patch set (#2).

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2d tests for creating, writing, and altering Iceberg
   tables
 * Added SHOW CREATE TABLE tests

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
14 files changed, 523 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@11
PS1, Line 11: when the table data is stored in object stores like S3.
Just curious - is this related to eventual consistency? If yes, then I think 
that we should mention it, as S3Guard should make S3 usable even with other 
catalogs.


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@606
PS1, Line 606: if (catalogStr == null || catalogStr.isEmpty()) {
Can you add a comment about HIVE_CATALOG being the default here?
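
A sketch of the defaulting behaviour this comment asks to document, assuming a missing or empty catalog string falls back to HIVE_CATALOG. The class, the enum, and the string-to-enum mapping are hypothetical; the patch uses TIcebergCatalog and its own property values.

```java
import java.util.Locale;

/** Hypothetical sketch; not the code in CreateTableStmt. */
final class DefaultCatalogSketch {
  enum IcebergCatalogKind { HADOOP_TABLES, HADOOP_CATALOG, HIVE_CATALOG }

  static IcebergCatalogKind resolve(String catalogStr) {
    // HIVE_CATALOG is the default when the user did not specify a catalog.
    if (catalogStr == null || catalogStr.isEmpty()) {
      return IcebergCatalogKind.HIVE_CATALOG;
    }
    // Assumption: the string matches an enum name, ignoring case.
    // valueOf throws IllegalArgumentException for unknown values.
    return IcebergCatalogKind.valueOf(catalogStr.trim().toUpperCase(Locale.ROOT));
  }
}
```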


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@73
PS1, Line 73:   feTable.getIcebergCatalog() == TIcebergCatalog.HIVE_CATALOG);
nit: +2 indent


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@81
PS1, Line 81: Preconditions.checkState(tableId != null);
Can we check for tableLocation==null too?


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@83
PS1, Line 83:   return hiveCatalog_.loadTable(tableId);
I am not 100% sure, but I think it would be better to catch all exceptions from 
API calls and wrap them in ImpalaRuntimeException.


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@93
PS1, Line 93:   feTable.getIcebergCatalog() == TIcebergCatalog.HIVE_CATALOG);
nit: +2 indent


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1960
PS1, Line 1960: Kudu
now this is needed for Iceberg tables too


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985
PS1, Line 1985: !(existingTbl instanceof IncompleteTable
This case (existingTbl instanceof IncompleteTable && isSynchronizedIcebergTable 
== true) is not clear to me - is it possible for an Iceberg table to be 
IncompleteTable at this point and what do we want to do in this case?

My understanding is that we allow IncompleteTables so that users can clean up 
"junk" tables that fail to load, e.g. Kudu tables that were deleted from Kudu 
but still exist in Impala and HMS (if the table didn't exist in HMS, line 1967 
wouldn't be able to retrieve msTbl). This is the point where we explicitly 
allow this: 
https://github.com/apache/impala/blob/6360657cb4d3b7655d9ff80958b2694ae4609370/fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java#L128

If I understand things correctly, your implementation will not drop the table 
via Iceberg API, and also not drop it via HMS API, but it will be removed from 
Impala catalog at line 2027, causing HMS and catalogd to be out of sync.

This is not a huge issue, as an Iceberg table failing to load means that 
something is already not ok, but its handling could be easily improved:
a. Throw an exception if isSynchronizedIcebergTable is true based on msTbl, but 
it is an IncompleteTable. This would notify the user that something is wrong 
and keep HMS and catalogd in sync.
b. Try to drop the table via Iceberg API even if it is an IncompleteTable - 
this would need changes in Iceberg dropTable functions to be able to work with 
a single HMS table argument.
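
Option (a) could look roughly like the guard below. This is a sketch under assumptions: `CatalogExceptionSketch` stands in for Impala's CatalogException, the method name and error message are hypothetical, and the two booleans represent `existingTbl instanceof IncompleteTable` and the `isSynchronizedIcebergTable` flag computed from msTbl.

```java
/** Hypothetical sketch of option (a); not part of the patch. */
final class IncompleteDropCheckSketch {
  /** Hypothetical stand-in for Impala's CatalogException. */
  static final class CatalogExceptionSketch extends Exception {
    CatalogExceptionSketch(String msg) {
      super(msg);
    }
  }

  /** Rejects the drop instead of silently leaving HMS and catalogd out of sync. */
  static void checkCanDrop(String tableName, boolean isIncompleteTable,
      boolean isSynchronizedIcebergTable) throws CatalogExceptionSketch {
    if (isIncompleteTable && isSynchronizedIcebergTable) {
      throw new CatalogExceptionSketch("Cannot drop synchronized Iceberg table '"
          + tableName + "' because it failed to load.");
    }
  }
}
```

The guard would have to run before the table is removed from the Impala catalog for it to keep the two stores consistent.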


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1991
PS1, Line 1991:   // Check to make sure we don't drop a view with "drop table" statement and
  :   // vice versa. is_table field is marked optional in TDropTableOrViewParams to
  :   // maintain catalog api compatibility.
  :   // TODO: Remove params.isSetIs_table() check once catalog api compatibility is
  :   // fixed.
  :   if (params.isSetIs_table() && ((params.is_table && existingTbl instanceof View)
  :   || (!params.is_table && !(existingTbl instanceof View)))) {
  : String errorMsg = "DROP " 

[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-16 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1:

(3 comments)

Thanks for the new feature, Zoltan. Just some nits. I will take a deeper look 
tomorrow.

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1870
PS1, Line 1870: Icebergs
Iceberg


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1976
PS1, Line 1976: boolean isSynchronizedKuduTable = msTbl != null &&
  :   KuduTable.isKuduTable(msTbl) && KuduTable.isSynchronizedTable(msTbl);
  :   if (isSynchronizedKuduTable) {
  :     KuduCatalogOpExecutor.dropTable(msTbl, /* if exists */ true);
  :   }
  :
  :   boolean isSynchronizedIcebergTable = msTbl != null &&
  :   IcebergTable.isIcebergTable(msTbl) &&
  :   IcebergTable.isSynchronizedTable(msTbl);
These code blocks seem similar; can we extract them into a method?
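
One possible shape for such a helper, covering only the flag computation and not the format-specific drop calls. This is a sketch: the class and method names are hypothetical, and a generic type with two predicates stands in for the HMS table and the static `isKuduTable`/`isIcebergTable` and `isSynchronizedTable` checks.

```java
import java.util.function.Predicate;

/** Hypothetical sketch of a shared helper; not part of the patch. */
final class SynchronizedFlagSketch {
  /**
   * Mirrors the pattern "msTbl != null && isXTable(msTbl) &&
   * isSynchronizedTable(msTbl)" for any table format.
   */
  static <T> boolean isSynchronizedTableOf(T msTbl, Predicate<T> isFormat,
      Predicate<T> isSynchronized) {
    return msTbl != null && isFormat.test(msTbl) && isSynchronized.test(msTbl);
  }
}
```

With method references, a call site might read `isSynchronizedTableOf(msTbl, KuduTable::isKuduTable, KuduTable::isSynchronizedTable)`, assuming those static methods accept the HMS table as shown in the quoted code.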


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985
PS1, Line 1985: if (!(existingTbl instanceof IncompleteTable) &&
  :   isSynchronizedIcebergTable)
One line is OK here; there is no need to split it across two lines.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 16 Nov 2020 09:36:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6653/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 13 Nov 2020 22:59:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6653/ 
DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 13 Nov 2020 17:36:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7648/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 13 Nov 2020 17:30:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16721 )

Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@29
PS1, Line 29: e2d
e2e


http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65
PS1, Line 65: // We pass null as 'location' to let Iceberg decide the table location.
remove comment


http://gerrit.cloudera.org:8080/#/c/16721/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test:

http://gerrit.cloudera.org:8080/#/c/16721/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test@268
PS1, Line 268: 
add test with custom table location



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 13 Nov 2020 17:18:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog

2020-11-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16721 )


Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog
..

IMPALA-10152: Add support for Iceberg HiveCatalog

HiveCatalog is one of Iceberg's catalog implementations. It uses
the Hive metastore and it is the recommended catalog implementation
when the table data is stored in object stores like S3.

This commit updates the Iceberg version to a newer one, and it also
retrieves Iceberg from the CDP distribution because that version of
Iceberg is built against Hive 3 (Impala is only compatible with
Hive 3).

This commit makes HiveCatalog the default Iceberg catalog in Impala
because it can be used in more environments (e.g. cloud stores),
and it is more featureful. Also, other engines that store their
table metadata in HMS will probably use HiveCatalog as well.

Tables stored in HiveCatalog are similar to Kudu tables with HMS
integration, i.e. modifying an Iceberg table via the Iceberg APIs
also modifies the HMS table. So in CatalogOpExecutor we handle
such Iceberg tables similarly to integrated Kudu tables.

Testing:
 * Added e2d tests for creating, writing, and altering Iceberg
   tables

Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
13 files changed, 408 insertions(+), 62 deletions(-)
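
The commit message above says that tables stored in HiveCatalog behave like
HMS-integrated Kudu tables: a change made through the Iceberg APIs also
changes the HMS table. A minimal sketch of what that could mean for a catalog
operation follows. It is not the code in CatalogOpExecutor. The class name,
the enum values, and the step strings are all hypothetical, and the behavior
of the non-HiveCatalog branch (a separate HMS update) is an assumption made
for contrast.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Impala's CatalogOpExecutor. */
public class HmsIntegrationSketch {

  /** Illustrative catalog kinds. The names are assumptions of this sketch. */
  public enum CatalogKind { HIVE_CATALOG, HADOOP_TABLES }

  /**
   * Returns the ordered steps a catalog service might take to alter an
   * Iceberg table, depending on which Iceberg catalog owns the table.
   */
  public static List<String> planAlter(CatalogKind kind) {
    List<String> steps = new ArrayList<>();
    steps.add("alter table via Iceberg API");
    if (kind == CatalogKind.HIVE_CATALOG) {
      // HMS-integrated case: the Iceberg API call already modified the HMS
      // table, so a second HMS write would be redundant. Only reload.
      steps.add("reload table from HMS");
    } else {
      // Assumed non-integrated case: the table metadata lives outside HMS,
      // so the HMS table has to be updated as a separate step.
      steps.add("alter table in HMS");
    }
    return steps;
  }

  public static void main(String[] args) {
    // Print the planned steps for each catalog kind.
    for (CatalogKind kind : CatalogKind.values()) {
      System.out.println(kind + ": " + planAlter(kind));
    }
  }
}
```

Running main prints one line of planned steps per catalog kind. The point of
the branch is that the integrated case performs a single write, which avoids
applying the same change to the HMS table twice.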



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197
Gerrit-Change-Number: 16721
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy