[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Reviewed-on: http://gerrit.cloudera.org:8080/16721 Reviewed-by: wangsheng Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 16 files changed, 577 insertions(+), 92 deletions(-) Approvals: wangsheng: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 20:39:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: Thanks for the review! After +2 we can run the verify job with DRY_RUN=false, so on success the job submits the patch. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 15:18:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6686/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 15:10:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: Code-Review+2 Thanks for this new feature, Zoltan, LGTM! -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 15:10:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7698/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 13:40:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 7: (2 comments) Thanks for the quick review! http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@84 PS6, Line 84: // Internal Iceberg table property that specifies the absolute path of the current : // table metadata. > We may add some explain here: this propery is only valid for 'hive.catalog' Done http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test: http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@127 PS6, Line 127: CREATE TABLE iceberg_hadoop_cat_with_metadata_locacti > Shall we add a test for HadoopCatalog here? Done -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 13:18:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#7). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 16 files changed, 577 insertions(+), 92 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/7 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 6: (2 comments) Thanks for a quick turnaround, just two nits. http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16721/6/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@84 PS6, Line 84: // Internal Iceberg table property that specifies the absolute path of the current : // table metadata. We may add some explain here: this propery is only valid for 'hive.catalog' or 'HiveCatalog' http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test: http://gerrit.cloudera.org:8080/#/c/16721/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@127 PS6, Line 127: CREATE TABLE iceberg_hive_cat_with_metadata_locaction Shall we add a test for HadoopCatalog here? -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 11:22:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7695/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 10:05:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7694/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 10:04:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 6: (3 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@419 PS4, Line 419: cebergTable.METAD > This table property is generated by HiveCatalog, I'm curious about: Good point! 'metadata_location' is internal to Iceberg so we shouldn't allow users modifying it. Updated the code accordingly. http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65 PS4, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location, properties); : } > nits: One line is ok, unnecessary for two lines. Done http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@242 PS4, Line 242:* Get TIcebergFileFormat from a string, usually from table properties. :* > Maybe we need add a comment here, since we change the default value to 'PAR Updated the code a bit. Now this method is returning PARQUET when 'format' is null. Format can be null when the table was created by other engines. And returning null when the format string is invalid. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 09:45:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#6). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 16 files changed, 568 insertions(+), 92 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/6 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#5). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 16 files changed, 568 insertions(+), 92 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/5 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@419 PS4, Line 419: metadata_location This table property is generated by HiveCatalog, I'm curious about: 1. Can we set this table property when creating table? If not, I think we need to add some check in code; 2. Can we alter table to set this table property? If not, we also need to add some check in code; 3. Maybe we should define a static variable in IcebergTable.java, just like ICEBERG_CATALOG, and we can use a reference here. As far as I know, 'metadata_location' is refer to a metadata file's absolute path, and maybe we cannot modify this property manually. http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65 PS4, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location, : properties); nits: One line is ok, unnecessary for two lines. http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/16721/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@242 PS4, Line 242:* Get TIcebergFileFormat from a string, usually from table properties :*/ Maybe we need add a comment here, since we change the default value to 'PARQUET' to replace 'null' -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 20 Nov 2020 07:59:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012 PS2, Line 2012: ncompleteTable && isSynchronizedIceber > We still need to invoke HMS dropTable for synchronized tables that don't ha I am also unsure about this scenario, but I preferred not to change the original handling. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 15:10:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7688/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:58:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7687/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:48:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 4: PS4 is only a rebase. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:42:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#4). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 14 files changed, 524 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/4 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@29 PS1, Line 29: e2e > e2e Done http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012 PS2, Line 2012: > This will "double drop" Kudu tables where existingTbl instanceof Incomplet We still need to invoke HMS dropTable for synchronized tables that don't have HMS integration enabled. So the "double drop" can only happen when existingTbl instanceof IncompleteTable && msTbl table could be retrieved && isHmsIntegrationAutomatic(msTbl) I'm not sure if we can hit such scenario with normal usage, but anyway I restricted this condition to Iceberg tables. http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2015 PS2, Line 2015: !isHmsIntegrationA > it calls dropTable, so needsHmsDropTable would clearer Done -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 19 Nov 2020 11:29:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#3). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2e tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 14 files changed, 524 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/3 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2012 PS2, Line 2012: existingTbl instanceof IncompleteTable This will "double drop" Kudu tables where existingTbl instanceof IncompleteTable, but msTbl table could be retrieved and it indicates a synchronyzed Kudu table, as we dropped them in line 1998. My guess is that this will result in an exception from HMS dropTable, leading to keeping the table in catalogd. http://gerrit.cloudera.org:8080/#/c/16721/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2015 PS2, Line 2015: needsHmsAlterTable it calls dropTable, so needsHmsDropTable would clearer -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 18 Nov 2020 16:48:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6665/ -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 18 Nov 2020 15:57:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7671/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 18 Nov 2020 10:40:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6665/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 18 Nov 2020 10:32:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 2: (16 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@11 PS1, Line 11: when the table data is stored in object stores like S3. > Just curios - is this related to eventual consistency? If yes, then I think Iceberg requires that the underlying filesystem supports atomic renames. I'm not sure if S3Guard solves that. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@606 PS1, Line 606: TIcebergCatalog catalog; > Can you add a comment about HIVE_CATALOG being the default here? Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65 PS1, Line 65: return hiveCatalog_.createTable(identifier, schema, spec, location, > remove comment Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@73 PS1, Line 73: TableIdentifier tableId = IcebergUtil.getIcebergTableIdentifier(feTable); > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@81 PS1, Line 81: try { > Can we check for tableLocation==null too? No, in Iceberg util we pass both tableId and tableLocation to make the code simpler. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@83 PS1, Line 83: } catch (Exception e) { > I am not 100% sure, but I think it would be better to catch all exceptions I wrapped them into TableLoadingException. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@93 PS1, Line 93: TableIdentifier tableId = IcebergUtil.getIcebergTableIdentifier(feTable); > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1870 PS1, Line 1870: Iceberg' > Iceberg Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1960 PS1, Line 1960: > now this is needed for Iceberg tables too Done http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1976 PS1, Line 1976: throw new CatalogException(errorMsg); : } : : // Retrieve the HMS table to determine if this is a Kudu or Iceberg table. : org.apache.hadoop.hive.metastore.api.Table msTbl = existingTbl.getMetaStoreTable(); : if (msTbl == null) { : Preconditions.checkState(existingTbl instanceof IncompleteTable); : Stopwatch hmsLoadSW = Stopwatch.createStarted(); : long hmsLoadTime; > These codes seems similar, can we extract to a method? I don't think I can do that without some additional refactorings. If I had moved isSynchronizedTable() from KuduTable and IcebergTable to Table, I would still need to branch based on 'isKuduTable()/isIcebergTable()' because KuduCatalogOpexecutor and IcebergCatalogOpExecutor doesn't have a common base class. I don't want to do too much refactorings in the context of this patch, so I might just leave it as it is. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985 PS1, Line 1985: y (MetaStoreClient msClient = catalog_.g > This case (existingTbl instanceof IncompleteTable && isSynchronizedIcebergT Synchronized table doesn't mean that the table is stored in HiveCatalog. It means that the 'external.table.purge' property is true. But the Iceberg table might be stored in HadoopTables or HadoopCatalog. An Iceberg table is incomplete if we couldn't load it via the Iceberg API, therefore we cannot execute Iceberg DROP TABLE. existingTbl instanceof IncompleteTable && isSynchronizedIcebergTable == true is quite of an edge case, but it can happen when the underlying directory is deleted outside of
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16721 to look at the new patch set (#2). Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2d tests for creating, writing, and altering Iceberg tables * Added SHOW CREATE TABLE tests Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 14 files changed, 523 insertions(+), 91 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/2 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@11 PS1, Line 11: when the table data is stored in object stores like S3. Just curios - is this related to eventual consistency? If yes, then I think that we should mention it, as S3Guard should make S3 usable even with other catalogs. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@606 PS1, Line 606: if (catalogStr == null || catalogStr.isEmpty()) { Can you add a comment about HIVE_CATALOG being the default here? http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@73 PS1, Line 73: feTable.getIcebergCatalog() == TIcebergCatalog.HIVE_CATALOG); nit: +2 indent http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@81 PS1, Line 81: Preconditions.checkState(tableId != null); Can we check for tableLocation==null too? http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@83 PS1, Line 83: return hiveCatalog_.loadTable(tableId); I am not 100% sure, but I think it would be better to catch all exceptions from API calls and wrap them in ImpalaRuntimeException. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@93 PS1, Line 93: feTable.getIcebergCatalog() == TIcebergCatalog.HIVE_CATALOG); nit: +2 indent http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1960 PS1, Line 1960: Kudu now this is needed for Iceberg tables too http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985 PS1, Line 1985: !(existingTbl instanceof IncompleteTable This case (existingTbl instanceof IncompleteTable && isSynchronizedIcebergTable == true) is not clear to me - is it possible for an Iceberg table to be IncompleteTable at this point and what do we want to do in this case? My understanding is that we allow IncompleteTables to be able to cleanup up "junk" tables that fail to load, e.g. Kudu tables that were deleted from Kudu but still exist in Impala and HMS (if it wouldn't exist in HMS, then line 1967 wouldn't be able to retrieve msTbl). This is the point where we explicitly allow this: https://github.com/apache/impala/blob/6360657cb4d3b7655d9ff80958b2694ae4609370/fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java#L128 If I understand things correctly, your implementation will not drop the table via Iceberg API, and also not drop it via HMS API, but it will be removed from Impala catalog at line 2027, causing HMS and catalogd to be out of sync. This is not a huge issue, as an Iceberg table failing to load means that something is already not ok, but its handling could be easily improved: a. Throw an exception if isSynchronizedIcebergTable is true based on msTbl, but it is an IncompleteTable. This would notify the user that something is wrong and keep HMS and catalogd in sync. b. Try to drop the table via Iceberg API even if it is an IncompleteTable - this would need changes in Iceberg dropTable functions to be able to work with a single HMS table argument. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1991 PS1, Line 1991: // Check to make sure we don't drop a view with "drop table" statement and : // vice versa. is_table field is marked optional in TDropTableOrViewParams to : // maintain catalog api compatibility. : // TODO: Remove params.isSetIs_table() check once catalog api compatibility is : // fixed. : if (params.isSetIs_table() && ((params.is_table && existingTbl instanceof View) : || (!params.is_table && !(existingTbl instanceof View { : String errorMsg = "DROP "
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: (3 comments) Thanks for new feature, Zoltan. Just some nits. I will take a deeper look tomorrow. http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1870 PS1, Line 1870: Icebergs Iceberg http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1976 PS1, Line 1976: boolean isSynchronizedKuduTable = msTbl != null && : KuduTable.isKuduTable(msTbl) && KuduTable.isSynchronizedTable(msTbl); : if (isSynchronizedKuduTable) { : KuduCatalogOpExecutor.dropTable(msTbl, /* if exists */ true); : } : : boolean isSynchronizedIcebergTable = msTbl != null && : IcebergTable.isIcebergTable(msTbl) && : IcebergTable.isSynchronizedTable(msTbl); These codes seems similar, can we extract to a method? http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1985 PS1, Line 1985: if (!(existingTbl instanceof IncompleteTable) && : isSynchronizedIcebergTable) One line is ok, unnecessary to two lines. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 16 Nov 2020 09:36:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6653/ -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 13 Nov 2020 22:59:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6653/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 13 Nov 2020 17:36:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7648/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 13 Nov 2020 17:30:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16721 ) Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16721/1//COMMIT_MSG@29 PS1, Line 29: e2d e2e http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java: http://gerrit.cloudera.org:8080/#/c/16721/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java@65 PS1, Line 65: // We pass null as 'location' to let Iceberg decide the table location. remove comment http://gerrit.cloudera.org:8080/#/c/16721/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test: http://gerrit.cloudera.org:8080/#/c/16721/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test@268 PS1, Line 268: add test with custom table location -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 13 Nov 2020 17:18:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10152: Add support for Iceberg HiveCatalog
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16721 Change subject: IMPALA-10152: Add support for Iceberg HiveCatalog .. IMPALA-10152: Add support for Iceberg HiveCatalog HiveCatalog is one of Iceberg's catalog implementations. It uses the Hive metastore and it is the recommended catalog implementation when the table data is stored in object stores like S3. This commit updates the Iceberg version to a newer one, and it also retrieves Iceberg from the CDP distribution because that version of Iceberg is built against Hive 3 (Impala is only compatible with Hive 3). This commit makes HiveCatalog the default Iceberg catalog in Impala because it can be used in more environments (e.g. cloud stores), and it is more featureful. Also, other engines that store their table metadata in HMS will probably use HiveCatalog as well. Tables stored in HiveCatalog are similar to Kudu tables with HMS integration, i.e. modifying an Iceberg table via the Iceberg APIs also modifies the HMS table. So in CatalogOpExecutor we handle such Iceberg tables similarly to integrated Kudu tables. Testing: * Added e2d tests for creating, writing, and altering Iceberg tables Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 --- M bin/impala-config.sh M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 13 files changed, 408 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/16721/1 -- To view, visit http://gerrit.cloudera.org:8080/16721 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie574589a1751aaa9ccbd34a89c6819714d103197 Gerrit-Change-Number: 16721 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy