[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots; these
snapshots can be used to access an earlier version of the table. During
the lifetime of a table the number of snapshots can accumulate and older
versions can become obsolete as well. The Iceberg API provides a safe
solution to remove the snapshots that are no longer needed; the
operation is called ExpireSnapshots.

This commit adds a framework to execute Iceberg maintenance operations
on tables and implements the call of an expire snapshots operation.
The following statement becomes available for Iceberg tables:
 - ALTER TABLE <tbl> EXECUTE expire_snapshots(<timestamp>)

The ExpireSnapshots Iceberg API calls were meant to be chained; the
calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less
suitable for chained calls, therefore this commit implements only the
expireOlderThan functionality. However, in this case the retainLast call
falls back to its default value (1); this default can be configured
with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces
ImpalaTestSuite for the common methods that are used in tests which
compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Reviewed-on: http://gerrit.cloudera.org:8080/18688
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
9 files changed, 353 insertions(+), 78 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 11
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
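
The retention rule described in the commit message (expireOlderThan combined with the retainLast default of 1) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala or Iceberg code: the class and method names are hypothetical, the snapshot history is assumed to be linear and sorted oldest first, and "older than" is assumed to mean a strictly smaller timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, no Iceberg dependency.
class ExpireOlderThanSketch {

  // Returns the snapshot timestamps (millis) that survive an expire call.
  // Assumes snapshotMillis is sorted oldest first and the history is linear.
  static List<Long> retained(List<Long> snapshotMillis, long olderThanMillis,
      int minSnapshotsToKeep) {
    List<Long> kept = new ArrayList<>();
    int n = snapshotMillis.size();
    for (int i = 0; i < n; i++) {
      // The newest minSnapshotsToKeep snapshots are always kept.
      boolean protectedByRetainLast = i >= n - minSnapshotsToKeep;
      // Snapshots at or after the cutoff are not candidates for expiry.
      boolean recentEnough = snapshotMillis.get(i) >= olderThanMillis;
      if (protectedByRetainLast || recentEnough) {
        kept.add(snapshotMillis.get(i));
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Long> history = List.of(1000L, 2000L, 3000L);
    // Cutoff between the second and third snapshot.
    System.out.println(retained(history, 2500L, 1));
    // Cutoff newer than every snapshot: one snapshot still survives.
    System.out.println(retained(history, 9999L, 1));
    // Same cutoff with a larger minimum, as if the table property were set to 2.
    System.out.println(retained(history, 9999L, 2));
  }
}
```

In this model a cutoff newer than every snapshot still leaves one snapshot behind, which is the behaviour the commit message attributes to the retainLast default.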
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 15:51:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18688/10/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/18688/10/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@183
PS10, Line 183:     expireApi.expireOlderThan(params.older_than_millis);
              :     expireApi.commit();
nit: you could link these function calls like:
expireApi.expireOlderThan(...).commit();

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 10
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 09 Aug 2022 12:27:22 +
Gerrit-HasComments: Yes
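
The linking suggested in the comment above relies on each configuration method returning the object it was called on. The sketch below shows that equivalence with a hypothetical stand-in class (RecordingExpire is not part of Iceberg or Impala; it only records the calls it receives), so it illustrates the pattern rather than the real ExpireSnapshots implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: RecordingExpire is a hypothetical stand-in for a fluent API.
class ChainedCallsSketch {

  static class RecordingExpire {
    private final List<String> calls = new ArrayList<>();

    RecordingExpire expireOlderThan(long timestampMillis) {
      calls.add("expireOlderThan(" + timestampMillis + ")");
      return this;  // returning the receiver is what makes chaining possible
    }

    void commit() {
      calls.add("commit()");
    }

    List<String> calls() {
      return calls;
    }
  }

  // The two-statement form quoted from patch set 10.
  static List<String> separateStatements(long timestampMillis) {
    RecordingExpire api = new RecordingExpire();
    api.expireOlderThan(timestampMillis);
    api.commit();
    return api.calls();
  }

  // The single-expression form suggested by the reviewer.
  static List<String> chained(long timestampMillis) {
    RecordingExpire api = new RecordingExpire();
    api.expireOlderThan(timestampMillis).commit();
    return api.calls();
  }

  public static void main(String[] args) {
    // Both forms issue the same calls in the same order.
    System.out.println(separateStatements(1000L).equals(chained(1000L)));
  }
}
```

Both forms record the same two calls in the same order, so the choice between them is purely stylistic, which is why the comment is marked as a nit.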
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8420/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 11:00:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 10 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 11:00:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 9: Verify job failure caused by IMPALA-11160. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 9 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 11:00:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8419/ -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 9 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 10:08:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8419/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 9 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 05:12:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 9 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 05:12:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 8: Verify job failed because of a flaky test IMPALA-11352. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 09 Aug 2022 05:11:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8412/ -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 18:30:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 13:54:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 13:34:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots; these
snapshots can be used to access an earlier version of the table. During
the lifetime of a table the number of snapshots can accumulate and older
versions can become obsolete as well. The Iceberg API provides a safe
solution to remove the snapshots that are no longer needed; the
operation is called ExpireSnapshots.

This commit adds a framework to execute Iceberg maintenance operations
on tables and implements the call of an expire snapshots operation.
The following statement becomes available for Iceberg tables:
 - ALTER TABLE <tbl> EXECUTE expire_snapshots(<timestamp>)

The ExpireSnapshots Iceberg API calls were meant to be chained; the
calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less
suitable for chained calls, therefore this commit implements only the
expireOlderThan functionality. However, in this case the retainLast call
falls back to its default value (1); this default can be configured
with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces
ImpalaTestSuite for the common methods that are used in tests which
compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
9 files changed, 353 insertions(+), 78 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/7

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 7: Thanks again for the reviews Zoltan! :) -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 13:34:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8412/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 8 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 13:34:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 6 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 08 Aug 2022 12:02:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 6: Code-Review+1

(1 comment)

One small comment, otherwise LGTM!

http://gerrit.cloudera.org:8080/#/c/18688/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/18688/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1326
PS6, Line 1326: "Snapshots have been expired."
We might have other EXECUTE statements in the future, e.g. ROLLBACK. So
maybe the alterTableExecute statement could return a summary.

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Mon, 08 Aug 2022 11:56:22 +
Gerrit-HasComments: Yes
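
One possible reading of the suggestion above is that the executing method returns a per-operation summary string instead of the caller hard-coding one message. The sketch below is an assumption-laden illustration, not the code from the change: the Op enum, the execute method and the ROLLBACK message are hypothetical, and only the "Snapshots have been expired." text is taken from the quoted line.

```java
// Illustration only: all names here are hypothetical.
class ExecuteSummarySketch {

  // Hypothetical set of EXECUTE operations; only expire snapshots exists in this change.
  enum Op { EXPIRE_SNAPSHOTS, ROLLBACK }

  // Runs the operation (omitted here) and returns a summary for the client.
  static String execute(Op op) {
    switch (op) {
      case EXPIRE_SNAPSHOTS:
        // Real code would call the Iceberg expire snapshots API at this point.
        return "Snapshots have been expired.";
      case ROLLBACK:
        // Placeholder message for a possible future statement.
        return "Rollback is not implemented in this sketch.";
      default:
        throw new IllegalArgumentException("Unknown operation: " + op);
    }
  }

  public static void main(String[] args) {
    System.out.println(execute(Op.EXPIRE_SNAPSHOTS));
  }
}
```

With this shape the caller only forwards whatever summary the operation produced, so adding a new EXECUTE statement does not require touching the result-message code.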
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots; these
snapshots can be used to access an earlier version of the table. During
the lifetime of a table the number of snapshots can accumulate and older
versions can become obsolete as well. The Iceberg API provides a safe
solution to remove the snapshots that are no longer needed; the
operation is called ExpireSnapshots.

This commit adds a framework to execute Iceberg maintenance operations
on tables and implements the call of an expire snapshots operation.
The following statement becomes available for Iceberg tables:
 - ALTER TABLE <tbl> EXECUTE expire_snapshots(<timestamp>)

The ExpireSnapshots Iceberg API calls were meant to be chained; the
calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less
suitable for chained calls, therefore this commit implements only the
expireOlderThan functionality. However, in this case the retainLast call
falls back to its default value (1); this default can be configured
with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces
ImpalaTestSuite for the common methods that are used in tests which
compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
9 files changed, 351 insertions(+), 78 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/6

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18688/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/18688/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5533
PS5, Line 5533:   private void alterTableExecute(Table tbl, TAlterTableExecuteParams params) {
              :     FeIcebergTable iceTbl = (FeIcebergTable)tbl;
              :     ExpireSnapshots expireApi = iceTbl.getIcebergApiTable().expireSnapshots();
              :     expireApi.expireOlderThan(params.older_than_millis);
              :     expireApi.commit();
              :   }
This method could be moved to IcebergCatalogOpExecutor.

http://gerrit.cloudera.org:8080/#/c/18688/5/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18688/5/tests/query_test/test_iceberg.py@68
PS5, Line 68:     # We are setting the TIMEZONE query option in this test, so let's create a local
            :     # impala client.
We are not setting the timezone in this test.

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Thu, 04 Aug 2022 16:19:10 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 5: Code-Review+1 Nice work! -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Jul 2022 15:39:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10951/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Jul 2022 11:23:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 ) Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10950/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3 Gerrit-Change-Number: 18688 Gerrit-PatchSet: 4 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Jul 2022 11:15:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots; these
snapshots can be used to access an earlier version of the table. During
the lifetime of a table the number of snapshots can accumulate and older
versions can become obsolete as well. The Iceberg API provides a safe
solution to remove the snapshots that are no longer needed; the
operation is called ExpireSnapshots.

This commit adds a framework to execute Iceberg maintenance operations
on tables and implements the call of an expire snapshots operation.
The following statement becomes available for Iceberg tables:
 - ALTER TABLE <tbl> EXECUTE expire_snapshots(<timestamp>)

The ExpireSnapshots Iceberg API calls were meant to be chained; the
calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less
suitable for chained calls, therefore this commit implements only the
expireOlderThan functionality. However, in this case the retainLast call
falls back to its default value (1); this default can be configured
with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces
ImpalaTestSuite for the common methods that are used in tests which
compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
8 files changed, 339 insertions(+), 78 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/5

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18688/4/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18688/4/tests/query_test/test_iceberg.py@242
PS4, Line 242: -
> flake8: W504 line break after binary operator
Done

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 12 Jul 2022 11:01:46 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 4:

(9 comments)

Thank you for the review Zoli, updated the change.

http://gerrit.cloudera.org:8080/#/c/18688/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18688/3//COMMIT_MSG@12
PS3, Line 12: a safe
            : solution
> nit: safe solutions / a safe solution
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/18688/3/fe/src/main/cup/sql-parser.cup@1315
PS3, Line 1315: RESULT = new AlterTableExecuteStmt(table, expr);
> Do we need this comment?
We don't, removed.

http://gerrit.cloudera.org:8080/#/c/18688/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java:

http://gerrit.cloudera.org:8080/#/c/18688/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java@37
PS3, Line 37:
            : *TableProperties.MIN_SNAPSHOTS_TO_KEEP table property manages how many snapshots
            : *should be retained even when all snapshots are selected by expire
> I'm not sure we need this part. I think we only need to explain the role of
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java@46
PS3, Line 46: funct
> nit: I think we use upper case letters for constants.
> Also, this could be s
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/common/iceberg_test_suite.py
File tests/common/iceberg_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/common/iceberg_test_suite.py@26
PS3, Line 26: cls,
> nit: class methods take a Class parameter which is usually named 'cls', not
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/common/iceberg_test_suite.py@40
PS3, Line 40: expect_num_snapshot
> maybe: expect_num_snapshots_from
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/query_test/test_iceberg.py@87
PS3, Line 87: ad_client.execut
> nit: we just need execute as we don't use the return value.
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/query_test/test_iceberg.py@91
PS3, Line 91: interval
> nit: interval
Done

http://gerrit.cloudera.org:8080/#/c/18688/3/tests/query_test/test_iceberg.py@95
PS3, Line 95: but retain 1
> We should also add a test which sets table property TableProperties.MIN_SNA
Done

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 12 Jul 2022 10:55:30 +
Gerrit-HasComments: Yes
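For readers unfamiliar with the 'cls' nit in the iceberg_test_suite.py comments, here is a minimal sketch of the convention. The class and method bodies are hypothetical illustrations and are not the helpers from the patch; only the name expect_num_snapshots_from is borrowed from the review suggestion.

```python
class SnapshotAssertions(object):
    """Hypothetical helper class; NOT the IcebergTestSuite from the patch."""

    @classmethod
    def count_snapshots_from(cls, timestamps, start):
        # A class method receives the class itself as its first argument,
        # conventionally named 'cls' (instance methods use 'self').
        return sum(1 for ts in timestamps if ts >= start)

    @classmethod
    def expect_num_snapshots_from(cls, timestamps, start, expected):
        # 'cls' lets one class-level helper call another without an instance.
        actual = cls.count_snapshots_from(timestamps, start)
        assert actual == expected, (
            "expected %d snapshots, found %d" % (expected, actual))
        return actual
```

Called as SnapshotAssertions.expect_num_snapshots_from([10, 20, 30], 20, 2), the helper runs without creating an instance, which is the property the review comment relies on.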
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots, these snapshots can be used to access an earlier version of the table. During the lifetime of a table the number of snapshots can accumulate and older versions can become obsolete as well. Iceberg API provides a safe solution to remove the snapshots that are not needed anymore, the operation is called ExpireSnapshots.

This commit adds framework to execute Iceberg maintenance operation on tables and implements the call of an expire snapshots operation. The following statement becomes available for Iceberg tables:
 - ALTER TABLE EXECUTE expire_snapshots()

ExpireSnapshots Iceberg API calls were meant to be chained, the calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less suitable for chained calls, therefore this commit implements only the expireOlderThan functionality. However, in this case the retainLast call will fall back to its default value (1), this default can be configured with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces ImpalaTestSuite for the common methods that are used in test which compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
8 files changed, 339 insertions(+), 78 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/4
--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
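The interaction between expireOlderThan and the retainLast default described in the commit message above can be illustrated with a small model. This is a simplified sketch, assuming a linear snapshot history; the Snapshot class and the expire_older_than function are hypothetical names introduced here and are neither the Iceberg API nor the code in this patch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot: an id plus a commit time."""
    snapshot_id: int
    timestamp_ms: int


def expire_older_than(snapshots: List[Snapshot], older_than_ms: int,
                      min_snapshots_to_keep: int = 1) -> List[Snapshot]:
    """Return the snapshots that survive an expire-older-than request.

    Assumed model: the newest `min_snapshots_to_keep` snapshots are always
    retained; every other snapshot is dropped if its timestamp is strictly
    older than `older_than_ms`.
    """
    if min_snapshots_to_keep < 1:
        raise ValueError("min_snapshots_to_keep must be at least 1")
    newest_first = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)
    protected = {s.snapshot_id for s in newest_first[:min_snapshots_to_keep]}
    survivors = [s for s in snapshots
                 if s.snapshot_id in protected or s.timestamp_ms >= older_than_ms]
    # Return the survivors in commit order, oldest first.
    return sorted(survivors, key=lambda s: s.timestamp_ms)


history = [Snapshot(1, 1000), Snapshot(2, 2000), Snapshot(3, 3000)]
# Every snapshot is older than the cutoff, yet the newest one is retained.
kept_default = expire_older_than(history, older_than_ms=5000)
# Raising the minimum models a larger TableProperties.MIN_SNAPSHOTS_TO_KEEP.
kept_two = expire_older_than(history, older_than_ms=5000, min_snapshots_to_keep=2)
```

In this model, the default of 1 is what keeps a table readable even when the cutoff timestamp selects every snapshot for expiry.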
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18688/4/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18688/4/tests/query_test/test_iceberg.py@242
PS4, Line 242: -
flake8: W504 line break after binary operator

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 12 Jul 2022 10:56:14 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10909/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 01 Jul 2022 08:14:16 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots, these snapshots can be used to access an earlier version of the table. During the lifetime of a table the number of snapshots can accumulate and older versions can become obsolete as well. Iceberg API provides a safe solutions to remove the snapshots that are not needed anymore, the operation is called ExpireSnapshots.

This commit adds framework to execute Iceberg maintenance operation on tables and implements the call of an expire snapshots operation. The following statement becomes available for Iceberg tables:
 - ALTER TABLE EXECUTE expire_snapshots()

ExpireSnapshots Iceberg API calls were meant to be chained, the calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less suitable for chained calls, therefore this commit implements only the expireOlderThan functionality. However, in this case the retainLast call will fall back to its default value (1), this default can be configured with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces ImpalaTestSuite for the common methods that are used in test which compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
8 files changed, 329 insertions(+), 84 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/2
--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10907/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 01 Jul 2022 07:34:41 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

IMPALA-11362: Add expire snapshots functionality for Iceberg tables

Iceberg table modifications can create new table snapshots, these snapshots can be used to access an earlier version of the table. During the lifetime of a table the number of snapshots can accumulate and older versions can become obsolete as well. Iceberg API provides a safe solutions to remove the snapshots that are not needed anymore, the operation is called ExpireSnapshots.

This commit adds framework to execute Iceberg maintenance operation on tables and implements the call of an expire snapshots operation. The following statement becomes available for Iceberg tables:
 - ALTER TABLE EXECUTE expire_snapshots()

ExpireSnapshots Iceberg API calls were meant to be chained, the calls are expireSnapshotId, expireOlderThan and retainLast. SQL is less suitable for chained calls, therefore this commit implements only the expireOlderThan functionality. However, in this case the retainLast call will fall back to its default value (1), this default can be configured with TableProperties.MIN_SNAPSHOTS_TO_KEEP.

This commit also refactors the Iceberg e2e tests and introduces ImpalaTestSuite for the common methods that are used in test which compare snapshot versions in some way.

Testing:
 - Added analysis unit tests.
 - Added e2e tests.

Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A tests/common/iceberg_test_suite.py
M tests/query_test/test_iceberg.py
8 files changed, 327 insertions(+), 84 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18688/1
--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate
[Impala-ASF-CR] IMPALA-11362: Add expire snapshots functionality for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18688 )

Change subject: IMPALA-11362: Add expire snapshots functionality for Iceberg tables
..

Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/common/iceberg_test_suite.py
File tests/common/iceberg_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/common/iceberg_test_suite.py@20
PS1, Line 20: from tests.common.impala_test_suite import ImpalaTestSuite, LOG
flake8: F401 'tests.common.impala_test_suite.LOG' imported but unused

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/common/iceberg_test_suite.py@22
PS1, Line 22: class IcebergTestSuite(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/query_test/test_iceberg.py@37
PS1, Line 37: class TestIcebergTable(IcebergTestSuite):
flake8: E302 expected 2 blank lines, found 1

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/query_test/test_iceberg.py@90
PS1, Line 90:
flake8: W291 trailing whitespace

http://gerrit.cloudera.org:8080/#/c/18688/1/tests/query_test/test_iceberg.py@90
PS1, Line 90: self.execute_query_ts(impalad_client, expire_q.format(tbl_name,
line has trailing whitespace

--
To view, visit http://gerrit.cloudera.org:8080/18688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideffee4964c18c85ca745bfb4eca08ec362416f3
Gerrit-Change-Number: 18688
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gabor Kaszab
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 01 Jul 2022 07:15:31 +
Gerrit-HasComments: Yes