[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of the DataFiles are returned. This commit adds an extra check to prevent scanning tables that have delete files to avoid unexpected results till merge on read is supported. Metadata operations are allowed on tables with delete files.

Testing:
- Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Reviewed-on: http://gerrit.cloudera.org:8080/18383
Reviewed-by: Zoltan Borok-Nagy
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
17 files changed, 217 insertions(+), 9 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 19:35:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Code-Review+2 Thanks for fixing this! LGTM! -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 16:30:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8036/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 15:08:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: This build hit IMPALA-7864, restarting dry_run. -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 15:08:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8034/ -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 15:03:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8034/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 10:34:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10422/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 11 Apr 2022 08:35:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of the DataFiles are returned. This commit adds an extra check to prevent scanning tables that have delete files to avoid unexpected results till merge on read is supported. Metadata operations are allowed on tables with delete files.

Testing:
- Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
17 files changed, 217 insertions(+), 9 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/18383/5

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18383/4/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18383/4/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@122
PS4, Line 122: dataFile
> Instead of invoking planFiles() multiple times (once in hasDeleteFile(), an
The DeleteFile objects are stored separately in the FileScanTask and are not added to the DataFile list currently.
https://github.com/apache/iceberg/blob/62a53fed6fad24616aea7170d254e529602fabf1/api/src/main/java/org/apache/iceberg/FileScanTask.java#L41
https://github.com/apache/impala/blob/7b235eebd5dda9074e2b7724e6b290f49c1bb8ce/fe/src/main/java/org/apache/impala/util/IcebergUtil.java#L533

There is a shared ancestor called ContentFile, but I do not think that we should merge them, because the DeleteFiles are linked to the DataFiles. The other option I could think of was to return the FileScanTask objects in the getIcebergDataFiles(), but in that case the caller should handle the DataFile collection.

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 08 Apr 2022 08:41:00 +
Gerrit-HasComments: Yes
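A minimal sketch of the second option mentioned in the reply above: walking the planned tasks once and inspecting the delete files attached to each task while collecting its data file. The types here (`DeleteAwareTask`, `SinglePassScanSketch`) are hypothetical stand-ins written for illustration, not the Iceberg or Impala classes, and the `UnsupportedOperationException` is an assumed error type. This is not necessarily what the merged change does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a scan task that carries one data file plus the
// delete files linked to it. It only mimics the shape described in the thread.
final class DeleteAwareTask {
  private final String dataFile;
  private final List<String> deletes;

  DeleteAwareTask(String dataFile, List<String> deletes) {
    this.dataFile = dataFile;
    this.deletes = deletes;
  }

  String file() { return dataFile; }
  List<String> deletes() { return deletes; }
}

final class SinglePassScanSketch {
  // Collects data files in a single pass and refuses to continue as soon as
  // any task has delete files attached. The exception type is an assumption.
  static List<String> collectDataFilesOrFail(Iterable<DeleteAwareTask> tasks) {
    List<String> dataFiles = new ArrayList<>();
    for (DeleteAwareTask task : tasks) {
      if (!task.deletes().isEmpty()) {
        throw new UnsupportedOperationException(
            "Delete files found for data file: " + task.file());
      }
      dataFiles.add(task.file());
    }
    return dataFiles;
  }

  public static void main(String[] args) {
    List<DeleteAwareTask> tasks = Arrays.asList(
        new DeleteAwareTask("a.parquet", Collections.<String>emptyList()),
        new DeleteAwareTask("b.parquet", Arrays.asList("b-deletes.parquet")));
    try {
      collectDataFilesOrFail(tasks);
    } catch (UnsupportedOperationException e) {
      System.out.println("Scan rejected: " + e.getMessage());
    }
  }
}
```

The trade-off raised in the reply still applies: a caller that receives tasks rather than data files has to do the data-file collection itself, as `collectDataFilesOrFail` does here.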
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18383/4/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18383/4/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@122
PS4, Line 122: dataFile
Instead of invoking planFiles() multiple times (once in hasDeleteFile(), and once in getIcebergDataFiles()), we could just check dataFile.content() here:
https://github.com/apache/iceberg/blob/4b382e293f946fdb966e70f2f598cd5daf550651/api/src/main/java/org/apache/iceberg/DataFile.java#L100

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Thu, 07 Apr 2022 17:20:58 +
Gerrit-HasComments: Yes
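A rough illustration of the idea in the comment above: classifying each file by its content type while iterating an already collected list, so that no second planning call is needed. `SketchFileContent` and `SketchContentFile` are hypothetical stand-ins; the enum values are modeled on Iceberg's data / position-delete / equality-delete distinction, but nothing here calls the real library. The check can only help if delete files actually appear in the list being iterated, which the author's reply in this thread says is currently not the case.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a file content type.
enum SketchFileContent { DATA, POSITION_DELETES, EQUALITY_DELETES }

// Hypothetical stand-in for a file entry that exposes its content type.
final class SketchContentFile {
  private final String path;
  private final SketchFileContent content;

  SketchContentFile(String path, SketchFileContent content) {
    this.path = path;
    this.content = content;
  }

  String path() { return path; }
  SketchFileContent content() { return content; }
}

final class ContentCheckSketch {
  // Anything that is not plain data is treated as a delete file.
  static boolean isDeleteFile(SketchContentFile file) {
    return file.content() != SketchFileContent.DATA;
  }

  // Single pass over the list; stops at the first delete file found.
  static boolean anyDeleteFile(List<SketchContentFile> files) {
    for (SketchContentFile file : files) {
      if (isDeleteFile(file)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    List<SketchContentFile> files = Arrays.asList(
        new SketchContentFile("a.parquet", SketchFileContent.DATA),
        new SketchContentFile("a-pos.parquet", SketchFileContent.POSITION_DELETES));
    System.out.println("Has delete files: " + anyDeleteFile(files));
  }
}
```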
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 4 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Apr 2022 17:13:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8030/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 4 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Apr 2022 12:47:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10412/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 4 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Apr 2022 12:33:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10411/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Apr 2022 12:31:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of the DataFiles are returned. This commit adds an extra check to prevent scanning tables that have delete files to avoid unexpected results till merge on read is supported. Metadata operations are allowed on tables with delete files.

Testing:
- Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
15 files changed, 225 insertions(+), 2 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/18383/4

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 4
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of the DataFiles are returned. This commit adds an extra check to prevent scanning tables that have delete files to avoid unexpected results till merge on read is supported.

Testing:
- Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
---
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
14 files changed, 199 insertions(+), 6 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/18383/3

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 3
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8027/ -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Apr 2022 13:41:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 2:

(1 comment)

Thank you for the review Zoltan, updated the CR.

http://gerrit.cloudera.org:8080/#/c/18383/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/18383/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@566
PS2, Line 566: throw new TableLoadingException("Data file list collection failed.");
> I think it would be useful to include the original exception as well, i.e.
agree, done

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 06 Apr 2022 12:57:11 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18383/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/18383/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@566
PS2, Line 566: throw new TableLoadingException("Data file list collection failed.");
I think it would be useful to include the original exception as well, i.e. use: TableLoadingException(String s, Throwable cause)

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 06 Apr 2022 12:45:39 +
Gerrit-HasComments: Yes
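A small sketch of the exception-chaining pattern requested above. The `TableLoadingException` class below is a local stand-in that only reproduces the `(String, Throwable)` constructor named in the comment; `listFiles()` and the `IOException` it throws are assumptions made for illustration, since the thread does not show which exception is caught at line 566.

```java
import java.io.IOException;

// Local stand-in with the two-argument constructor named in the review.
// It is not the Impala class.
class TableLoadingException extends Exception {
  TableLoadingException(String message, Throwable cause) {
    super(message, cause);
  }
}

final class ExceptionChainingSketch {
  // Hypothetical helper that fails the way a file listing might.
  static void listFiles() throws IOException {
    throw new IOException("manifest unreadable");
  }

  static void loadDataFiles() throws TableLoadingException {
    try {
      listFiles();
    } catch (IOException e) {
      // Passing e as the cause keeps the original message and stack trace
      // attached to the higher-level error.
      throw new TableLoadingException("Data file list collection failed.", e);
    }
  }

  public static void main(String[] args) {
    try {
      loadDataFiles();
    } catch (TableLoadingException e) {
      System.out.println(e.getMessage() + " Cause: " + e.getCause().getMessage());
    }
  }
}
```

With the cause attached, a log or stack trace shows both the high-level failure and the underlying reason, which the single-argument form would drop.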
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8027/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Apr 2022 12:00:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 ) Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10405/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18383 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e Gerrit-Change-Number: 18383 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Apr 2022 09:23:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of the DataFiles are returned. This commit adds an extra check to prevent scanning tables that have delete files to avoid unexpected results till merge on read is supported.

Testing:
- Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
---
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
14 files changed, 199 insertions(+), 6 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/18383/2

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tamas Mate
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/10398/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 05 Apr 2022 13:31:06 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

Patch Set 1:

(1 comment)

Thanks for fixing this!

http://gerrit.cloudera.org:8080/#/c/18383/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/18383/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@556
PS1, Line 556: fileScanTasks
Seems like we don't need this variable, the for-loop could just remain:

  for (FileScanTask task : scan.planFiles()) {

--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Tue, 05 Apr 2022 13:18:58 +
Gerrit-HasComments: Yes
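[Editor's sketch] The review comment above suggests iterating over scan.planFiles() directly rather than storing the tasks in a separate variable, and the commit message describes rejecting scans of tables that carry delete files. The following is a minimal, self-contained illustration of that pattern, not the code from the patch. Task, Scan, collectDataFiles and the error text are hypothetical stand-ins introduced here so the example runs without the Iceberg or Impala libraries; in the real code the loop variable would be Iceberg's FileScanTask.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins only; these are NOT the Iceberg or Impala classes.
class DeleteFileCheckSketch {
  /** Stand-in for a scan task: one data file plus the delete files applying to it. */
  static final class Task {
    final String dataFilePath;
    final List<String> deleteFilePaths;

    Task(String dataFilePath, List<String> deleteFilePaths) {
      this.dataFilePath = dataFilePath;
      this.deleteFilePaths = deleteFilePaths;
    }
  }

  /** Stand-in for a table scan whose planFiles() yields the tasks to read. */
  interface Scan {
    Iterable<Task> planFiles();
  }

  /**
   * Collects the data file paths of a scan. Iterates planFiles() directly,
   * with no intermediate variable, and fails as soon as any task references
   * a delete file, so that rows marked as deleted are never returned.
   */
  static List<String> collectDataFiles(Scan scan) {
    List<String> dataFiles = new ArrayList<>();
    for (Task task : scan.planFiles()) {
      if (!task.deleteFilePaths.isEmpty()) {
        // Hypothetical error text; the real message is not shown in this thread.
        throw new IllegalStateException(
            "Table has delete files, scanning is not supported: " + task.dataFilePath);
      }
      dataFiles.add(task.dataFilePath);
    }
    return dataFiles;
  }
}
```

Failing on the first task that has delete files keeps the check cheap, at the cost of reporting only one offending data file; whether the actual patch reports one file or all of them is not stated in this thread.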
[Impala-ASF-CR] IMPALA-11023: Raise error when delete file is found in an Iceberg table
Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18383 )

Change subject: IMPALA-11023: Raise error when delete file is found in an Iceberg table
..

IMPALA-11023: Raise error when delete file is found in an Iceberg table

Iceberg V2 DeleteFiles are skipped during scans and the whole content of
the DataFiles are returned. This commit adds an extra check to prevent
scanning tables that have delete files to avoid unexpected results till
merge on read is supported.

Testing:
 - Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
---
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/0-0-fb178c51-b12a-4c5f-a66e-a8e9375daeba-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-1.parquet
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/0eadf173-0c84-4378-a9d0-5d7f47183978-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/8cbef400-daea-478a-858a-2baf2438f644-m0.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-5725822353600261755-1-0eadf173-0c84-4378-a9d0-5d7f47183978.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/snap-6816997371555012807-1-8cbef400-daea-478a-858a-2baf2438f644.avro
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v1.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/v2.metadata.json
A testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
14 files changed, 194 insertions(+), 5 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/18383/1
--
To view, visit http://gerrit.cloudera.org:8080/18383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Gerrit-Change-Number: 18383
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate