[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata. Since this checking brings additional overhead for queries, we introduce a flag auto_check_compaction and set it as false by default for now. We will find some other efficient ways to do compaction checking in the future. Tests: Added unit tests to CatalogdMetaProviderTest Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Reviewed-on: http://gerrit.cloudera.org:8080/18043 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 12 files changed, 356 insertions(+), 14 deletions(-) Approvals: Impala 
Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 7 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
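As an illustration of the coordinator-side check described in the commit message, here is a minimal Java sketch of a local cache that compares the latest compaction id against the id recorded when an entry was loaded, and reloads the entry if a newer compaction has happened. This is not the code from the patch: the class `CompactionAwareCache`, its fields, and the two functional parameters (standing in for the HMS compaction lookup and for the metadata fetch) are hypothetical names chosen for the example. Only the flag name `auto_check_compaction` comes from the change itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; not Impala's CatalogdMetaProvider. */
final class CompactionAwareCache {
  /** Cached file metadata plus the compaction id seen when it was loaded. */
  private static final class Entry {
    final String fileMetadata;
    final long compactionId;

    Entry(String fileMetadata, long compactionId) {
      this.fileMetadata = fileMetadata;
      this.compactionId = compactionId;
    }
  }

  private final Map<String, Entry> cache = new HashMap<>();
  // Mirrors the idea of the auto_check_compaction flag (off by default in the patch).
  private final boolean autoCheckCompaction;
  // Assumed stand-in for asking the metastore for the latest compaction id.
  private final ToLongFunction<String> latestCompactionId;
  // Assumed stand-in for fetching fresh file metadata.
  private final Function<String, String> loader;
  // Number of times metadata was (re)loaded; exposed only for the example.
  int reloads = 0;

  CompactionAwareCache(boolean autoCheckCompaction,
      ToLongFunction<String> latestCompactionId, Function<String, String> loader) {
    this.autoCheckCompaction = autoCheckCompaction;
    this.latestCompactionId = latestCompactionId;
    this.loader = loader;
  }

  /** Returns cached metadata unless the check is enabled and a newer compaction exists. */
  String get(String partition) {
    Entry cached = cache.get(partition);
    // With the flag off, skip the extra lookup entirely; -1 never invalidates an entry.
    long latest = autoCheckCompaction ? latestCompactionId.applyAsLong(partition) : -1L;
    if (cached != null && cached.compactionId >= latest) return cached.fileMetadata;
    Entry fresh = new Entry(loader.apply(partition), latest);
    cache.put(partition, fresh);
    reloads++;
    return fresh.fileMetadata;
  }
}
```

The sketch makes the trade-off in the commit message visible: with the check enabled, every `get` pays for one extra lookup, which is why the real flag defaults to false.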
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 6 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 01 Dec 2021 12:51:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7685/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 6 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 01 Dec 2021 06:30:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 6 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 01 Dec 2021 06:30:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7684/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Wed, 01 Dec 2021 06:01:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: One test failed at the "Rows Processed" check in Dockerised-test, but it looks similar to https://issues.apache.org/jira/browse/IMPALA-6004 and seems unrelated to the patch. The other failures in "ubuntu-16.04-from-scratch" didn't occur in a previous build, so they might be flaky. A previous run of the same patch passed at: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15371/. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 23:47:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7684/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 23:38:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7683/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 23:30:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Looks like the previous test failures were due to the change related to the flag. Once the gerrit job comes back with a +1 I can merge the patch. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 18:26:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 18:25:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7683/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 16:31:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7682/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 11:13:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7681/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 4 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 06:43:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7680/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 3 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 05:47:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9851/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 05:12:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7682/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 04:58:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata. Since this checking brings additional overhead for queries, we introduce a flag auto_check_compaction and set it as false by default for now. We will find some other efficient ways to do compaction checking in the future. Tests: Added unit tests to CatalogdMetaProviderTest Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 12 files changed, 356 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/5 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 5 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9850/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 4 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 00:27:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7681/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 4 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 00:17:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 4: Code-Review+2 The patch looks good to me. I think we should create separate follow-ups to look into how we can fix this for cases where the compactor cleans up the files just after the query is compiled. Since the cleaner runs after a configurable delay following compaction, queries that run longer than that delay could still be affected by this issue. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 4 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 30 Nov 2021 00:16:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 )
Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..
Patch Set 4:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@348
PS3, Line 348: conduct
> nit, conducted
Ack
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: m
> move to previous line?
Ack
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: ala makes "
:     "additional RPCs to hive metastore for each table
> suggest you to change it to more generic since end users may not understand
Ack
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
Gerrit-Comment-Date: Tue, 30 Nov 2021 00:03:57 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata. Since this checking brings additional overhead for queries, we introduce a flag auto_check_compaction and set it as false by default for now. We will find some other efficient ways to do compaction checking in the future. Tests: Added unit tests to CatalogdMetaProviderTest Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b --- M be/src/service/impala-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 12 files changed, 337 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/4 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 4 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9849/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 3 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 29 Nov 2021 23:54:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 )
Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..
Patch Set 3:
(5 comments)
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@348
PS3, Line 348: conduct
nit, conducted
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: .
move to previous line?
http://gerrit.cloudera.org:8080/#/c/18043/3/be/src/service/impala-server.cc@349
PS3, Line 349: sends one "
:     "getLatestCompactionInfo request per table to HMS.
I suggest changing this to something more generic, since end users may not understand these details. Something like "... because Impala makes additional RPCs to hive metastore for each table in a query during the query compilation"
http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:
http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@530
PS3, Line 530: checkLatestCompaction
Logging the time taken for this method would be helpful.
http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:
http://gerrit.cloudera.org:8080/#/c/18043/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1755
PS3, Line 1755: if (BackendConfig.INSTANCE.isAutoCheckCompaction() && analysisResult.isQueryStmt()
:     && !queryCtx.isSetTransaction_id()) {
:   long txnId = openTransaction(queryCtx);
:   timeline.markEvent("Transaction opened (" + String.valueOf(txnId) + ")");
: }
If we open a transaction here, then we must make sure that we also commit it when the query completes. I suggest you do that in a follow-up, since I don't think the backend calls back into the frontend when the query completes, so it is not as straightforward.
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
Gerrit-Comment-Date: Mon, 29 Nov 2021 23:48:13 +
Gerrit-HasComments: Yes
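The review comment about logging the time taken by `checkLatestCompaction` could be addressed with a small timing wrapper. The following is only a sketch under assumptions: `TimedCall`, `timed`, and `lastElapsedMs` are hypothetical names, and printing to standard output stands in for whatever logger the real code would use.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper that measures and reports how long a call takes. */
final class TimedCall {
  // Elapsed time of the most recent call in milliseconds; kept only for the example.
  static long lastElapsedMs = -1;

  /** Runs the call, records the elapsed wall-clock time, and returns the call's result. */
  static <T> T timed(String label, Supplier<T> call) {
    long start = System.nanoTime();
    try {
      return call.get();
    } finally {
      // The finally block ensures the time is reported even if the call throws.
      lastElapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      System.out.println(label + " took " + lastElapsedMs + " ms");
    }
  }
}
```

A caller would wrap the expensive lookup, for example `TimedCall.timed("checkLatestCompaction", () -> doCheck())`, where `doCheck` is a placeholder for the actual method body.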
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata.

Since this checking brings additional overhead for queries, we introduce a flag auto_check_compaction and set it as false by default for now. We will find some other efficient ways to do compaction checking in the future.

Tests: Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
13 files changed, 335 insertions(+), 14 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/3
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG@10
PS2, Line 10: After compaction happened in Hive(HIVE ACID table), queries made in
            : Impala possibly fail with a FileNotFoundException if files already
            : removed by the Hive cleaner.
> IIRC, Impala only open transactions for DDL/DML operations. Do you know how
Thanks, Vihang and Quanlong, for pointing out the problem. Impala does NOT open transactions for select queries, so this approach doesn't work all the time. Hive has a config that can delay the cleaner for some period of time, but we don't know exactly how long the delay should be. Given that this is time sensitive, I'm thinking we could make this feature optional for now. If a flag, say auto_check_compaction, is set, Impala would open transactions for all queries on ACID tables and do the compaction checking. Any thoughts?

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@898
PS2, Line 898: List stalePartitions = directProvider_.checkLatestCompaction(
             : refImpl.dbName_, refImpl.tableName_, refImpl, refToMeta);
> I think this introduces several HMS RPCs per query (some queries may call t
If we take the performance numbers on DWX as an example, this API call currently takes 10 ~ 40 ms per table, depending on the number of partitions. I will make a fix on the HMS side for an issue with this API that requires us to pass all the partition names. That should bring the API execution time close to 10 ms in all cases. Even though we can improve this API, I understand that it still introduces overhead that might not be negligible. It might be better to introduce this feature with a flag and a table property to skip this check, as Quanlong suggested.

--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
Gerrit-Comment-Date: Mon, 29 Nov 2021 02:56:40 +
Gerrit-HasComments: Yes
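Editorial sketch: the gating discussed in this comment (a global flag plus a per-table opt-out) might look roughly like the Java below. This is not the code from the patch. The flag name auto_check_compaction comes from the thread, but the class name, the method, and the table property key are hypothetical and chosen only for illustration.

```java
import java.util.Map;

class CompactionCheckGate {
  // Hypothetical table property name; the thread does not name one.
  static final String SKIP_CHECK_PROP = "example.skip.compaction.check";

  /**
   * Decides whether a coordinator should run the compaction check for a table.
   * The check runs only when the global flag is on, the table is an ACID table,
   * and the table has not opted out through the (hypothetical) property above.
   */
  static boolean shouldCheckCompaction(
      boolean autoCheckCompaction, boolean isAcidTable, Map<String, String> tblProps) {
    if (!autoCheckCompaction || !isAcidTable) return false;
    return !Boolean.parseBoolean(tblProps.getOrDefault(SKIP_CHECK_PROP, "false"));
  }

  public static void main(String[] args) {
    // Flag on, ACID table, no opt-out: the check runs.
    System.out.println(shouldCheckCompaction(true, true, Map.of()));
    // Table opted out through the property: the check is skipped.
    System.out.println(shouldCheckCompaction(true, true, Map.of(SKIP_CHECK_PROP, "true")));
    // Flag off (the default proposed in the thread): the check is skipped.
    System.out.println(shouldCheckCompaction(false, true, Map.of()));
  }
}
```

Keeping the decision in one small predicate means the extra HMS round trip is paid only by tables that need it, which is the concern raised about per-query overhead.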
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG@10
PS2, Line 10: After compaction happened in Hive(HIVE ACID table), queries made in
            : Impala possibly fail with a FileNotFoundException if files already
            : removed by the Hive cleaner.
> Can you confirm if Impala open's a transaction for select queries for ACID
IIRC, Impala only opens transactions for DDL/DML operations. Do you know how long after compaction Hive removes the files?

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@898
PS2, Line 898: List stalePartitions = directProvider_.checkLatestCompaction(
             : refImpl.dbName_, refImpl.tableName_, refImpl, refToMeta);
> looks like this is going to be called during each query's compilation when
I think this introduces several HMS RPCs per query (some queries may call this multiple times). Maybe we can add a query option or table property to skip the check, so ACID tables that are not frequently updated/compacted can skip it. We can use a notification-based solution (in follow-up JIRAs) for those tables.

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@951
PS2, Line 951: req.table_info_selector.valid_write_ids = table.validWriteIds_;
With this change, catalogd will check the latest compaction ids for each request. I think we need a follow-up JIRA for a perf test to measure the overhead, especially for tables with a large number of partitions.

--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
Gerrit-Comment-Date: Thu, 25 Nov 2021 08:20:11 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18043/2//COMMIT_MSG@10
PS2, Line 10: After compaction happened in Hive(HIVE ACID table), queries made in
            : Impala possibly fail with a FileNotFoundException if files already
            : removed by the Hive cleaner.
Can you confirm whether Impala opens a transaction for select queries on ACID tables? My understanding is that this approach only works if the select query opens a transaction, so that a compaction that runs immediately after our check does not remove the files after we have marked them as not stale.

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@898
PS2, Line 898: List stalePartitions = directProvider_.checkLatestCompaction(
             : refImpl.dbName_, refImpl.tableName_, refImpl, refToMeta);
Looks like this is going to be called during each query's compilation when the query references an ACID table. Do you have a rough estimate of how much overhead per query we are introducing here?

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/18043/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@525
PS2, Line 525: /**
             :  * Fetches the latest compaction id from HMS and compares with partition metadata in
             :  * cache. If a partition is stale due to compaction, removes it from metas.
             :  */
Please also mention what this method returns. It looks like we are returning all the PartitionRefs that are deemed stale due to compaction.

--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Sourabh Goyal
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Yu-Wen Lai
Gerrit-Comment-Date: Wed, 24 Nov 2021 20:00:44 +
Gerrit-HasComments: Yes
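Editorial sketch: the behaviour described in the quoted Javadoc (compare the latest compaction id against cached partition metadata, drop stale entries, and return what was dropped) can be illustrated with the simplified Java below. It is an assumption-laden stand-in, not DirectMetaProvider.checkLatestCompaction itself: the real method takes a database name, table name, table reference, and a map of partition references to metadata, while this sketch uses plain maps keyed by partition name, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StalePartitionSketch {
  /**
   * Compares the compaction id recorded for each cached partition with the latest
   * compaction id reported by the metastore. Partitions whose cached id is older
   * are removed from the cache map.
   *
   * @return the names of the partitions that were found stale and removed, sorted
   */
  static List<String> evictStalePartitions(
      Map<String, Long> cachedCompactionIds, Map<String, Long> latestCompactionIds) {
    List<String> stale = new ArrayList<>();
    for (Map.Entry<String, Long> latest : latestCompactionIds.entrySet()) {
      Long cachedId = cachedCompactionIds.get(latest.getKey());
      // A partition is stale when it is cached and a newer compaction has run on it.
      if (cachedId != null && cachedId < latest.getValue()) {
        stale.add(latest.getKey());
      }
    }
    cachedCompactionIds.keySet().removeAll(stale);
    Collections.sort(stale);
    return stale;
  }

  public static void main(String[] args) {
    Map<String, Long> cached = new HashMap<>();
    cached.put("p=1", 5L);
    cached.put("p=2", 7L);

    Map<String, Long> latest = new HashMap<>();
    latest.put("p=1", 9L); // compacted since it was cached
    latest.put("p=2", 7L); // unchanged

    System.out.println(evictStalePartitions(cached, latest));
    System.out.println(cached.keySet());
  }
}
```

Note that a check like this only narrows the window: as raised in the first comment above, a compaction and cleaner run that happens after the check can still remove files unless the query holds a transaction.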
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 2 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 20 Nov 2021 12:03:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9820/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 2 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 20 Nov 2021 06:14:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7652/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 2 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 20 Nov 2021 05:52:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata.

Tests: Added unit tests to CatalogdMetaProviderTest

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
8 files changed, 308 insertions(+), 15 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/2
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 2
Gerrit-Owner: Yu-Wen Lai
Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7651/ -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 1 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 20 Nov 2021 00:50:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9816/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 1 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 19 Nov 2021 18:46:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18043 ) Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7651/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b Gerrit-Change-Number: 18043 Gerrit-PatchSet: 1 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 19 Nov 2021 18:24:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
Yu-Wen Lai has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18043 )

Change subject: IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction
..

IMPALA-11032: Automatic Refresh of Metadata for Local Catalog after Compaction

After compaction happened in Hive(HIVE ACID table), queries made in Impala possibly fail with a FileNotFoundException if files already removed by the Hive cleaner. In IMPALA-10801, catalogd checks the latest compaction id before serving metadata. However, coordinators don't take advantage of that. Coordinators have their own local cache, so we will have to do the same check for coordinators as well. Besides, we also need to attach writeIdList to requests that need to fetch file metadata.

Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CompactionInfoLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
8 files changed, 303 insertions(+), 15 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18043/1
--
To view, visit http://gerrit.cloudera.org:8080/18043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I173ea848917b6a41139b25b80677111463bfdc4b
Gerrit-Change-Number: 18043
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai