[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9661/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 26 Oct 2021 06:55:22 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7563/ -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 3 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 26 Oct 2021 06:38:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Yu-Wen Lai has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables ..

IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

To enable fine-grained table refreshing, there are three main changes in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We will keep track of write id changes for partitioned tables by AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables on addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config hms_event_incremental_refresh_transactional_table, which can switch on/off the fine-grained table refreshing.

Performance Tests: A simple test was performed by running insert into one partition for partitioned ACID tables (50,000 partitions). Below is the time taken to refresh this table by the event.

Storage   Before    After
===========================
S3        50 secs   50 msecs
local     3 secs    3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 17 files changed, 956 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/10 -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
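Note on the IMPALA-10923 commit message above: the idea of keeping a per-table write id list current from AllocWriteId, CommitTxn, and AbortTxn events can be illustrated with a small toy model. This is only a sketch under stated assumptions; the class `TableWriteIds` and its method names are hypothetical and are not taken from the patch, and the validity rule (at or below the high watermark, neither open nor aborted) is a simplification of how Hive-style write id lists are usually described.

```python
class TableWriteIds:
    """Toy per-table write id tracker (hypothetical; not Impala's class)."""

    def __init__(self):
        self.high_watermark = 0   # highest write id allocated so far
        self.open_ids = set()     # allocated, not yet committed or aborted
        self.aborted_ids = set()  # allocated, then aborted

    def on_alloc(self, write_ids):
        # An alloc event opens new write ids and may raise the watermark.
        for wid in write_ids:
            self.open_ids.add(wid)
            self.high_watermark = max(self.high_watermark, wid)

    def on_commit(self, write_ids):
        # A commit event closes write ids; their data becomes readable.
        for wid in write_ids:
            self.open_ids.discard(wid)

    def on_abort(self, write_ids):
        # An abort event closes write ids; their data must stay invisible.
        for wid in write_ids:
            self.open_ids.discard(wid)
            self.aborted_ids.add(wid)

    def is_valid(self, wid):
        # Readable only if allocated, closed, and not aborted.
        return (0 < wid <= self.high_watermark
                and wid not in self.open_ids
                and wid not in self.aborted_ids)


ids = TableWriteIds()
ids.on_alloc([1, 2, 3])
ids.on_commit([1])
ids.on_abort([2])
```

With the list kept current this way, a partition-level event only needs to reload the affected partition rather than the whole table, which is consistent with the seconds-to-milliseconds change reported in the table above.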
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7565/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 10 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 26 Oct 2021 06:34:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7564/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 5 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 02:19:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 5 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 02:19:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 4: Code-Review+2 Thanks for the patch! -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 02:18:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9660/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 02:12:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Add TODO comment for future enhancement.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17967 ) Change subject: Add TODO comment for future enhancement. .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9659/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17967 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67 Gerrit-Change-Number: 17967 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 26 Oct 2021 01:51:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 4: (6 comments) Thanks for review! http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@10 PS3, Line 10: > nit: we use 72 chars width lines in commit messages Done http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@11 PS3, Line 11: r > nit: are Done http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java: http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@26 PS3, Line 26: simplifi > nit: simplifies Done http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@42 PS3, Line 42: lengths are the > nit: lengths are the same Done http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@43 PS3, Line 43: precisions and scales are the > nit: precisions and scales are the same Done http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@50 PS3, Line 50: > nit: need one more space Done -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 01:50:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Hello Quanlong Huang, Xianqing He, Zoltan Borok-Nagy, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17933 to look at the new patch set (#4). Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations This patch adds a new expr rewrite rule to simplify some cast exprs when the cast target data type is the same as the inner expr data type. We will remove the unnecessary cast expr if any rules are matched. This kind of rewrite will improve query performance when casting a non-partition column, especially when scanning lots of data. Besides, a cast expr in the where clause cannot be pushed down to the Kudu server. If we can remove the unnecessary cast expr, Impala will push this predicate down to the Kudu server, which will save a lot of time and IO/memory. Testing: - Added unit test cases in `ExprRewriteRulesTest` Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 3 files changed, 139 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/17933/4 -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
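Note on the IMPALA-10836 commit message above: the rewrite it describes (drop a cast whose target type equals the type of the inner expr) can be sketched on a toy expression tree. Everything here is an assumption for illustration: `Column`, `Cast`, and `simplify_cast` are hypothetical names, not the Java classes in the patch, and types are compared as plain strings, so a length or precision/scale difference such as `DECIMAL(10,2)` versus `DECIMAL(12,2)` counts as a different type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """A leaf expr with a fixed type (hypothetical toy class)."""
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    """CAST(child AS target_type) (hypothetical toy class)."""
    child: object
    target_type: str

    @property
    def type(self):
        # The type of a cast expr is its target type.
        return self.target_type


def simplify_cast(expr):
    """Remove casts whose target type already matches the child's type."""
    if not isinstance(expr, Cast):
        return expr
    child = simplify_cast(expr.child)  # simplify nested casts first
    if child.type == expr.target_type:
        return child                   # redundant cast: drop it
    return Cast(child, expr.target_type)


col = Column("id", "INT")
redundant = simplify_cast(Cast(col, "INT"))
needed = simplify_cast(Cast(col, "BIGINT"))
nested = simplify_cast(Cast(Cast(col, "INT"), "INT"))
```

In this sketch `redundant` and `nested` both collapse to the bare column, while `needed` keeps its cast because the types differ. A predicate on the bare column is the kind of expression the commit message says can then be pushed down to Kudu.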
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 22: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/ -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 22 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 26 Oct 2021 01:40:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.
weic...@apache.org has posted comments on this change. ( http://gerrit.cloudera.org:8080/17963 ) Change subject: IMPALA-10212. Support ofs scheme. .. Patch Set 2: good idea. patch updated. -- To view, visit http://gerrit.cloudera.org:8080/17963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba Gerrit-Change-Number: 17963 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 26 Oct 2021 01:30:00 + Gerrit-HasComments: No
[Impala-ASF-CR] Add TODO comment for future enhancement.
weic...@apache.org has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17967 ) Change subject: Add TODO comment for future enhancement. .. Add TODO comment for future enhancement. Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67 --- M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/17967/1 -- To view, visit http://gerrit.cloudera.org:8080/17967 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67 Gerrit-Change-Number: 17967 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7563/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 3 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 26 Oct 2021 00:27:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7561/ -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 9 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Tue, 26 Oct 2021 00:27:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 3: Code-Review+1 (1 comment) Thank Sheng for the changes! I'll bump to +2 after the comments are resolved. http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java: http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@50 PS3, Line 50: nit: need one more space -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 26 Oct 2021 00:10:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 6: (3 comments) Thanks for adding the tests and query option. This is looking good, I only have a couple small comments. http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc@769 PS6, Line 769: DebugActionNoFail( : exec_request_->query_options, "CRS_DELAY_BEFORE_LOAD_DATA"); Nit: My only thought here is that I do like it when these statements are right next to the statement that we are simulating the delay about (in this case frontend_->LoadData()). http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py File tests/metadata/test_load.py: http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@111 PS6, Line 111: class TestAsyncLoadData(TestLoadData): One thing about subclassing TestLoadData is that TestAsyncLoadData will get its own copy of test_load() from TestLoadData. When those copies execute in parallel, things can go wrong. One way out is to create a TestLoadDataBase that contains the pieces you need to share, and then subclass for both TestLoadData and TestAsyncLoadData. http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@122 PS6, Line 122: @pytest.mark.execute_serially # To avoid file copy failure: dst file does not exist Nice to have: When possible, we want to structure tests to allow them to execute in parallel. I ran test_async_load locally in its 6 variations, and it took about 6 minutes. I think a decent chunk of that was setup/teardown and not the test itself. In this case, it would involve replacing STAGING_PATH with something under the unique_database directory (and populating it with some files, etc). 
Unfortunately, unique_database doesn't really work with setup_method/teardown_method, so it would need some rework of populating the directory. -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 21:51:31 + Gerrit-HasComments: Yes
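Note on the review comment above about subclassing TestLoadData: the suggested restructuring can be sketched as follows. This is a minimal illustration, not the actual test code; the method bodies and the `staging_path` helper are hypothetical, and `collected_tests` only mimics name-based test discovery by listing attributes that start with `test_`.

```python
class TestLoadDataBase:
    """Shared helpers only; it defines no test_* methods of its own."""

    def staging_path(self, name):
        # Hypothetical helper shared by both test classes.
        return "/tmp/staging/" + name


class TestLoadData(TestLoadDataBase):
    def test_load(self):
        return self.staging_path("sync")


class TestAsyncLoadData(TestLoadDataBase):
    def test_async_load(self):
        return self.staging_path("async")


class InheritsDirectly(TestLoadData):
    """The shape the reviewer warns about: test_load is inherited too."""

    def test_async_load(self):
        return self.staging_path("async")


def collected_tests(cls):
    # Rough stand-in for name-based discovery of test methods on a class.
    return sorted(n for n in dir(cls) if n.startswith("test_"))
```

Here `collected_tests(InheritsDirectly)` lists both `test_async_load` and `test_load`, which is the duplicate copy that can collide when run in parallel, while `collected_tests(TestAsyncLoadData)` lists only `test_async_load`.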
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9658/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 20:10:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17960 ) Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions .. Patch Set 3: (2 comments) Looks great! http://gerrit.cloudera.org:8080/#/c/17960/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17960/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@678 PS3, Line 678: if (!IsDataInDataFile(idx)) continue; nit. This call probably can wait after minmax_filter is obtained at line 680, at which point we can directly call minmax_filter->IsDataInDataFile(GetScanNodeId()). http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@238 PS3, Line 238: isDataInDataFile > nit: this was a bit ambiguous for me and had to read the comment of the isD +1. Sounds like a good idea. -- To view, visit http://gerrit.cloudera.org:8080/17960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac Gerrit-Change-Number: 17960 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 19:58:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9657/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 19:59:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 6: Fix format error in test_load.py. -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 19:50:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Qifan Chen has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. IMPALA-10967 Load data should handle AWS NLB-type timeout This patch addresses an Impala client hang caused by the AWS network load balancer timeout, which is fixed at 350s. When a long data loading operation is executing and the timeout happens, AWS silently drops the connection and the Impala client hangs. The fix maintains the current TCLIService protocol between the client and Impala server and utilizes a separate thread to run the data loading and metadata refresh operation. Since this thread is waited for in a wait thread which runs asynchronously, the execution of the entire operation will not cause a wait on the Impala client. The Impala client can check the status of the operation via repeated GetOperationStatus() calls. External behavior change: 1. A new query option 'enable_async_load_data_execution', defaulting to true, is added. It can be set to false to turn off the patch. Testing: 1. Added a new test in test_load.py to verify that the asynchronous execution in BE keeps the session live; 2. Ran core tests successfully. 
Change-Id: I8c2437e9894510204303ec07710cad60102c8821 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M tests/metadata/test_load.py 7 files changed, 170 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/6 -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 6 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
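Note on the IMPALA-10967 commit message above: the pattern it describes (run the long operation on a separate thread, return to the client immediately, and let the client poll for status) can be sketched generically. This is not Impala's implementation; `AsyncOperation`, `get_operation_status`, `wait_for_completion`, and the state strings are hypothetical stand-ins that only mirror the idea of repeated GetOperationStatus() calls.

```python
import threading
import time


class AsyncOperation:
    """Runs `work` on a background thread and exposes a pollable state."""

    def __init__(self, work):
        self._state = "RUNNING"
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, args=(work,))
        self._thread.start()  # the constructor returns without waiting

    def _run(self, work):
        try:
            work()
            final = "FINISHED"
        except Exception:
            final = "ERROR"
        with self._lock:
            self._state = final

    def get_operation_status(self):
        # Short request: it never blocks on the long-running work.
        with self._lock:
            return self._state


def wait_for_completion(op, poll_interval_s=0.01):
    """Client side: poll until the operation leaves the RUNNING state."""
    polls = 0
    while op.get_operation_status() == "RUNNING":
        polls += 1
        time.sleep(poll_interval_s)
    return op.get_operation_status(), polls


def failing_load():
    raise RuntimeError("simulated load failure")


ok_state, _ = wait_for_completion(AsyncOperation(lambda: time.sleep(0.05)))
err_state, _ = wait_for_completion(AsyncOperation(failing_load))
```

Because each poll is a short request of its own, no single request has to stay open for the full duration of the load, which is the property the commit message relies on to avoid the 350s idle timeout.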
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 5: Added the logic to disable the feature, and a new state/timing test in test_load.py. -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Qifan Chen has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. IMPALA-10967 Load data should handle AWS NLB-type timeout This patch addresses an Impala client hang caused by the AWS network load balancer timeout, which is fixed at 350s. When a long data loading operation is executing and the timeout happens, AWS silently drops the connection and the Impala client hangs. The fix maintains the current TCLIService protocol between the client and Impala server and utilizes a separate thread to run the data loading and metadata refresh operation. Since this thread is waited for in a wait thread which runs asynchronously, the execution of the entire operation will not cause a wait on the Impala client. The Impala client can check the status of the operation via repeated GetOperationStatus() calls. External behavior change: 1. A new query option 'enable_async_load_data_execution', defaulting to true, is added. It can be set to false to turn off the patch. Testing: 1. Added a new test in test_load.py to verify that the asynchronous execution in BE keeps the session live; 2. Ran core tests successfully. 
Change-Id: I8c2437e9894510204303ec07710cad60102c8821 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M tests/metadata/test_load.py 7 files changed, 170 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/5 -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
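The pattern described in the commit message above (run the long operation on a separate thread and let the client poll for status instead of blocking on one long call) can be sketched roughly as follows. This is only an illustration under stated assumptions: `AsyncOperation`, `get_operation_status` and `wait_for_completion` are hypothetical names invented for this sketch, and none of it is Impala's actual backend or client code.

```python
import threading
import time


class AsyncOperation:
    """Hypothetical stand-in for a server-side operation handle."""

    def __init__(self, work):
        self._state = "RUNNING"
        self._lock = threading.Lock()
        # The long-running work happens off the request thread.
        self._thread = threading.Thread(target=self._run, args=(work,), daemon=True)
        self._thread.start()

    def _run(self, work):
        try:
            work()
            final_state = "FINISHED"
        except Exception:
            final_state = "ERROR"
        with self._lock:
            self._state = final_state

    def get_operation_status(self):
        """Cheap status check, analogous in spirit to a GetOperationStatus() poll."""
        with self._lock:
            return self._state


def wait_for_completion(op, poll_interval_s=0.01, timeout_s=5.0):
    """Poll until the operation leaves RUNNING; returns (state, number_of_polls)."""
    deadline = time.monotonic() + timeout_s
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        state = op.get_operation_status()
        if state != "RUNNING":
            return state, polls
        time.sleep(poll_interval_s)
    return "TIMEOUT", polls


if __name__ == "__main__":
    op = AsyncOperation(lambda: time.sleep(0.05))  # simulated slow load
    print(wait_for_completion(op))
```

Each poll is a short request, so the connection never sits idle for the full duration of the load, which is the property that matters when a load balancer drops idle connections.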
[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17955 ) Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py File tests/metadata/test_load.py: http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@20 PS5, Line 20: import sys flake8: F401 'sys' imported but unused http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@110 PS5, Line 110: @SkipIfLocal.hdfs_client flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/17955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821 Gerrit-Change-Number: 17955 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 22: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 22 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 19:37:09 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 2 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 19:37:07 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9656/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 3 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 18:35:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9655/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 22 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 18:33:42 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877 PS3, Line 5877: updatedThriftTable = catalog_.reloadTable(tbl, req, resultType, cmdString, -1); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402 PS3, Line 2402: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS3, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 3 
Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7561/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 9 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 21: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 21 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:27 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17964 to look at the new patch set (#3). Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,404 insertions(+), 290 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/3 -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 3 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 22: (4 comments) http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63 PS22, Line 63: // import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412 PS22, Line 2412: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS22, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: 
http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28 PS22, Line 28: //import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent; line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 22 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 18:11:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17859 to look at the new patch set (#22). Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,398 insertions(+), 290 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/22 -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 22 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9654/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 9 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 17:49:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17963 ) Change subject: IMPALA-10212. Support ofs scheme. .. Patch Set 2: Code-Review+1 (1 comment) This looks good to me. Thanks for putting this together. I had one minor nit. http://gerrit.cloudera.org:8080/#/c/17963/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java File fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java: http://gerrit.cloudera.org:8080/#/c/17963/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java@121 PS2, Line 121: // testIsSupportStorageIds(mockLocation(FileSystemUtil.SCHEME_O3FS), true); Nit: There are a few of these commented out O3FS tests that we would like to enable later. Can you add an OFS equivalent for each one (still commented out)? -- To view, visit http://gerrit.cloudera.org:8080/17963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba Gerrit-Change-Number: 17963 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Mon, 25 Oct 2021 17:37:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Yu-Wen Lai has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

To enable fine-grained table refreshing, there are three main changes in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We will keep track of write id changes for partitioned tables by AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition-level refreshing for transactional tables on addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config hms_event_incremental_refresh_transactional_table, which can switch the fine-grained table refreshing on or off.

Performance Tests: A simple test was performed by running an insert into one partition of a partitioned ACID table (50,000 partitions). Below is the time taken to refresh this table by the event.

Storage   Before    After
=========================
S3        50 secs   50 msecs
local     3 secs    3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 17 files changed, 938 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/9 -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 9 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
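The first change in the commit message above (tracking write id changes from allocation, commit, and abort events) might be pictured with a sketch like the one below. It is a deliberately simplified model and not the patch's implementation: the class `TableWriteIds`, its method names, and the validity rule (at or below the high watermark, neither open nor aborted) are assumptions made for illustration only.

```python
class TableWriteIds:
    """Hypothetical, simplified per-table write id tracker."""

    def __init__(self):
        self.open = {}            # write_id -> txn_id for in-flight transactions
        self.aborted = set()      # write ids whose transaction was aborted
        self.high_watermark = 0   # highest write id allocated so far

    def on_alloc_write_id(self, txn_id, write_id):
        """A transaction was given a write id for this table."""
        self.open[write_id] = txn_id
        self.high_watermark = max(self.high_watermark, write_id)

    def _close(self, txn_id):
        """Remove and return all open write ids that belong to txn_id."""
        ids = [w for w, t in self.open.items() if t == txn_id]
        for w in ids:
            del self.open[w]
        return ids

    def on_commit_txn(self, txn_id):
        """Committed write ids simply stop being open."""
        self._close(txn_id)

    def on_abort_txn(self, txn_id):
        """Aborted write ids are remembered so their data is never read."""
        self.aborted.update(self._close(txn_id))

    def is_valid(self, write_id):
        """True if data written under write_id should be visible to readers."""
        return (
            write_id <= self.high_watermark
            and write_id not in self.open
            and write_id not in self.aborted
        )


if __name__ == "__main__":
    ids = TableWriteIds()
    ids.on_alloc_write_id(txn_id=10, write_id=1)
    ids.on_commit_txn(10)
    print(ids.is_valid(1))
```

Keeping this state up to date from events is what would let a partition-level event be applied without reloading the whole table's transactional state.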
[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/17960 ) Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions .. Patch Set 3: (2 comments) Hi Zoltan, Added a readability nit and a test comment, apart from these LGTM. http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@238 PS3, Line 238: isDataInDataFile nit: this was a bit ambiguous for me and had to read the comment of the isDataInDataFile method to understand it. What do you think about using something like: isPartitionColumnValuesInDataFile, isPartitionValuesInDataFile or isPartColValInDataFile http://gerrit.cloudera.org:8080/#/c/17960/3/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/17960/3/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429 PS3, Line 429: select * from functional_parquet.iceberg_partitioned i1, Missing > SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS; -- To view, visit http://gerrit.cloudera.org:8080/17960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac Gerrit-Change-Number: 17960 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 16:02:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17960 ) Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9653/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac Gerrit-Change-Number: 17960 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 15:59:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17960 ) Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions .. Patch Set 1: (1 comment) Thanks for the comments. http://gerrit.cloudera.org:8080/#/c/17960/1/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17960/1/be/src/exec/parquet/hdfs-parquet-scanner.cc@1323 PS1, Line 1323: if (scan_node_->hdfs_table()->IsIcebergTable()) return false; > I got it. Thanks for the explanation. Added is_data_in_file to TRuntimeFilterTargetDesc. -- To view, visit http://gerrit.cloudera.org:8080/17960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac Gerrit-Change-Number: 17960 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 15:39:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions
Hello Tamas Mate, Qifan Chen, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17960 to look at the new patch set (#3). Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions .. IMPALA-10777: Enable min/max filtering for Iceberg partitions

This patch enables min/max filters for Iceberg columns that participate in table partitioning. The min/max filters are evaluated at the Parquet row group level. This means that it is still slower than dynamic partition pruning (which doesn't even need to open the files), but much faster than no pruning at all.

Performance
I used the following query to measure perf on a scale 10 TPC-DS dataset:

select i_item_id, sum(ss_ext_sales_price) total_sales
from store_sales, date_dim, customer_address, item
where i_item_id in (select i_item_id from item where i_color in ('orchid','chiffon','lace'))
  and ss_item_sk = i_item_sk
  and ss_sold_date_sk = d_date_sk
  and d_year = 2000
  and d_moy = 1
  and ss_addr_sk = ca_address_sk
  and ca_gmt_offset = -8

The above query took the following times to execute:
Regular Parquet table: 1.16s
Iceberg table without min/max filters: 4.39s
Iceberg table with min/max filters: 1.77s

Testing:
* added e2e test
* planner test could not be added because Iceberg tables behave differently during planner tests (due to some hacks that need refactoring)

Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/runtime/runtime-filter.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 8 files changed, 80 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 
refs/changes/60/17960/3 -- To view, visit http://gerrit.cloudera.org:8080/17960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac Gerrit-Change-Number: 17960 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
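The idea of evaluating a min/max filter at the Parquet row group level, as the commit message above describes, can be illustrated with a small sketch. The names `RowGroupStats`, `overlaps`, and `row_groups_to_read` are hypothetical and exist only for this example; the sketch assumes integer column values and closed ranges, and it is not the scanner code from the patch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group statistics for one column."""
    min_value: int
    max_value: int


def overlaps(stats: RowGroupStats, filter_min: int, filter_max: int) -> bool:
    """A row group can contain matches only if its range intersects the filter range."""
    return stats.max_value >= filter_min and stats.min_value <= filter_max


def row_groups_to_read(row_groups: List[RowGroupStats],
                       filter_min: int, filter_max: int) -> List[int]:
    """Return indexes of row groups that cannot be skipped by the min/max filter."""
    return [i for i, rg in enumerate(row_groups)
            if overlaps(rg, filter_min, filter_max)]


if __name__ == "__main__":
    groups = [RowGroupStats(1, 10), RowGroupStats(11, 20), RowGroupStats(21, 30)]
    # Suppose the join's build side produced values in [12, 22].
    print(row_groups_to_read(groups, 12, 22))
```

The file still has to be opened to read the statistics, which matches the commit message's point that this is slower than partition pruning but much cheaper than scanning every row group.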
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 3: Code-Review+1 (5 comments) Just found some nits/grammatical errors, otherwise LGTM! http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@10 PS3, Line 10: e nit: we use 72 chars width lines in commit messages http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@11 PS3, Line 11: is nit: are http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java: http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@26 PS3, Line 26: simplify nit: simplifies http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@42 PS3, Line 42: length are same. nit: lengths are the same http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@43 PS3, Line 43: precision and scale are same. nit: precisions and scales are the same -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 25 Oct 2021 14:01:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9652/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 2 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 13:41:12 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17964 to look at the new patch set (#2). Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,403 insertions(+), 289 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/2 -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 2 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7560/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 2 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 13:20:33 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877 PS2, Line 5877: updatedThriftTable = catalog_.reloadTable(tbl, req, resultType, cmdString, -1); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402 PS2, Line 2402: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS2, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 2 
Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 13:20:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9651/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 25 Oct 2021 13:15:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. Patch Set 3: (9 comments) Thanks for the careful review, Quanlong! http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@7 PS2, Line 7: SimplifyCastExprRule > Can we rename it to 'SimplifyCastExprRule'? We already have some similar na Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@9 PS2, Line 9: add > nit: adds Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@11 PS2, Line 11: rules i > nit: is the same Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@12 PS2, Line 12: ing a n > nit: is the same Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@14 PS2, Line 14: move unnecessar > nit: if any rules is matched Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@15 PS2, Line 15: and > nit: casting Done http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@16 PS2, Line 16: time and IO/memmory. > nit: , especially when scanning lots of data. Done http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java File fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java: http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java@56 PS2, Line 56: > Can we remove this check? So the rule can apply to more scenarios, e.g. CAS Done http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java@56 PS2, Line 56: : : : : : : : : : > I think we can merge these two branches into one. 
We just check the first c Done -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 25 Oct 2021 12:54:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations
wangsheng has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17933 ) Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations .. IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations This patch adds a new expr rewrite rule that simplifies a cast expr when the cast target data type is the same as the inner expr data type. The unnecessary cast expr is removed if any rule matches. This kind of rewrite improves query performance when casting a non-partition column, especially when scanning lots of data. In addition, a cast expr in a WHERE clause cannot be pushed down to the Kudu server; once the unnecessary cast expr is removed, Impala will push this predicate down to the Kudu server, which saves a lot of time and IO/memory. Testing: - Added unit test cases in `ExprRewriteRulesTest` Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 3 files changed, 139 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/17933/3 -- To view, visit http://gerrit.cloudera.org:8080/17933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954 Gerrit-Change-Number: 17933 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xianqing He Gerrit-Reviewer: Zoltan Borok-Nagy
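The idea in the commit message above (drop a cast whose target type already equals the type of the inner expression) can be illustrated with a small, self-contained sketch. This is not the Impala implementation, which lives in the Java frontend; the `Column` and `Cast` classes and the `simplify_cast` function are hypothetical names made up for this illustration, and types are modelled as plain strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    """Hypothetical leaf expression: a column reference with a declared type."""
    name: str
    data_type: str


@dataclass(frozen=True)
class Cast:
    """Hypothetical cast expression wrapping a child expression."""
    child: "Expr"
    target_type: str

    @property
    def data_type(self) -> str:
        # The result type of a cast is its target type.
        return self.target_type


Expr = Union[Column, Cast]


def simplify_cast(expr: Expr) -> Expr:
    """Remove casts whose target type equals the child's type (assumed rule)."""
    if isinstance(expr, Cast):
        # Simplify bottom-up so nested redundant casts are removed too.
        child = simplify_cast(expr.child)
        if child.data_type == expr.target_type:
            return child
        return Cast(child, expr.target_type)
    return expr


# CAST(id AS INT) on an INT column is redundant and collapses to the column.
redundant = Cast(Column("id", "INT"), "INT")
# CAST(CAST(id AS INT) AS BIGINT) keeps only the outer, meaningful cast.
nested = Cast(Cast(Column("id", "INT"), "INT"), "BIGINT")
print(simplify_cast(redundant))
print(simplify_cast(nested))
```

Because the type is compared as a whole string, parameterised types such as `DECIMAL(10,2)` only match when precision and scale are identical, which mirrors the "precisions and scales are the same" condition mentioned in the review comments. Once the cast is gone, a predicate such as `id = 5` is a plain column comparison, which is the form the commit message says can be pushed down to Kudu.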
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9650/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 12:24:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9649/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 21 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 12:22:43 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Sourabh Goyal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17964 Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M 
fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,362 insertions(+), 282 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/1 -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17859 to look at the new patch set (#21). Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/custom_cluster/test_metastore_service.py 26 files changed, 3,397 insertions(+), 289 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/21 -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 21 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 21: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7559/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 21 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 12:03:44 + Gerrit-HasComments: No
[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17964 ) Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402 PS1, Line 2402: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS1, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17964 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6 Gerrit-Change-Number: 17964 Gerrit-PatchSet: 1 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 12:03:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 ) Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints .. Patch Set 21: (4 comments) http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63 PS21, Line 63: // import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent; line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412 PS21, Line 2412: batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics()); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java: http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85 PS21, Line 85: private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId; line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: 
http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28 PS21, Line 28: //import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent; line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17859 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9 Gerrit-Change-Number: 17859 Gerrit-PatchSet: 21 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 12:00:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9648/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 11:20:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 12: Fix Jenkins indent comments. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:58:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns that are required for filtering need to be materialized before filtering. For the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows required to avoid materialization. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows, unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x improvement with page index. Queries for page index borrowed from blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 935 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/12 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
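The late materialization technique described in the commit message above can be sketched in a few lines. This is only an illustration under stated assumptions, not the C++ scanner code from the patch: `plan_ranges` and `late_materialize` are hypothetical names, columns are plain Python lists, and the threshold semantics (a run of filtered-out rows is skipped only if it is at least `threshold` rows long, and a negative value disables skipping) are an interpretation of the description of 'parquet_materialization_threshold'.

```python
def plan_ranges(selected, threshold):
    """Return inclusive (start, end) row ranges that should be materialized.

    selected  -- list of booleans, True where the row passed the filter
    threshold -- assumed meaning: minimum length of a run of filtered-out
                 rows that is worth skipping; negative disables skipping
    """
    n = len(selected)
    if n == 0:
        return []
    if threshold < 0:
        # Late materialization disabled: materialize every row.
        return [(0, n - 1)]
    ranges = []
    start = end = None
    for i, keep in enumerate(selected):
        if not keep:
            continue
        if start is None:
            start = end = i
            continue
        gap = i - end - 1  # filtered-out rows since the last surviving row
        if gap == 0 or gap < threshold:
            # Gap too short to be worth skipping: extend the current range.
            end = i
        else:
            ranges.append((start, end))
            start = end = i
    if start is not None:
        ranges.append((start, end))
    return ranges


def late_materialize(filter_col, predicate, payload_col, threshold):
    """Evaluate the filter column first, then read the payload column lazily.

    Returns the surviving rows and the number of payload values touched.
    """
    selected = [predicate(v) for v in filter_col]
    rows, decoded = [], 0
    for lo, hi in plan_ranges(selected, threshold):
        for i in range(lo, hi + 1):
            decoded += 1  # stands in for the cost of decoding one value
            if selected[i]:
                rows.append((filter_col[i], payload_col[i]))
    return rows, decoded


filter_col = [5, 1, 6, 1, 1, 1, 1, 9]
payload_col = list("abcdefgh")
rows, decoded = late_materialize(filter_col, lambda v: v > 4, payload_col, 3)
print(rows, decoded)
```

In this toy run only four of the eight payload values are touched with a threshold of 3, while a negative threshold touches all eight and returns the same rows. The short gap between rows 0 and 2 is read through rather than skipped, which is the trade-off the threshold is meant to control.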
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9647/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:51:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG@19 PS10, Line 19: than > nit: than Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:46:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 11: > (10 comments) > > Looks great! > > On testing, I wonder if we can add a counter on # of rows (or > amount of data) not surviving the materialization. This will be > useful to safe guard the feature and demonstrate its usefulness. Thanks for the review, Qifan. The counter suggestion is a good one, and something I considered earlier. The issue is that we don't skip decoding rows; we skip decoding values, and one row may comprise hundreds of values, some of which are read while others are skipped. In the current scheme we cannot accurately track the number of values skipped without incurring a significant performance penalty. For instance, we sometimes skip pages without decompressing them: if a skipped page has a page index with candidate rows, we would need to decompress the page to get an accurate count of the values skipped due to late materialization. In the scenario where we skip pages directly, even if the page is not compressed, figuring out the number of values for the corresponding candidate range can be time consuming. Hence timed counters, which are already present, are more appropriate here. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:45:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 11: (10 comments) http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc@2223 PS10, Line 2223: c. > Could you please explain where do we filter out the rows in the merged micr We don't need to re-filter after step 3. I will explain it in a comment. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc@67 PS10, Line 67: scratch_batch->num_tuples = BATCH_ > I wonder if we can add two more tests for the following situations. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h File be/src/exec/scratch-tuple-batch.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@29 PS10, Line 29: ScratchMicroBatch > May need a cstr to properly init these fields. I am using aggregate initializers instead of a constructor that accepts arguments, as we need the default constructor too. We also don't want many function calls on the hot path (GetMicroBatches). http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@171 PS10, Line 171: /// bits set are used to create micro batches. Micro batches that differ by less than > nit (or micro batches). Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@176 PS10, Line 176: present > nit. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@178 PS10, Line 178: batch > nit. batch_idx may be a better name in this method. 
Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@203 PS10, Line 203: << "should be true"; : /// Add the last micro batch which was b > nit. An alternative is the following, which is more robust. We can avoid that extra branch, as well as the extra condition on the caller side to handle 0 being returned: it would be dead code anyway, since this is already stated as a precondition of the method. The DCHECK enforces that precondition, so that this dead code does not get activated in the future. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h@50 PS10, Line 50: PARQUET_LATE_MATERIALIZATION_THRE > nit: PARQUET_LATE_MATERIALIZATION_THRESHOLD? Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift@701 PS10, Line 701: ENABLE_ASYNC_DDL_EXECUTION = 136 > nit. -1 to turn off the feature. Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift@554 PS10, Line 554: 137: optional bool enable_async_ddl_execution = true; > nit. -1 to turn off the feature. Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering. For the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration, 'parquet_materialization_threshold', is introduced: it is the minimum number of consecutive filtered-out rows required to avoid materialization. If set to less than 0, late materialization is disabled. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows, unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x improvement with page index. Queries for page index borrowed from blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 933 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/11 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
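The micro-batch idea described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in be/src/exec/scratch-tuple-batch.h: the `MicroBatch` struct, the `std::vector<bool>` selection input, and the exact merge rule (two runs of surviving rows are merged when fewer than `skip_length` filtered-out rows separate them) are all hypothetical stand-ins chosen to match the description of the threshold.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a micro batch: an inclusive [start, end] range
// of row indices in a scratch batch, plus the number of rows it spans.
struct MicroBatch {
  int start;
  int end;
  int length;
};

// Groups surviving rows (selected[i] == true) into micro batches.
// Assumption: runs of surviving rows separated by fewer than 'skip_length'
// filtered-out rows are merged, because skipping such a short gap is not
// worth the cost. Rows inside a merged gap are materialized anyway.
std::vector<MicroBatch> GetMicroBatches(
    const std::vector<bool>& selected, int skip_length) {
  std::vector<MicroBatch> batches;
  int start = -1;  // First surviving row of the batch being built.
  int last = -1;   // Last surviving row seen so far in that batch.
  for (int i = 0; i < static_cast<int>(selected.size()); ++i) {
    if (!selected[i]) continue;
    if (start == -1) {
      start = last = i;
    } else if (i - last - 1 < skip_length) {
      last = i;  // Gap is too small to skip: extend the current batch.
    } else {
      batches.push_back({start, last, last - start + 1});
      start = last = i;
    }
  }
  // Add the last micro batch, if any row survived at all.
  if (start != -1) batches.push_back({start, last, last - start + 1});
  return batches;
}
```

With a selection where rows 0-2, 4 and 15 survive and `skip_length` is 5, this sketch yields two micro batches: rows 0-4 (the one-row gap at index 3 is absorbed) and row 15 alone, so the non-filter columns for rows 5-14 would never be materialized.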
[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/parquet/hdfs-parquet-scanner.cc@2291 PS11, Line 2291: int num_micro_batches = scratch_batch_->GetMicroBatches(late_materialization_threshold_, line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc@84 PS11, Line 84: EXPECT_EQ(scratch_batch->GetMicroBatches(10 /*Skip Length*/, micro_batches), 1024/n); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc@116 PS11, Line 116: EXPECT_EQ(scratch_batch->GetMicroBatches(10 /*Skip Length*/, micro_batches), 1024/(n * 2)); line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17963 ) Change subject: IMPALA-10212. Support ofs scheme. .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9646/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba Gerrit-Change-Number: 17963 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Oct 2021 09:12:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17858 ) Change subject: IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables .. Patch Set 8: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7558/ -- To view, visit http://gerrit.cloudera.org:8080/17858 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9 Gerrit-Change-Number: 17858 Gerrit-PatchSet: 8 Gerrit-Owner: Yu-Wen Lai Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yu-Wen Lai Gerrit-Comment-Date: Mon, 25 Oct 2021 09:06:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17963 to look at the new patch set (#2). Change subject: IMPALA-10212. Support ofs scheme. .. IMPALA-10212. Support ofs scheme. OFS is the new file system implementation for Ozone. The biggest difference compared to o3fs is that ofs supports operations across all volumes and buckets and provides a full view of all the volumes and buckets. It uses the same transport as o3fs and therefore shares the thread pool with o3fs. How it was tested: The patch was tested manually on a CDPD cluster: loaded TPC-DS data, ran TPC-DS, and ran the 'load data inpath' command. Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba --- M be/src/util/hdfs-util.cc M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java 3 files changed, 16 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/17963/2 -- To view, visit http://gerrit.cloudera.org:8080/17963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba Gerrit-Change-Number: 17963 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins
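As a rough illustration of what recognizing the new scheme alongside o3fs might look like on the backend side, here is a minimal sketch. The helper names `HasScheme` and `IsOzonePath` and the example URIs are assumptions made for illustration; this is not the code changed in be/src/util/hdfs-util.cc.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if 'path' begins with "<scheme>://".
bool HasScheme(const std::string& path, const std::string& scheme) {
  const std::string prefix = scheme + "://";
  return path.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical check that treats both Ozone schemes alike, in line with
// the commit message's note that ofs shares its transport with o3fs.
bool IsOzonePath(const std::string& path) {
  return HasScheme(path, "o3fs") || HasScheme(path, "ofs");
}
```

Under these assumptions, `IsOzonePath("ofs://om-host/vol1/bucket1/dir/file")` and `IsOzonePath("o3fs://bucket1.vol1.om-host/dir/file")` both return true, while an `hdfs://` path returns false. The host, volume, and bucket names in those URIs are made up.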
[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.
weic...@apache.org has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17963 ) Change subject: IMPALA-10212. Support ofs scheme. .. IMPALA-10212. Support ofs scheme. OFS is the new file system implementation for Ozone. The biggest difference compared to o3fs is that ofs supports operations across all volumes and buckets and provides a full view of all the volume/buckets. It uses the same transport as o3fs and therefore it shares the thread pool with o3fs. Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba --- M be/src/util/hdfs-util.cc M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java 3 files changed, 16 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/17963/1 -- To view, visit http://gerrit.cloudera.org:8080/17963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba Gerrit-Change-Number: 17963 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward