[Impala-ASF-CR] [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16292 ) Change subject: [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6794/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15eff1ec6cab32c1216ed7400e4c4b57bb81e4cd Gerrit-Change-Number: 16292 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Aug 2020 06:28:53 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisters
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisters .. Add logging when query unregisters This adds a log line which is printed when a query is successfully unregistered by the async unregister thread pool. Added only for additional observability. Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Reviewed-on: http://gerrit.cloudera.org:8080/16285 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/impala-server.cc 1 file changed, 6 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Add logging when query unregisters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisters .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 06:13:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16288 ) Change subject: IMPALA-10047: Revert core piece of IMPALA-6984 .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6227/ -- To view, visit http://gerrit.cloudera.org:8080/16288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 Gerrit-Change-Number: 16288 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 06:03:19 + Gerrit-HasComments: No
[Impala-ASF-CR] [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16292 ) Change subject: [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6230/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15eff1ec6cab32c1216ed7400e4c4b57bb81e4cd Gerrit-Change-Number: 16292 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 05 Aug 2020 06:02:59 + Gerrit-HasComments: No
[Impala-ASF-CR] [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16292 Change subject: [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially .. [PROTOTYPE] IMPALA-9923: Test out super ugly hack to load ORC serially TODO: fill in with description Change-Id: I15eff1ec6cab32c1216ed7400e4c4b57bb81e4cd --- M bin/load-data.py 1 file changed, 10 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16292/1 -- To view, visit http://gerrit.cloudera.org:8080/16292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I15eff1ec6cab32c1216ed7400e4c4b57bb81e4cd Gerrit-Change-Number: 16292 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16278 ) Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 05 Aug 2020 05:58:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16286 ) Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6229/ -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 05:29:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16288 ) Change subject: IMPALA-10047: Revert core piece of IMPALA-6984 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 Gerrit-Change-Number: 16288 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 03:40:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16252 ) Change subject: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 Gerrit-Change-Number: 16252 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 03:32:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16288 ) Change subject: IMPALA-10047: Revert core piece of IMPALA-6984 .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6793/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 Gerrit-Change-Number: 16288 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:14:05 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisters .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6228/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:23 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisters .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16286 ) Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6229/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16286 ) Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:40 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisters
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisters .. Patch Set 2: Code-Review+2 Carrying forward Tim's +2 -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 01:02:17 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisters
Hello Sahil Takiar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16285 to look at the new patch set (#2). Change subject: Add logging when query unregisters .. Add logging when query unregisters This adds a log line which is printed when a query is successfully unregistered by the async unregister thread pool. Added only for additional observability. Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 --- M be/src/service/impala-server.cc 1 file changed, 6 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16285/2 -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16286 ) Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6792/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisteres
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisteres .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6791/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16288 ) Change subject: IMPALA-10047: Revert core piece of IMPALA-6984 .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6227/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 Gerrit-Change-Number: 16288 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16288 Change subject: IMPALA-10047: Revert core piece of IMPALA-6984 .. IMPALA-10047: Revert core piece of IMPALA-6984 Performance testing on TPC-DS found a performance regression on short queries due to delayed exec status reports. Further testing traced this back to IMPALA-6984's behavior of cancelling backends on EOS. The coordinator logs show the CancelBackends() call intermittently taking 10 seconds due to timing out in the RPC layer. As a temporary workaround, this reverts the core part of IMPALA-6984 that added that CancelBackends() call for EOS. It leaves the rest of IMPALA-6984 intact, as other code has built on top of it. Testing: - Core job - Performance tests Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 (cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55) --- M be/src/runtime/coordinator.cc 1 file changed, 1 insertion(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/16288/1 -- To view, visit http://gerrit.cloudera.org:8080/16288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8 Gerrit-Change-Number: 16288 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16278 ) Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6226/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 05 Aug 2020 00:44:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16286 ) Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 00:35:22 + Gerrit-HasComments: No
[Impala-ASF-CR] Add logging when query unregisteres
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16285 ) Change subject: Add logging when query unregisteres .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/16285/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16285/1//COMMIT_MSG@7 PS1, Line 7: Add logging when query unregisteres nit: unregisters -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 00:33:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16286 Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node .. IMPALA-10037: Remove flaky test_mt_dop_scan_node This test has inherent flakiness due to it relying on instances fetching scan ranges from a shared queue. Therefore, this patch removes the test since it was just a sanity check but its flakiness outweighed its usefulness. Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 --- M tests/query_test/test_mt_dop.py 1 file changed, 1 insertion(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/16286/1 -- To view, visit http://gerrit.cloudera.org:8080/16286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08 Gerrit-Change-Number: 16286 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
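The flakiness described in this change can be illustrated in general terms. The following is a minimal sketch, not the removed test and not Impala's scheduler: `drain_shared_queue` and its parameters are hypothetical names, and Python threads stand in for fragment instances. It shows that when workers pull items from one shared queue, the total is stable but the per-worker split depends on thread timing, so an assertion on any single worker's count cannot be reliable.

```python
import queue
import threading


def drain_shared_queue(num_ranges, num_instances):
    """Let num_instances workers pull num_ranges items from one shared queue.

    Returns the number of items each worker ended up processing.
    Hypothetical illustration only; none of these names come from Impala.
    """
    ranges = queue.Queue()
    for i in range(num_ranges):
        ranges.put(i)

    counts = [0] * num_instances

    def instance(idx):
        # Each worker keeps taking whatever is next until the queue is empty.
        while True:
            try:
                ranges.get_nowait()
            except queue.Empty:
                return
            counts[idx] += 1  # only this worker writes to counts[idx]

    threads = [threading.Thread(target=instance, args=(i,))
               for i in range(num_instances)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    # The split across workers may differ from run to run; the sum does not.
    print(drain_shared_queue(num_ranges=1000, num_instances=4))
```

Only the sum of the counts is deterministic here, which is why a sanity check that depends on how work is divided among instances tends to fail intermittently.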
[Impala-ASF-CR] Add logging when query unregisteres
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16285 Change subject: Add logging when query unregisteres .. Add logging when query unregisteres This adds a log line which is printed when a query is successfully unregistered by the async unregister thread pool. Added only for additional observability. Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 --- M be/src/service/impala-server.cc 1 file changed, 6 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16285/1 -- To view, visit http://gerrit.cloudera.org:8080/16285 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0 Gerrit-Change-Number: 16285 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16280 ) Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload. .. Patch Set 2: For the variants that are added, could you add a comment about what the change from the official version is? Also, we should add all the tpcds queries to the PlannerTest's tpcds-all.test such that we can track the Explains. Is there a separate JIRA for that? -- To view, visit http://gerrit.cloudera.org:8080/16280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad Gerrit-Change-Number: 16280 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 23:30:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. IMPALA-9984: Implement codegen for TupleIsNullPredicate This commit implements proper codegen for TupleIsNullPredicate. Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Reviewed-on: http://gerrit.cloudera.org:8080/16227 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt A be/src/exprs/tuple-is-null-predicate-ir.cc M be/src/exprs/tuple-is-null-predicate.cc M be/src/exprs/tuple-is-null-predicate.h 6 files changed, 152 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 23:10:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16280 ) Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload. .. Patch Set 2: Code-Review+1 LGTM. I can +2 but wanted to give others a chance to have a look. -- To view, visit http://gerrit.cloudera.org:8080/16280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad Gerrit-Change-Number: 16280 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 23:00:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py File docker/setup_build_context.py: http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@87 PS3, Line 87: .py > I'm fine with excluding all the python files from this GCC directory. I don +1, if we're copying them into the container, it's a mistake -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 22:58:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16269 ) Change subject: IMPALA-9909: Print body of http error code in Impala Shell. .. IMPALA-9909: Print body of http error code in Impala Shell. Make Impala Shell closer to Impyla by printing the body of any http error code message received when using hs2-over-http. The common case is that there is nothing in the body, in which case the behavior is unchanged. TESTING Added a test for the new functionality. Ran all end-to-end tests. Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Reviewed-on: http://gerrit.cloudera.org:8080/16269 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M shell/ImpalaHttpClient.py M tests/shell/test_shell_interactive.py 2 files changed, 90 insertions(+), 25 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16269 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Gerrit-Change-Number: 16269 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar
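The behavior described in this commit message (append the response body to the error when there is one, leave the message unchanged when the body is empty) might be sketched as follows. This is an illustration under assumptions, not the code in shell/ImpalaHttpClient.py: the function name `format_http_error` and the exact message layout are hypothetical.

```python
def format_http_error(code, reason, body=None):
    """Build an error string for an HTTP failure, adding the body if present.

    Hypothetical helper; the message layout is an assumption, not the
    format used by Impala Shell or Impyla.
    """
    message = "HTTP code {0}: {1}".format(code, reason)
    if body:
        # Response bodies usually arrive as bytes; decode defensively.
        if isinstance(body, bytes):
            body = body.decode("utf-8", "replace")
        body = body.strip()
        if body:
            message += " [{0}]".format(body)
    return message
```

With an empty or missing body the function returns the same string it would have produced without the body handling, matching the "behavior is unchanged" case in the commit message.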
[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16269 ) Change subject: IMPALA-9909: Print body of http error code in Impala Shell. .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16269 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Gerrit-Change-Number: 16269 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 04 Aug 2020 22:33:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16252 ) Change subject: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6225/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 Gerrit-Change-Number: 16252 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 22:32:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16252 ) Change subject: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6790/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 Gerrit-Change-Number: 16252 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 21:51:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 3: Code-Review+1 (2 comments) This makes sense to me. I can't think of any reason not to strip the debug symbols here, and it's great to save the space. http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py File docker/setup_build_context.py: http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@87 PS3, Line 87: .py > Do we need to spell "-gdb.py" out here? I'm fine with excluding all the python files from this GCC directory. I don't expect us to need any. http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@91 PS3, Line 91: check_call([STRIP, "--strip-debug", libstdcpp_so, "-o", : os.path.join(LIB_DIR, os.path.basename(libstdcpp_so))]) Nit: I think it would be good to factor out this strip call into a function similar to symlink_file_into_dir(). -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 21:51:03 + Gerrit-HasComments: Yes
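The nit above about factoring the strip call into a helper could look roughly like the sketch below. The helper name strip_debug_symbols_into_dir, the default value of STRIP, and the injectable run parameter are assumptions made for illustration and are not taken from the patch; only the `strip --strip-debug <src> -o <dest>` invocation comes from the quoted line.

```python
import os
from subprocess import check_call

# Assumed default; the real script defines its own STRIP constant.
STRIP = "strip"


def strip_debug_symbols_into_dir(src_file, dest_dir, strip_bin=STRIP, run=check_call):
    """Write a copy of src_file without debug symbols into dest_dir.

    Returns the destination path. `run` is injectable (hypothetical
    parameter) so the command can be inspected without executing strip.
    """
    dest = os.path.join(dest_dir, os.path.basename(src_file))
    # Same argument order as the call quoted in the review comment.
    run([strip_bin, "--strip-debug", src_file, "-o", dest])
    return dest
```

With such a helper, each library would need a single call, e.g. strip_debug_symbols_into_dir(libstdcpp_so, LIB_DIR), mirroring how symlink_file_into_dir() is described in the comment.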
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16278 ) Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6222/ -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 04 Aug 2020 21:41:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16252 to look at the new patch set (#2). Change subject: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user .. IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user This patch fixes the integration between LDAP filters and proxy users by ensuring that the 'impala.doas.user' HS2 config option is considered when applying filters. This requires deferring checking the filters until the OpenSession() call. This patch also introduces new flags --ldap_bind_dn and --ldap_bind_password which must be specified in order to use LDAP filters, unless the LDAP server is set up to allow anonymous binds. These config options are modeled after equivalent options in Hue: https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini#L425 Testing: - Added a test that uses the 'impala.doas.user' config with LDAP filters. Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 --- M be/src/rpc/authentication.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/ldap-util.cc M be/src/util/ldap-util.h M be/src/util/webserver.cc M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java M fe/src/test/java/org/apache/impala/testutil/LdapUtil.java 11 files changed, 200 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16252/2 -- To view, visit http://gerrit.cloudera.org:8080/16252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 Gerrit-Change-Number: 16252 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 27: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6789/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 20:40:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Qifan Chen has uploaded a new patch set (#27). ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans This work addresses the current limitation in computing the total row count for a Hive table in a scan. The row count can be incorrectly computed as 0, even though there exists data in the Hive table. This is the stats corruption at table level. Similar stats corruption exists for a partition. The row count of a table or a partition can sometimes also be -1, which indicates missing stats. In the fix, as long as no partition in a Hive table exhibits any missing or corrupt stats, the total row count for the table is computed from the row counts in all partitions. Otherwise, Impala looks at the table-level stats, particularly the table row count. In addition, if the table stats are missing or corrupted, Impala estimates a row count for the table, if feasible. This row count is the sum of the row count from the partitions with good stats, and an estimation of the number of rows in the partitions with missing or corrupt stats. Such estimation also applies when some partition has missing or corrupt stats. One way to observe the fix is through the explain of queries scanning Hive tables with missing or corrupted stats. The cardinality for any full scan should be a positive value (i.e. the estimated row count), instead of 'unavailable'. At the beginning of the explain output, that table is still listed in the WARNING section for potentially corrupt table statistics. Testing: 1. Ran unit tests with queries documented in the case against Hive tables with the following configurations: a. No stats corruption in any partitions b. Stats corruption in some partitions c. Stats corruption in all partitions 2. Added two new tests in test_compute_stats.py: a. test_corrupted_stats_in_partitioned_Hive_tables b. 
test_corrupted_stats_in_unpartitioned_Hive_tables 3. Fixed failures in corrupt-stats.test 4. Ran "core" test Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/metadata/test_compute_stats.py M tests/metadata/test_explain.py 13 files changed, 236 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/27 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
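The row-count rules in the IMPALA-9744 commit message can be sketched as below. This is an illustrative reading of the description, not the HdfsScanNode.java implementation: the Stats class, the has_good_stats test (0 rows alongside non-empty data treated as corrupt, -1 as missing), and the caller-supplied estimate_rows function are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stats:
    num_rows: int    # -1 means the stats are missing
    size_bytes: int  # total size of the underlying data files


def has_good_stats(s: Stats) -> bool:
    """Assumed check: stats are neither missing (-1) nor corrupt (0 rows with data)."""
    if s.num_rows < 0:
        return False
    if s.num_rows == 0 and s.size_bytes > 0:
        return False
    return True


def total_row_count(table: Stats, partitions: List[Stats],
                    estimate_rows: Callable[[Stats], int]) -> int:
    # Every partition has good stats: sum the partition row counts.
    if partitions and all(has_good_stats(p) for p in partitions):
        return sum(p.num_rows for p in partitions)
    # Otherwise fall back to the table-level row count if it is usable.
    if has_good_stats(table):
        return table.num_rows
    # Unpartitioned table with bad stats: estimate for the table as a whole.
    if not partitions:
        return estimate_rows(table)
    # Good partitions contribute their row counts; the rest are estimated.
    good = sum(p.num_rows for p in partitions if has_good_stats(p))
    estimated = sum(estimate_rows(p) for p in partitions if not has_good_stats(p))
    return good + estimated
```

The estimator is left to the caller because the message only says that a row count is estimated "if feasible" and does not describe how.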
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: (1 comment) Add one more comment. Thanks! http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3245 PS5, Line 3245: IS_OR_PREDICATE.appl If e is a conjunct, I think we also need to subject it to the intersection test. -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 19:47:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: (10 comments) Hi Xianqing, thank you so much for this contribution! I'll need to do another pass and go over the tests but here are some initial comments. http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@17 PS5, Line 17: one null rejecting condition on the inner table. Consider adding a query option like DISABLE_OUTER_TO_INNER_REWRITE to disable this optimization if needed at runtime. http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@34 PS5, Line 34: * Ran the full set of verifications in Impala Public Jenkins Please try out TPC-DS Q49; the LOJ queries in there should be rewritten. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3217 PS5, Line 3217: getWhereClauseConjuncts( Technically this would include having clause conjuncts as well, so it might be misleading to name this function getWhereClauseConjuncts. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3227 PS5, Line 3227: } As a further optimization you could use getEquivClassesOnTuples() to also check for null filtering conditions that come as a result of a transitive relationship. 
For example T1 LEFT OUTER JOIN T2 ON (T1.a = T2.a) JOIN T3 ON (T3.b=T2.b) WHERE T3.b > 10; http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3270 PS5, Line 3270: analyzeNoThrow > For some common SQL functions, we probably can directly test their existen Agreed, it would be good to have some static expressions that we know won't reject nulls, for example: 
col IS NULL, col1 IS DISTINCT FROM col2. For things like IN and COALESCE you would recursively check the children. IF and CASE are trickier so you might want to call the BE or just skip those. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3427 PS5, Line 3427: // Recompute the graph since we may need to add value-transfer edges based on the See later comment in Planner, but it might be better to return this and have the caller recompute the graph. http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java@980 PS4, Line 980: his instanceof CompoundPredicate : && ((CompoundPredicate) this).getOp() == CompoundPredicate.Operator.OR You could use Expr.IS_OR_PREDICATE(this) here. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java@978 PS5, Line 978: public List getDisjunctiveConjuncts() { There is something off about this interface. You assume that the caller, the first time this is called, has verified that the predicate is an OR. For example, if someone called this function with just a plain Expr then it would return the Expr back. You might want to move the IS_OR_PREDICATE call from Analyzer.java#3245 into its own wrapper function, which then calls this method. 
http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@748 PS5, Line 748: // Transform outer join into inner join whenever possible Might want to use some state in the analyzer to check if any Outer Joins exist in the query and only then call this function. For example globalstate_.outerJountTupleIds. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@749 PS5, Line 749: analyzer.simplifyOuterJoins(selectStmt.getTableRefs()); It would be good if you returned some indicator that the value transfer graph needs to be recomputed. Then recompute the graph here so you can mark the timeline event accordingly: ctx_.getTimeline().markEvent("Recomputing value transfer graph") Also if the SingleNodePlanner's valueTransferGraphNeedsUpdate_ was set to true you could likely reset it after you recompute the graph. -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/setti
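The interface concern raised about getDisjunctiveConjuncts() can be illustrated with a toy model. The sketch below is not Impala's Expr API: the Expr class, is_or_predicate, and get_disjuncts are hypothetical stand-ins, written in Python only to show a wrapper that performs the OR check itself before delegating to the recursive helper.

```python
class Expr:
    """Toy expression node: `op` is a label, `children` are sub-expressions."""

    def __init__(self, op, children=()):
        self.op = op
        self.children = list(children)


def is_or_predicate(expr):
    # Stand-in for the IS_OR_PREDICATE check mentioned in the review.
    return expr.op == "OR"


def _collect_disjuncts(expr):
    """Recursive helper: flatten nested ORs; a non-OR node is returned as-is."""
    if not is_or_predicate(expr):
        return [expr]
    result = []
    for child in expr.children:
        result.extend(_collect_disjuncts(child))
    return result


def get_disjuncts(expr):
    """Wrapper: verifies the top-level node is an OR before delegating.

    Returns an empty list for a plain expression instead of echoing it back.
    """
    if not is_or_predicate(expr):
        return []
    return _collect_disjuncts(expr)
```

Callers then no longer need to remember to test for an OR first, which is the precondition the review comment points out.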
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 18:00:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6224/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 18:00:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 17:59:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6788/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 17:58:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate
Daniel Becker has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16227 ) Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate .. IMPALA-9984: Implement codegen for TupleIsNullPredicate This commit implements proper codegen for TupleIsNullPredicate. Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt A be/src/exprs/tuple-is-null-predicate-ir.cc M be/src/exprs/tuple-is-null-predicate.cc M be/src/exprs/tuple-is-null-predicate.h 6 files changed, 152 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/16227/4 -- To view, visit http://gerrit.cloudera.org:8080/16227 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e Gerrit-Change-Number: 16227 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16269 ) Change subject: IMPALA-9909: Print body of http error code in Impala Shell. .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6223/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16269 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Gerrit-Change-Number: 16269 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 04 Aug 2020 17:19:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16269 ) Change subject: IMPALA-9909: Print body of http error code in Impala Shell. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16269 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Gerrit-Change-Number: 16269 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 04 Aug 2020 17:19:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6221/ -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 17:02:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16269 ) Change subject: IMPALA-9909: Print body of http error code in Impala Shell. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16269 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff Gerrit-Change-Number: 16269 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 04 Aug 2020 16:59:40 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16240 ) Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG@17 PS5, Line 17: To be specific, we save extra reservation for writing a large page. It's I'll need to look in more detail but I think the overall approach makes sense. http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG@35 PS5, Line 35: This patch also fixes the wrong assumption that non-streaming Maybe I missed something when I initially did this, but I didn't think we need to be able to fit all the hash tables in memory because we could repartition until we can fit a single partition in memory. I think this change is probably fine anyway, to avoid repartitioning, because the increase in reservation is very small. -- To view, visit http://gerrit.cloudera.org:8080/16240 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775 Gerrit-Change-Number: 16240 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 16:52:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 16:46:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16278 ) Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6222/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 16:38:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16280 ) Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload. .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6219/ -- To view, visit http://gerrit.cloudera.org:8080/16280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad Gerrit-Change-Number: 16280 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 16:38:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16278 ) Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6787/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 15:42:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 3: (2 comments) Looks good to me. http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG@9 PS3, Line 9: so Just wonder if some other .so files in toolchain are worth the stripping effort. [11:30:03 qchen@qifan-10229: Impala] find . -name lib*so -exec file {} \; | grep "not stripped" ./toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/lib/libthriftz-0.11.0.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/lib/libthrift-0.11.0.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.asan-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.dyndd-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.asan-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically 
linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.dyndd-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped ./toolchain/toolchain-packages-gcc7.5.0/gdb-7.9.1-p1/lib/libinproctrace.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libsureware.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libcswift.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/lib4758cca.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libaep.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libcapi.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libubsec.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libatalla.so: ELF 64-bit LSB shared 
object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libpadlock.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libnuron.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libchil.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libgmp.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped ./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libgost.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynami
[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16278 to look at the new patch set (#2). Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems .. IMPALA-10005: Fix Snappy decompression for non-block filesystems Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED type compression in the backend. However, for non-block filesystems, the frontend is incorrectly passing THdfsCompression::SNAPPY instead. On debug builds, this leads to a DCHECK when trying to read Snappy-compressed text. On release builds, it fails to decompress the data. This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED for Snappy-compressed text. This reworks query_test/test_compressed_formats.py to provide better coverage: - Changed the RC and Seq test cases to verify that the file extension doesn't matter. Added Avro to this case as well. - Fixed the text case to use appropriate extensions (fixing IMPALA-9004) - Changed the utility function so it doesn't use Hive. This allows it to be enabled on non-HDFS filesystems like S3. - Changed the test to use unique_database and allow parallel execution. - Changed the test to run in the core job, so it now has coverage on the usual S3 test configuration. It is reasonably quick (1-2 minutes) and runs in parallel. Testing: - Exhaustive job - Core s3 job - Changed the frontend to force it to use the code for non-block filesystems (i.e. the TFileSplitGeneratorSpec code) and verified that it is now able to read Snappy-compressed text. 
Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac --- M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java M tests/query_test/test_compressed_formats.py 2 files changed, 132 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/16278/2 -- To view, visit http://gerrit.cloudera.org:8080/16278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac Gerrit-Change-Number: 16278 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
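The rule this patch enforces can be sketched as follows. This is a minimal illustration, not the code in HdfsCompression.java: the `Compression` enum, the suffix table and `text_file_compression` are hypothetical names, and only the SNAPPY to SNAPPY_BLOCKED rule comes from the commit message.

```python
from enum import Enum


class Compression(Enum):
    """Stand-in for the THdfsCompression values named in the commit message."""
    NONE = "NONE"
    GZIP = "GZIP"
    SNAPPY = "SNAPPY"
    SNAPPY_BLOCKED = "SNAPPY_BLOCKED"


# Hypothetical suffix table, assumed for illustration only.
SUFFIX_TO_COMPRESSION = {
    "gz": Compression.GZIP,
    "snappy": Compression.SNAPPY,
}


def text_file_compression(file_name):
    """Return the compression type to hand to the backend for a text file."""
    suffix = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    codec = SUFFIX_TO_COMPRESSION.get(suffix, Compression.NONE)
    # Snappy-compressed text is always block-compressed, so plain SNAPPY is
    # never returned here, regardless of the filesystem holding the file.
    if codec is Compression.SNAPPY:
        return Compression.SNAPPY_BLOCKED
    return codec
```

Routing both the block and non-block filesystem paths through one function like this is one way to keep the two from diverging again.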
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: (14 comments) Thanks for the work. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3233 PS5, Line 3233: where clause Suggest removing this to make the comment more precise. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3234 PS5, Line 3234: . Suggest adding a comment here to describe the use of the method, such as: This method identifies null-rejecting predicates, which are required to convert an outer join to an inner join. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3240 PS5, Line 3240: contains nit: containing http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3242 PS5, Line 3242: t1.v1 Do you mean t2.v2? http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3254 PS5, Line 3254: intersect.isEmpty() For any disConjunct, when "ids intersect disConjunct != disConjunct", the disConjunct should be skipped. The test here does not seem sufficient. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3264 PS5, Line 3264: ULL input, eg, ISNULL(), IFNULL(), ZEROIFNULL(). We may need to reject UDFs, as these functions can maintain state, which could allow the function to return different outputs for a given input. That is, we cannot guarantee that such a UDF would not produce a NULL given a NULL input. 
http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3270 PS5, Line 3270: analyzeNoThrow For some common SQL functions, we probably can directly test their existence and bypass the evaluation logic, assuming the evaluation during compile time is relatively expensive. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3278 PS5, Line 3278: if (!isTr It is a good idea to add a comment here. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3289 PS5, Line 3289: ex); We probably should return false here. http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3361 PS5, Line 3361: inner null-filling table http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3362 PS5, Line 3362: inner null-filling http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3364 PS5, Line 3364: inner null-filling http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3365 PS5, Line 3365: null filtering null-rejecting http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3375 PS5, Line 3375: case INNER_JOIN: { : break; Probably can be moved to the last 'default' section (of switch). -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 15:10:27 + Gerrit-HasComments: Yes
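The null-rejection check discussed in these comments can be illustrated with a small sketch. It is not the Analyzer.java logic: the helpers `sql_gt`, `sql_is_null`, `zeroifnull` and `is_null_rejecting` are hypothetical, and SQL NULL is modelled as Python `None`. The idea is to evaluate the predicate with every inner-table column set to NULL and treat it as null-rejecting unless the result is TRUE.

```python
def sql_gt(a, b):
    """SQL-style '>' comparison: any NULL operand yields NULL (None)."""
    return None if a is None or b is None else a > b


def sql_is_null(a):
    """SQL 'IS NULL': always TRUE or FALSE, never NULL."""
    return a is None


def zeroifnull(a):
    """Model of a function that maps a NULL input to a non-NULL value."""
    return 0 if a is None else a


def is_null_rejecting(predicate, inner_columns):
    """True if the predicate cannot be TRUE when all inner columns are NULL.

    A WHERE clause drops rows whose predicate is FALSE or NULL, so such a
    predicate filters out every null-extended row of the outer join.
    """
    null_row = {col: None for col in inner_columns}
    return predicate(null_row) is not True


# B.v > 10 evaluates to NULL on a null-extended row: null-rejecting.
plain_comparison = lambda row: sql_gt(row["v"], 10)
# B.v IS NULL is TRUE on a null-extended row: not null-rejecting.
is_null_test = lambda row: sql_is_null(row["v"])
# ZEROIFNULL(B.v) > -1 is TRUE on a null-extended row: not null-rejecting.
wrapped_comparison = lambda row: sql_gt(zeroifnull(row["v"]), -1)
```

This also shows why functions such as ZEROIFNULL() need special care: they can turn a NULL input into a value that makes the predicate TRUE.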
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6786/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 13:34:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses a current limitation of the admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. The new memory consumption statistics cover the following: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with the highest memory consumption total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimum memory consumption of all running queries max: the maximum memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request times out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. 2. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 828 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/22 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
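The statistics described in this commit message can be sketched as below. This is an illustration only, not the C++ code in admission-controller.cc: the function names, the dictionary layout and the `mem_by_query` input (query id to memory consumed) are assumptions, with field names borrowed from the example log lines.

```python
import heapq


def topn_query_stats(mem_by_query, pool_mem_usage=None, n=5):
    """Summarize the n queries with the highest memory consumption."""
    top = heapq.nlargest(n, mem_by_query.items(), key=lambda kv: kv[1])
    total = sum(mem for _, mem in top)
    stats = {
        "queries": [query_id for query_id, _ in top],
        "total_mem_consumed": total,
    }
    # The fraction is only reported when the pool total is known and non-zero.
    if pool_mem_usage:
        stats["fraction_of_pool_total_mem"] = total / pool_mem_usage
    return stats


def all_query_stats(mem_by_query):
    """Summarize memory consumption over all running queries."""
    values = list(mem_by_query.values())
    stats = {"num_running": len(values)}
    # min, max and average are undefined for an empty set of queries.
    if values:
        total = sum(values)
        stats.update(
            min=min(values),
            max=max(values),
            total_mem_consumed=total,
            average=total / len(values),
        )
    return stats
```

Calling both functions once per pool on a host would give something like pool_stats_per_host, and once per host in a pool something like host_stats_per_pool.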
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6785/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 17 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 04 Aug 2020 13:08:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 17: (4 comments) Hi anjalinorwood, thanks for your review! http://gerrit.cloudera.org:8080/#/c/16143/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16143/16//COMMIT_MSG@29 PS16, Line 29: We achieved this function by treating the iceberg table as normal > Are there plans to support read of Iceberg table as a partitioned table? Th Yes, you are right, a partitioned table is useful for query planning. But we may not consider this in our first version, since it's hard to treat an Iceberg table as a partitioned HDFS table. We will definitely do this in the next version, including compute incremental stats, query plan optimization for Iceberg and so on. This patch is just a simple version that scans an Iceberg table. http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java@80 PS16, Line 80: } > The double negative is pretty hard to parse. Done http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@139 PS16, Line 139: if (!(predicate.getChild(0) instanceof SlotRef)) return false; > Can the predicate be of the form: '10 = p1'? In that case, should there be Yes, it can. I've already verified this by debugging: a query such as select * from table where 0=id also pushes the predicate down to Iceberg. 
http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@114 PS16, Line 114: if ("PARQUET".equalsIgnoreCase(format)) return TIcebergFileFormat.PARQUET; > Rest of the code seems to support Iceberg ORC file format. This code does n I supported the ORC format in the original version, but when I tested scanning an Iceberg table with ORC, I hit an exception: https://issues.apache.org/jira/browse/IMPALA-9967 So I removed it; more file formats may be supported in another patch. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 17 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 04 Aug 2020 12:40:27 + Gerrit-HasComments: Yes
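The '10 = p1' question above concerns predicates whose literal is on the left. One way such a predicate could be made to pass a check like the one quoted at line 139 is to normalize it first so the column reference is always the first child. The sketch below is an assumption for illustration, not a description of what the patch or Impala's analyzer actually does; `SlotRef`, `Literal`, `BinaryPredicate`, `normalize` and `can_push_down` are simplified stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotRef:
    column: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class BinaryPredicate:
    op: str
    left: object
    right: object


# Swapping the operands of a comparison requires mirroring the operator.
FLIPPED_OP = {"=": "=", "!=": "!=", "<": ">", "<=": ">=", ">": "<", ">=": "<="}


def normalize(pred):
    """Rewrite 'literal op column' as 'column flipped_op literal'."""
    if isinstance(pred.left, Literal) and isinstance(pred.right, SlotRef):
        return BinaryPredicate(FLIPPED_OP[pred.op], pred.right, pred.left)
    return pred


def can_push_down(pred):
    """Accept only 'column op literal' after normalization."""
    pred = normalize(pred)
    return isinstance(pred.left, SlotRef) and isinstance(pred.right, Literal)
```

With this normalization, `0 = id` and `id = 0` are treated identically, and `0 < id` becomes `id > 0`.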
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Hello Zoltan Borok-Nagy, Anonymous Coward (606), Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16143 to look at the new patch set (#17). Change subject: IMPALA-9741: Support querying Iceberg table by impala .. IMPALA-9741: Support querying Iceberg table by impala This patch mainly implements querying of Iceberg tables through Impala. The following SQL creates an external Iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or specify just the table name and location, like this: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format used by Iceberg. Currently only PARQUET is supported; other formats will be supported in the future. If you don't specify this property in your SQL, the default file format is PARQUET. We achieve this by treating the Iceberg table as a normal unpartitioned HDFS table. When querying an Iceberg table, we push down partition column predicates to Iceberg to decide which data files need to be scanned, and then pass this information to the BE to do the actual scan. 
Testing: - Unit test for Iceberg in FileMetadataLoaderTest - Create table tests in functional_schema_template.sql - Iceberg table query test in test_scanners.py Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M bin/rat_exclude_files.txt M common/thrift/CatalogObjects.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java M testdata/data/README A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet A testdata/data/
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. Patch Set 2: Code-Review+1 LGTM! -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 12:12:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6784/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 12:10:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6783/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 12:04:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6221/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 11:51:25 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP
Xianqing He has abandoned this change. ( http://gerrit.cloudera.org:8080/15614 ) Change subject: WIP .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/15614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I567bdcad0bcdbfeb539ed590e509533228cb528c Gerrit-Change-Number: 15614 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6220/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 11:44:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16228 to look at the new patch set (#6). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) This implements scanning full ACID tables that contain complex types. The same technique that we use for primitive types works here: we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract the deleted rows from the inserted rows. However, there were some types of queries where we couldn't do that. These are the queries that scan the nested collection items directly. E.g.: SELECT item FROM complextypestbl.int_array; The above query only creates a single tuple descriptor that holds the collection items. Since this tuple descriptor is not at the table level, we cannot add slot references to the hidden ACID columns, which are at the top level of the table schema. To resolve this I added a statement rewriter that rewrites the above statement to the following: SELECT item FROM complextypestbl $a$1, $a$1.int_array; Now in this example we'll have two tuple descriptors, one for the table level and one for the collection item. So we can add the ACID slot refs to the table-level tuple descriptor. The rewrite is implemented by the new AcidRewriter class. 
Testing * Added planner tests to PlannerTest/acid-scans.test * E2E query tests to QueryTest/full-acid-complex-type-scans.test * E2E tests for rowid-generation: QueryTest/full-acid-rowid.test Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test M tests/query_test/test_acid.py 13 files changed, 923 insertions(+), 48 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/6 -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
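The LEFT ANTI JOIN step described in this commit message can be pictured with a small sketch. It is a simplified model, not Impala's join implementation: `left_anti_join` is a hypothetical helper, rows are plain dictionaries, and the row identifier is assumed to be an (original transaction, bucket, row id) triple.

```python
def left_anti_join(inserted_rows, deleted_row_ids, row_id):
    """Keep the inserted rows whose identifier has no match among the deletes.

    inserted_rows: rows produced by scanning the insert files.
    deleted_row_ids: identifiers read from the delete files.
    row_id: function extracting the identifier from an inserted row.
    """
    deleted = set(deleted_row_ids)
    return [row for row in inserted_rows if row_id(row) not in deleted]


# Identifiers are (original_transaction, bucket, row_id) triples (assumed layout).
inserted = [
    {"acid_id": (1, 0, 0), "int_array": [1, 2, 3]},
    {"acid_id": (1, 0, 1), "int_array": []},
    {"acid_id": (2, 0, 0), "int_array": [7]},
]
deleted_ids = [(1, 0, 1)]

visible = left_anti_join(inserted, deleted_ids, lambda row: row["acid_id"])
```

The identifier lives on the table-level row, which is why a query that only materializes collection items needs the rewrite described above before the anti join can be applied.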
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Xianqing He has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. IMPALA-5022: Outer join simplification As a general rule, an outer join can be converted to an inner join if there is a condition on the inner table that filters out non‑matching rows. In a left outer join, the right table is the inner table, while it is the left table in a right outer join. In a full outer join, both tables are inner tables. Conditions that are FALSE for nulls are referred to as null-rejecting (null-filtering) conditions, and these are the conditions that enable the outer‑to‑inner join conversion to be made. An outer join can be converted to an inner join if the WHERE clause contains at least one null-rejecting condition on the inner table. For example, 1. A LEFT JOIN B ON A.id = B.id WHERE B.v > 10 = A INNER JOIN B ON A.id = B.id WHERE B.v > 10 2. A RIGHT JOIN B ON A.id = B.id WHERE A.v > 10 = A INNER JOIN B ON A.id = B.id WHERE A.v > 10 3. A FULL JOIN B ON A.id = B.id WHERE A.v > 10 = A LEFT JOIN B ON A.id = B.id WHERE A.v > 10 4. A FULL JOIN B ON A.id = B.id WHERE B.v > 10 = A RIGHT JOIN B ON A.id = B.id WHERE B.v > 10 5. 
A FULL JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10 = A INNER JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10 Tests: * Update the baseline plan Tests * Add some plan tests in outer-joins.test * Ran the full set of verifications in Impala Public Jenkins Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test 21 files changed, 1,544 insertions(+), 967 
deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/16266/5 -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 5 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xianqing He
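The conversion rules in the IMPALA-5022 commit message can be summarised as a small decision function. The following is only an illustrative sketch in Python: the function name `simplify_outer_join` and its parameters are hypothetical and are not taken from the patch, whose actual implementation lives in the Java frontend. It assumes the caller has already determined which side of the join has a null-rejecting WHERE condition.

```python
# Illustrative sketch only. The names here are hypothetical and do not come
# from the Impala patch; they just encode the rules listed in the commit message.

def simplify_outer_join(join_type, null_rejecting_on):
    """Return the join type after outer join simplification.

    join_type: "LEFT", "RIGHT", "FULL" or "INNER".
    null_rejecting_on: set containing "left" and/or "right", naming the tables
        that have at least one null-rejecting condition in the WHERE clause.
    """
    left_rejects = "left" in null_rejecting_on
    right_rejects = "right" in null_rejecting_on

    if join_type == "LEFT":
        # The right table is the inner table of a left outer join.
        return "INNER" if right_rejects else "LEFT"
    if join_type == "RIGHT":
        # The left table is the inner table of a right outer join.
        return "INNER" if left_rejects else "RIGHT"
    if join_type == "FULL":
        if left_rejects and right_rejects:
            return "INNER"
        if left_rejects:
            # Rows whose left columns are NULL-extended are filtered out anyway.
            return "LEFT"
        if right_rejects:
            return "RIGHT"
        return "FULL"
    # Inner joins (and anything else) are left unchanged.
    return join_type


# The five examples from the commit message, with A as the left table.
print(simplify_outer_join("LEFT", {"right"}))
print(simplify_outer_join("FULL", {"left", "right"}))
```

Keeping the rule as a pure function of the join type and the set of null-rejecting sides makes it easy to check each numbered example independently of how the planner detects null-rejecting predicates.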
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6218/ -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 11:35:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16280 ) Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6219/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad Gerrit-Change-Number: 16280 Gerrit-PatchSet: 2 Gerrit-Owner: Shant Hovsepian Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 11:32:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6782/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 10:49:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6781/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 10:42:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16228 to look at the new patch set (#5). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) This implements scanning full ACID tables that contain complex types. The same technique that we use for primitive types works here, i.e. we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract the deleted rows from the inserted rows. However, there were some types of queries where we couldn't do that: the queries that scan the nested collection items directly. E.g.: SELECT item FROM complextypestbl.int_array; The above query only creates a single tuple descriptor that holds the collection items. Since this tuple descriptor is not at the table level, we cannot add slot references to the hidden ACID columns, which are at the top level of the table schema. To resolve this I added a statement rewriter that rewrites the above statement to the following: SELECT item FROM complextypestbl $a$1, $a$1.int_array; Now in this example we'll have two tuple descriptors, one for the table level and one for the collection item, so we can add the ACID slot refs to the table-level tuple descriptor. The rewrite is implemented by the new AcidRewriter class.
Testing * Added planner tests to PlannerTest/acid-scans.test * E2E query tests to QueryTest/full-acid-complex-type-scans.test * E2E tests for rowid-generation: QueryTest/full-acid-rowid.test Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test M tests/query_test/test_acid.py 12 files changed, 922 insertions(+), 48 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/5 -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
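The FROM-clause rewrite described in this commit message can be pictured with a few lines of Python. This is a hedged sketch, not the AcidRewriter code: the function `split_collection_ref`, the `table_path_len` parameter (how many leading path components name the table, assumed to be known from the catalog) and the alias counter are all hypothetical. Only the `$a$N` alias shape and the example query come from the message above.

```python
import itertools

# Illustrative sketch only. split_collection_ref and its parameters are
# hypothetical and are not the patch's Java API.

def split_collection_ref(raw_path, table_path_len, alias_counter):
    """Split a direct collection reference into a table ref and a relative ref.

    raw_path: path components from the FROM clause,
        e.g. ["complextypestbl", "int_array"].
    table_path_len: number of leading components that name the table itself
        (assumed to be resolved elsewhere).
    alias_counter: iterator that yields the numbers used in generated aliases.
    """
    if len(raw_path) <= table_path_len:
        # A plain table reference: nothing to split.
        return [".".join(raw_path)]
    alias = "$a$%d" % next(alias_counter)
    table_ref = ".".join(raw_path[:table_path_len]) + " " + alias
    # The collection is now referenced relative to the table alias.
    collection_ref = ".".join([alias] + raw_path[table_path_len:])
    return [table_ref, collection_ref]


counter = itertools.count(1)
refs = split_collection_ref(["complextypestbl", "int_array"], 1, counter)
print("SELECT item FROM " + ", ".join(refs) + ";")
```

After the split there is a table-level reference in the FROM clause, which is what gives the planner a table-level tuple descriptor to attach the ACID slot refs to.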
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6218/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 10:23:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 5: PS5 is a rebase. -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 10:23:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
..

Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@385
PS2, Line 385: require
> nit: typo
Done

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1516
PS2, Line 1516: for (int i = 0; i < stmt.fromClause_.size(); ++i) {
              :   TableRef tblRef = stmt.fromClause_.get(i);
> nit: you can iterate over fromClause_.getTableRefs() and then you can use a
splitCollectionRef() needs the index of the table ref.

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1541
PS2, Line 1541: int tableRefIdx
> Instead of the index you can use the CollectionTableRef itself as a param.
Yeah, I need the index to modify the FromClause.

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1556
PS2, Line 1556: Preconditions.checkSta
> Could you add a comment what is at position '0' here? (I guess in L1553 it'
Done

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1576
PS2, Line 1576: return rawTblPath;
> Shouldn't this function belong to TableRef as a static member function?
Done

http://gerrit.cloudera.org:8080/#/c/16228/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16228/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1508
PS3, Line 1508: * SELECT item FROM complextypestbl $a$1, $a$1.int_array;
> I need to understand the current complex types support (independent of ACID
Complex types are evaluated in a subplan, like the following:

|  01:SUBPLAN
|  |  row-size=16B cardinality=25.68K
|  |
|  |--04:NESTED LOOP JOIN [CROSS JOIN]
|  |  |  row-size=16B cardinality=10
|  |  |
|  |  |--02:SINGULAR ROW SRC
|  |  |     row-size=12B cardinality=1
|  |  |
|  |  03:UNNEST [$a$1.int_array int_array]
|  |     row-size=0B cardinality=10
|  |
|  00:SCAN HDFS [functional_orc_def.complextypestbl $a$1]
|     HDFS partitions=1/1 files=2 size=4.04KB
|     predicates: !empty($a$1.int_array)
|     row-size=12B cardinality=2.57K

The left side of the SUBPLAN is the "input". The right side is the "subplan tree", it processes rows one-by-one from the "input". And the subplan will emit rows produced by the "subplan tree".

So in this case the right side's SINGULAR ROW SRC node and UNNEST node will be fed by (nested) rows coming from SCAN HDFS. UNNEST will create a row for each collection item, SINGULAR ROW SRC just holds the current row, and the NESTED LOOP JOIN will produce the unnested/flat rows.

So it won't do huge CROSS JOINs, but yeah, this rewrite definitely adds some overhead. But only to some type of queries, i.e. queries that only refer to the items of a collection. I think the majority of complex type queries are not like that, so they won't be affected.
-- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 04 Aug 2020 10:17:02 + Gerrit-HasComments: Yes
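The subplan evaluation described in the comment above can be mimicked with plain Python generators. This is a toy model under stated assumptions, not Impala's executor: rows are dictionaries, the collection is a Python list, and the function names `scan`, `unnest` and `subplan` are made up for illustration. It shows why the CROSS JOIN stays small: for each input row, the join pairs exactly one "current row" with that row's own collection items.

```python
# Toy model of SUBPLAN / SINGULAR ROW SRC / UNNEST / NESTED LOOP JOIN.
# All names are hypothetical; this is not how Impala's backend is written.

def scan(rows, collection_name):
    """SCAN HDFS with the predicate !empty(collection)."""
    for row in rows:
        if row.get(collection_name):
            yield row


def unnest(row, collection_name):
    """UNNEST: one output row per collection item of the current input row."""
    for item in row[collection_name]:
        yield {"item": item}


def subplan(rows, collection_name):
    """SUBPLAN: run the subplan tree once for every row of the input."""
    for current in scan(rows, collection_name):
        # SINGULAR ROW SRC holds just the current input row.
        singular_row_src = [current]
        # NESTED LOOP JOIN [CROSS JOIN] of that single row with its own items.
        for outer in singular_row_src:
            scalars = {k: v for k, v in outer.items() if k != collection_name}
            for inner in unnest(current, collection_name):
                yield {**scalars, **inner}


table = [
    {"id": 1, "int_array": [1, 2, 3]},
    {"id": 2, "int_array": []},
    {"id": 3, "int_array": [7]},
]
for flat_row in subplan(table, "int_array"):
    print(flat_row)
```

The number of output rows is the total number of collection items, not the product of table size and collection size, which matches the remark that the rewrite does not produce huge cross joins.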
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16228 to look at the new patch set (#4). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) This implements scanning full ACID tables that contain complex types. The same technique that we use for primitive types works here, i.e. we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract the deleted rows from the inserted rows. However, there were some types of queries where we couldn't do that: the queries that scan the nested collection items directly. E.g.: SELECT item FROM complextypestbl.int_array; The above query only creates a single tuple descriptor that holds the collection items. Since this tuple descriptor is not at the table level, we cannot add slot references to the hidden ACID columns, which are at the top level of the table schema. To resolve this I added a statement rewriter that rewrites the above statement to the following: SELECT item FROM complextypestbl $a$1, $a$1.int_array; Now in this example we'll have two tuple descriptors, one for the table level and one for the collection item, so we can add the ACID slot refs to the table-level tuple descriptor. The rewrite is implemented by the new AcidRewriter class.
Testing * Added planner tests to PlannerTest/acid-scans.test * E2E query tests to QueryTest/full-acid-complex-type-scans.test * E2E tests for rowid-generation: QueryTest/full-acid-rowid.test Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test M tests/query_test/test_acid.py 12 files changed, 924 insertions(+), 48 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/4 -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6780/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 10:05:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16259 ) Change subject: IMPALA-9963: Implement ds_kll_n() function .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6779/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781 Gerrit-Change-Number: 16259 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 10:05:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h@47 PS1, Line 47: not, the > nit: missing comma Thx, done. -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 09:37:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Hello Adam Tamas, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16283 to look at the new patch set (#2). Change subject: IMPALA-10018: Implement ds_kll_rank() function .. IMPALA-10018: Implement ds_kll_rank() function ds_kll_rank() receives two parameters: a STRING that represents a serialized DataSketches KLL sketch and a float that provides a probing value in the sketch. It returns a DOUBLE that is the rank of the given probing value, in the range [0,1]. E.g. a return value of 0.2 means that the probing value given as parameter is greater than 20% of all the values in the sketch. Note that this is an approximate calculation. Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 4 files changed, 76 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/16283/2 -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins
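To make the meaning of the returned rank concrete, here is a small Python illustration that computes the exact rank over a plain list of values. It is not the KLL algorithm and not the function added by this change: a KLL sketch only approximates this quantity from a compact summary. The name `exact_rank` is hypothetical, and the use of a strict "less than" comparison is an assumption based on the "greater than 20% of all the values" wording in the commit message.

```python
# Illustration of what a rank in [0, 1] means, computed exactly on raw values.
# This is NOT the KLL sketch algorithm; exact_rank is a hypothetical helper.

def exact_rank(values, probe):
    """Return the fraction of values strictly smaller than probe.

    Returns None for an empty input, since no rank is defined in that case.
    """
    if not values:
        return None
    smaller = sum(1 for v in values if v < probe)
    return smaller / len(values)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Two of the ten values (1 and 2) are smaller than the probing value 3.
print(exact_rank(data, 3))
```

A sketch-based function trades this exact answer for a bounded-size summary, so its result for the same data may differ slightly from the exact fraction.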
[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function
Hello Adam Tamas, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16259 to look at the new patch set (#4). Change subject: IMPALA-9963: Implement ds_kll_n() function .. IMPALA-9963: Implement ds_kll_n() function This function receives a serialized Apache DataSketches KLL sketch and returns how many input values were fed into this sketch. Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781 --- M be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 5 files changed, 56 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/16259/4 -- To view, visit http://gerrit.cloudera.org:8080/16259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781 Gerrit-Change-Number: 16259 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. Patch Set 1: (1 comment) Hi Gabor, Thank you for the good work with the KLL functions. Apart from a general nit, it looks good to me. http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h@47 PS1, Line 47: not then nit: missing comma As far as I see, it is missing in every comment where this sentence is used. -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 08:39:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6778/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 04 Aug 2020 08:19:13 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16240 ) Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6777/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16240 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775 Gerrit-Change-Number: 16240 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 08:15:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function
Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16283 ) Change subject: IMPALA-10018: Implement ds_kll_rank() function .. IMPALA-10018: Implement ds_kll_rank() function ds_kll_rank() receives two parameters: a STRING that represents a serialized DataSketches KLL sketch and a float that provides a probing value in the sketch. It returns a DOUBLE that is the rank of the given probing value, in the range [0,1]. E.g. a return value of 0.2 means that the probing value given as parameter is greater than 20% of all the values in the sketch. Note that this is an approximate calculation. Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 4 files changed, 71 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/16283/1 -- To view, visit http://gerrit.cloudera.org:8080/16283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0 Gerrit-Change-Number: 16283 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab
[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16240 to look at the new patch set (#5).

Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator
..

WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator

The minimum requirement for a spillable operator is ((min_buffers - 2) * default_buffer_size) + 2 * max_row_size. In the min reservation, we only reserve space for two large pages, one for reading, the other for writing. However, to make the non-streaming GroupingAggregator work correctly, we have to manage these extra reservations carefully, so that it won't run out of the min reservation when it actually needs to spill a large page or read a large page.

To be specific, we save extra reservation for writing a large page. It's only used when we run out of unused reservation and fail to increase the reservation to fit the large page. Currently there are two such cases in the non-streaming GroupingAggregator. One case is when we start to spill a partition and a serialize stream is needed to write some large pages. The other case is when we have spilled all partitions in a repartition process and need to write a large page to a spilled partition. Note that each spilled partition in the repartition process still keeps the default_page_size worth of reservation for writing a default page. We can only restore the extra reservation when a partition is actually writing a large page, and then reclaim it after the writing.

The same applies to the extra reservation for reading a large page. In the repartition process, we may read large pages from the input stream (from a previously spilled partition). When it needs to pin the current large page, we restore the extra reservation, and then reclaim it when the attached row batch is reset.

This patch also fixes the wrong assumption that the non-streaming GroupingAggregator only requires one buffer reservation for the hash tables. The minimal spillable buffer size is 64KB, while the minimal requirement of a non-streaming GroupingAggregator's hash tables is num_buckets(1024) * bucket_size(16) * partition_fanout(16) = 256KB. We should reserve more buffers when the spillable buffer size is small. Fix some planner test failures due to this change.

Tests:
- Add tests in test_spilling.py to verify GroupingAggregator works in min reservation.

Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/grouping-aggregator-ir.cc
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-large-rows.test
17 files changed, 486 insertions(+), 176 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16240/5
-- To view, visit http://gerrit.cloudera.org:8080/16240 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775 Gerrit-Change-Number: 16240 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
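The two reservation formulas quoted in this commit message can be checked with a few lines of arithmetic. The Python below is only a worked example: the helper names are hypothetical and the sample inputs passed to `min_spillable_reservation` (16 buffers, 512KB maximum row size) are made-up values, not numbers from the patch. The constants 1024, 16, 16 and the 64KB buffer size are the ones stated in the message.

```python
# Worked arithmetic for the formulas in the commit message.
# Function names and sample inputs are hypothetical, for illustration only.

KB = 1024


def min_spillable_reservation(min_buffers, default_buffer_size, max_row_size):
    """((min_buffers - 2) * default_buffer_size) + 2 * max_row_size, in bytes."""
    return (min_buffers - 2) * default_buffer_size + 2 * max_row_size


def hash_table_buffers(buffer_size, num_buckets=1024, bucket_size=16,
                       partition_fanout=16):
    """Return (bytes needed by the hash tables, buffers needed to hold them)."""
    required = num_buckets * bucket_size * partition_fanout
    # Integer ceiling division: round up to a whole number of buffers.
    buffers = -(-required // buffer_size)
    return required, buffers


required, buffers = hash_table_buffers(64 * KB)
print(required // KB, "KB of hash tables need", buffers, "buffers of 64KB")

# Sample (made-up) inputs: 16 buffers of 64KB and a 512KB maximum row size.
print(min_spillable_reservation(16, 64 * KB, 512 * KB), "bytes")
```

With the minimal 64KB spillable buffer, the 256KB of hash tables needs four buffers rather than one, which is the wrong assumption the message says the patch corrects.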
[Impala-ASF-CR] IMPALA-5022: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022: Outer join simplification .. Patch Set 4: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6217/ -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 4 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Tue, 04 Aug 2020 07:44:51 + Gerrit-HasComments: No