[Impala-ASF-CR] IMPALA-8304: Generate JUnitXML if a command run by CMake fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12668 ) Change subject: IMPALA-8304: Generate JUnitXML if a command run by CMake fails .. Patch Set 6: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6393/ -- To view, visit http://gerrit.cloudera.org:8080/12668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Gerrit-Change-Number: 12668 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 03 Sep 2020 04:34:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16408 ) Change subject: IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7082/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c4ffe8064d3e099a525cc48c218ef73112fb67b Gerrit-Change-Number: 16408 Gerrit-PatchSet: 3 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 03 Sep 2020 04:18:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats
Qifan Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16408 Change subject: IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats .. IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats This work addresses the data race by properly initializing two data members, is_query_mem_tracker_ and query_id_, in a constructor for the MemTracker class. Without doing so, the two data members are set after the object is constructed. This creates a race condition for other threads to modify either of them at the same time. Testing: 1. Ran the python admission controller test successfully with a tsan build. Data race was not observed with the enhancement. Data race was observed without the enhancement. 2. Ran the core test. Change-Id: I9c4ffe8064d3e099a525cc48c218ef73112fb67b --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h 2 files changed, 8 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/16408/3 -- To view, visit http://gerrit.cloudera.org:8080/16408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9c4ffe8064d3e099a525cc48c218ef73112fb67b Gerrit-Change-Number: 16408 Gerrit-PatchSet: 3 Gerrit-Owner: Qifan Chen
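The idea in the IMPALA-10129 description above (set the fields in the constructor rather than after it) can be illustrated with a small Python sketch. This is not Impala's code: the classes `Registry`, `LateInitTracker`, and `EarlyInitTracker` are hypothetical, and the sketch assumes the object becomes visible to a reader as soon as it is registered with a shared parent. The "reader" is simulated deterministically inside `register` instead of using real threads.

```python
class Registry:
    """Hypothetical stand-in for a shared parent that other threads may read."""

    def __init__(self):
        self.children = []
        self.observed = []

    def register(self, child):
        self.children.append(child)
        # Simulate a concurrent reader that inspects the child the moment
        # it becomes visible through the shared list.
        self.observed.append((child.is_query_mem_tracker, child.query_id))


class LateInitTracker:
    """Fields get their real values only after construction (racy pattern)."""

    def __init__(self, registry):
        self.is_query_mem_tracker = False
        self.query_id = None
        registry.register(self)  # visible before the fields are final


class EarlyInitTracker:
    """Fields are final before the object is visible (the pattern described)."""

    def __init__(self, registry, query_id):
        self.is_query_mem_tracker = True
        self.query_id = query_id
        registry.register(self)


def make_late(registry, query_id):
    """Construct first, then assign the fields, as in the pre-fix pattern."""
    tracker = LateInitTracker(registry)
    tracker.is_query_mem_tracker = True
    tracker.query_id = query_id
    return tracker


late_registry = Registry()
make_late(late_registry, "q1")
early_registry = Registry()
EarlyInitTracker(early_registry, "q1")
```

In the late-initialization variant the simulated reader sees placeholder values; in the constructor-initialized variant it sees the final ones, which is the window the change is described as closing.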
[Impala-ASF-CR] IMPALA-10116: Allow unwrapping a builtin cast function similar to CastExpr
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16407 ) Change subject: IMPALA-10116: Allow unwrapping a builtin cast function similar to CastExpr .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7081/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16407 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf82b2de78c6a7051ea036062f177d69e2558940 Gerrit-Change-Number: 16407 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 03 Sep 2020 01:56:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10116: Allow unwrapping a builtin cast function similar to CastExpr
Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16407 Change subject: IMPALA-10116: Allow unwrapping a builtin cast function similar to CastExpr .. IMPALA-10116: Allow unwrapping a builtin cast function similar to CastExpr This change allows unwrapping a builtin cast function such as casttobigint(col) similar to a CAST(col as bigint). Unwrapping is useful to access the SlotRef of the column and this in turn is needed to compute predicate selectivity correctly. Without unwrapping, the cast function uses the default 10% selectivity for a predicate such as 'casttobigint(l_quantity) is NOT NULL', which is not accurate. Note that Impala does not allow a user query to directly call the builtin cast function; rather, they have to use the explicit CAST syntax. However, since the frontend jar can be used by an external frontend module as a library, the builtin function can be called and this patch makes the behavior consistent. Testing: - Ran PlannerTest - Manual testing by commenting out the code in FunctionCallExpr.analyzeImpl() that throws an AnalysisException if a builtin cast function is called. I haven't added a new test for this reason. 
Cardinality before this change: explain select * from date_dim d1, date_dim d2 where d1.d_week_seq = d2.d_week_seq - 52 and casttobigint(d1.d_week_seq) is not null and casttobigint(d2.d_week_seq) is not null SCAN HDFS [tpcds.date_dim d1] HDFS partitions=1/1 files=1 size=9.84MB predicates: casttobigint(d1.d_week_seq) IS NOT NULL runtime filters: RF000 -> d1.d_week_seq row-size=255B cardinality=7.30K Cardinality after this change: SCAN HDFS [tpcds.date_dim d1] HDFS partitions=1/1 files=1 size=9.84MB predicates: casttobigint(d1.d_week_seq) IS NOT NULL runtime filters: RF000 -> d1.d_week_seq row-size=255B cardinality=73.05K Change-Id: Idf82b2de78c6a7051ea036062f177d69e2558940 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java 2 files changed, 8 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16407/1 -- To view, visit http://gerrit.cloudera.org:8080/16407 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Idf82b2de78c6a7051ea036062f177d69e2558940 Gerrit-Change-Number: 16407 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha
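The before/after cardinalities in the IMPALA-10116 message above are consistent with simple arithmetic, sketched here in Python. The function name `is_not_null_cardinality` is hypothetical and this is not the planner's code. The row count of roughly 73,050 is read off the "cardinality=73.05K" plan line, and the assumption that `d_week_seq` has no NULLs is inferred from the "after" estimate equalling that count.

```python
import math

# The "default 10% selectivity" mentioned in the commit message.
DEFAULT_SELECTIVITY = 0.1


def is_not_null_cardinality(num_rows, num_nulls=None):
    """Estimate rows passing `col IS NOT NULL`.

    With no column stats reachable (num_nulls is None), fall back to the
    default selectivity; otherwise use the fraction of non-NULL rows.
    """
    if num_rows == 0:
        return 0.0
    if num_nulls is None:
        return num_rows * DEFAULT_SELECTIVITY
    return num_rows * (1 - num_nulls / num_rows)


rows = 73_050  # approximate, taken from the plan output above
before = is_not_null_cardinality(rows)              # cast not unwrapped
after = is_not_null_cardinality(rows, num_nulls=0)  # stats via the SlotRef
```

Under these assumptions `before` is about 7.3K and `after` about 73.05K, matching the two plan fragments.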
[Impala-ASF-CR] WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16406 ) Change subject: WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7080/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89cee02947b311e7bf9c7274f47dfc7214c1bb65 Gerrit-Change-Number: 16406 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 03 Sep 2020 01:10:08 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16406 ) Change subject: WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16406/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/16406/1/be/src/service/impala-server.cc@701 PS1, Line 701: Status status = GetAllQueryHandles(query_id, &active_query_handle, &original_query_handle, line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/16406/1/tests/shell/util.py File tests/shell/util.py: http://gerrit.cloudera.org:8080/#/c/16406/1/tests/shell/util.py@327 PS1, Line 327: def wait_for_query_state(vector, stmt, state, max_retry=15): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/16406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89cee02947b311e7bf9c7274f47dfc7214c1bb65 Gerrit-Change-Number: 16406 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 03 Sep 2020 00:49:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16406 Change subject: WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries .. WIP IMPALA-9229: impala-shell 'profile' to show original and retried queries * Modifies TGetRuntimeProfileReq and TGetRuntimeProfileResp; adds a new option to TGetRuntimeProfileResp called include_query_attempts * When include_query_attempts = true, the TGetRuntimeProfileResp will include the runtime profiles of all failed query attempts * impala-shell has been modified to dump both the most recent query attempt and the retried query attempt * Support for this has only been added to HS2 in order to keep things simpler; given that Beeswax is being deprecated soon, it is probably worth adding Beeswax support * Most of the code change is in impala-hs2-server and impala-server; I had to re-factor a lot of the code related to the GetRuntimeProfile method Testing: * Added new tests * Almost all core tests are passing Change-Id: I89cee02947b311e7bf9c7274f47dfc7214c1bb65 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M common/thrift/ImpalaService.thrift M shell/impala_client.py M shell/impala_shell.py M tests/custom_cluster/test_shell_interactive.py M tests/shell/test_shell_commandline.py M tests/shell/util.py 13 files changed, 405 insertions(+), 122 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16406/1 -- To view, visit http://gerrit.cloudera.org:8080/16406 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I89cee02947b311e7bf9c7274f47dfc7214c1bb65 Gerrit-Change-Number: 16406 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-8304: Generate JUnitXML if a command run by CMake fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12668 ) Change subject: IMPALA-8304: Generate JUnitXML if a command run by CMake fails .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6393/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/12668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Gerrit-Change-Number: 12668 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 02 Sep 2020 23:22:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 22:57:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. 
Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Reviewed-on: http://gerrit.cloudera.org:8080/16346 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M testdata/bin/compute-table-stats.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 11 files changed, 412 insertions(+), 31 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 16 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
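The range propagation described in the IMPALA-10064 message above can be sketched in Python as follows. This is an illustration of the reasoning only, not the ConstantPredicateHandler implementation; the helper names `propagate_range_to_date` and `prune_partitions` and the sample partition values are hypothetical. It relies on the fact that casting a timestamp to a date is monotonic, so bounds on the timestamp carry over to the date.

```python
from datetime import date, datetime


def propagate_range_to_date(ts_lo, ts_hi):
    """Derive bounds on date_col.

    Given ts_lo <= timestamp_col <= ts_hi and
    date_col = CAST(timestamp_col AS DATE), truncation to a date never
    reorders values, so date(ts_lo) <= date_col <= date(ts_hi).
    """
    return ts_lo.date(), ts_hi.date()


def prune_partitions(partition_dates, lo, hi):
    """Keep only the date partitions that can contain matching rows."""
    return [d for d in partition_dates if lo <= d <= hi]


# BETWEEN '2019-01-01' AND '2020-01-01' on timestamp_col, as in the example.
lo, hi = propagate_range_to_date(datetime(2019, 1, 1), datetime(2020, 1, 1))

# Hypothetical partition values for a table partitioned by date_col.
partitions = [date(2018, 12, 31), date(2019, 6, 1),
              date(2020, 1, 1), date(2020, 1, 2)]
kept = prune_partitions(partitions, lo, hi)
```

Only the derived predicate on the partition column can drive pruning, which is why propagating the timestamp range onto `date_col` matters for a table partitioned by date.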
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. IMPALA-10106: Upgrade DataSketches to version 2.1.0 Upgrade the external DataSketches files for HLL/KLL to version 2.1.0 tests: -Ran the tests from tests/query_test/test_datasketches.py Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Reviewed-on: http://gerrit.cloudera.org:8080/16360 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/thirdparty/datasketches/README.md M be/src/thirdparty/datasketches/kll_quantile_calculator.hpp M be/src/thirdparty/datasketches/kll_quantile_calculator_impl.hpp M be/src/thirdparty/datasketches/kll_sketch_impl.hpp 4 files changed, 106 insertions(+), 120 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 8 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 7 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 22:06:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10124 admission-controller-test fails with no such file or directory error
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16404 ) Change subject: IMPALA-10124 admission-controller-test fails with no such file or directory error .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7079/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16d6cff8fad8d0e93a24ec3fefa9cc1f8c471aad Gerrit-Change-Number: 16404 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 02 Sep 2020 19:42:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10124 admission-controller-test fails with no such file or directory error
Qifan Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16404 Change subject: IMPALA-10124 admission-controller-test fails with no such file or directory error .. IMPALA-10124 admission-controller-test fails with no such file or directory error This work addresses a failure by disabling undefined behavior sanitizer testing for AdmissionControllerTest.TopNQueryCheck test. In the test, std::regex_match() is used to verify the appearance of certain strings and can produce a core with a very long stack trace failing in std::vector::operator[](). Testing: 1. Ran the test in both regular and disabling undefined behavior sanitizer check modes. No core is seen. Change-Id: I16d6cff8fad8d0e93a24ec3fefa9cc1f8c471aad --- M be/src/scheduling/admission-controller-test.cc 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/16404/1 -- To view, visit http://gerrit.cloudera.org:8080/16404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I16d6cff8fad8d0e93a24ec3fefa9cc1f8c471aad Gerrit-Change-Number: 16404 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-4065 Inline comparator calls into TopN::InsertBatch()
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16373 ) Change subject: IMPALA-4065 Inline comparator calls into TopN::InsertBatch() .. Patch Set 9: (24 comments) Had a bunch of style comments, mostly driven by our style guide and existing best practices. I expected to see some changes to the codegen logic or perf numbers to actually replace the indirect comparator calls in the LLVM IR with a direct call to the codegen'd function. Is there going to be a part 2 that does that? http://gerrit.cloudera.org:8080/#/c/16373/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16373/9//COMMIT_MSG@19 PS9, Line 19: Did you do any performance comparison? Would be nice to have a targeted query that stresses the operator. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node-ir.cc File be/src/exec/topn-node-ir.cc: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node-ir.cc@58 PS9, Line 58: priority_queue_.Heapify(); I'm a little confused by this - isn't heapify generally an O(n) operation? http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.h File be/src/exec/topn-node.h: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.h@188 PS9, Line 188: private : nit: bad whitespace change http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.cc File be/src/exec/topn-node.cc: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.cc@128 PS9, Line 128: if (codegen_status.ok()) { I would have expected to see some changes here to replace the comparator call with a codegen'd version. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.cc@273 PS9, Line 273: priority_queue_.Pop(tuple); This interface is an improvement! 
http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/exec/topn-node.cc@312 PS9, Line 312: : capacity_(capacity), priority_queue_(c, capacity) {} Not the biggest deal, but this preallocates the full array whereas previously it relied on the vector to expand. Might be a small difference in memory management for larger values of N. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/CMakeLists.txt File be/src/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/CMakeLists.txt@151 PS9, Line 151: priority-queue-test.cc nit: can you preserve the alphabetic ordering of the files here? http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/CMakeLists.txt@221 PS9, Line 221: ADD_UNIFIED_BE_LSAN_TEST(priority-queue-test "PriorityQueueTest.*") Could preserve order here too http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/comparator-wrapper.h File be/src/util/comparator-wrapper.h: PS9: I think we can maybe just delete this wrapper - it was only required to adapt TupleRowComparator to the C++ STL interface, and I think we can now just call it directly. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/comparator-wrapper.h@19 PS9, Line 19: #ifndef IMPALA_UTIL_COMPARATOR_WRAPPER_H_ Not a big deal, but we prefer #pragma once in new code instead of traditional include guards. 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue-test.cc File be/src/util/priority-queue-test.cc: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue-test.cc@33 PS9, Line 33: testInt nit: TestInt http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h File be/src/util/priority-queue.h: http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h@19 PS9, Line 19: #define IMPALA_UTIL_PRIORITY_QUEUE_H We prefer #pragma once http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h@81 PS9, Line 81: PriorityQueue(ComparatorWrapper c, std::size_t n = 100) Can't we pass in TupleRowComparator directly? ComparatorWrapper was just used to adapt TupleRowComparator to the c++ stl interface but I don't think we need that here. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h@96 PS9, Line 96: inline T& Top() { return elements_[0]; } nit: not a big deal, but I generally recommend omitting inline from functions defined in the class body cause it's already implied. https://en.cppreference.com/w/cpp/language/inline "A function defined entirely inside a class/struct/union definition, whether it's a member function or a non-member friend function, is implicitly an inline function. " http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h@105 PS9, Line 105: capacity_ *= 2; I'd prefer to use a vector<> so that we don't need to reimplement the doubling behaviour. http://gerrit.cloudera.org:8080/#/c/16373/9/be/src/util/priority-queue.h@122 PS9, Line 122: void Pop(T& v) { Why not just return the value? It's just a pointer so copy vs move etc doesn't matter.
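The review question above about `Heapify()` being O(n) can be illustrated with a general top-N sketch in Python. This uses the standard `heapq` module and is not the patch's PriorityQueue; the function name `top_n_largest` is hypothetical. When the heap is full, replacing the root and sifting it down costs O(log n) per row, whereas rebuilding the heap with a full heapify would cost O(n) per row.

```python
import heapq


def top_n_largest(values, n):
    """Return the n largest values in descending order.

    A min-heap of at most n elements is kept, so heap[0] is always the
    smallest value currently retained and the first to be evicted.
    """
    if n <= 0:
        return []
    heap = []
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, v)      # O(log n) while filling
        elif v > heap[0]:
            # Pop the smallest and push v in one O(log n) sift-down,
            # instead of overwriting the root and re-heapifying in O(n).
            heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True)


result = top_n_largest([5, 1, 9, 3, 7, 8], 3)
```

The list-backed heap also grows on demand, which is similar in spirit to the suggestion of using a vector rather than reimplementing capacity doubling.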
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 13: (2 comments) LGTM pending these 2 comments http://gerrit.cloudera.org:8080/#/c/16314/13/be/src/exprs/scalar-expr-evaluator-ir.cc File be/src/exprs/scalar-expr-evaluator-ir.cc: PS13: This file doesn't seem to be referenced anywhere. Did you mean to add it? http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test File testdata/workloads/functional-query/queries/QueryTest/java-udf.test: http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test@321 PS13, Line 321: ScaparExprEvaluator ScalarExprEvaluator -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:52:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6392/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 15 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:44:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:43:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. Patch Set 19: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 19 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 02 Sep 2020 17:36:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7078/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:25:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 14:

> Patch Set 13:
>
> > Patch Set 12:
> >
> > > Patch Set 12: Verified-1
> > >
> > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/
> >
> > 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation:
> > (this contains new tests added for this CR): seems related to ordering of
> > the predicates and stats difference. The test passes on my dev machine.
> >
> > Actual does not match expected result:
> > PLAN-ROOT SINK
> > |
> > 00:SCAN HDFS [functional.alltypes_date_partition]
> >    partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE
> > '2009-02-01'
> >    HDFS partitions=32/55 files=32 size=15.99KB
> >    predicates: functional.alltypes_date_partition.int_col < 100,
> > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01
> > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP
> > '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5,
> > 10), date_col = CAST(timestamp_col AS DATE)
> > ^^^
> >    row-size=64B cardinality=26
> >
> > Expected:
> > PLAN-ROOT SINK
> > |
> > 00:SCAN HDFS [functional.alltypes_date_partition]
> >    partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE
> > '2009-02-01'
> >    HDFS partitions=32/55 files=32 size=15.99KB
> >    predicates: functional.alltypes_date_partition.bigint_col IN (5, 10),
> > functional.alltypes_date_partition.int_col < 100,
> > functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01
> > 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP
> > '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
> >    row-size=65B cardinality=13
> >
> > There's also 1 failure in
> > org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I
> > didn't quite understand yet.
>
> 1st failure was due to stats not present. On my local machine I think I had
> run the compute stats manually. I have added the alltypes_date_partition to
> the compute-table-stats.sh script. Second failure needed a minor update to
> the FrontendTest.java.

There's some discrepancy between the Jenkins run and my local one: it seems the table 'functional.alltypes_datasource' was never loaded on my local machine and I am not sure how to load it. In any case, I have made an update to the FrontendTest.java in Patch Set 14 that should hopefully work now.

-- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 17:13:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#14). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates ..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example:

Before constant propagation:
  WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
  WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns.

Testing:
- Added new range constant propagation tests to PlannerTest.
- Added e2e test for range constant propagation based on a newly added date partitioned table.
- Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
11 files changed, 412 insertions(+), 31 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/14

-- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 14 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
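The range propagation described in the commit message above can be illustrated with a small sketch. This is not the ConstantPredicateHandler code from the patch: `Pred` and `propagate_ranges` are hypothetical names, predicates are modeled as plain tuples, and the rule that strict bounds are relaxed to inclusive ones after truncating to a date is an assumption made here for soundness, not something the message states.

```python
from datetime import date, datetime
from typing import List, NamedTuple


class Pred(NamedTuple):
    """A simple range predicate: column <op> constant (hypothetical model)."""
    column: str
    op: str      # one of "<", "<=", ">", ">="
    value: date  # datetime is a subclass of date


def propagate_ranges(preds: List[Pred], date_col: str, ts_col: str) -> List[Pred]:
    """Assuming date_col = CAST(ts_col AS DATE) holds, derive date_col ranges
    from the ts_col ranges and return them ahead of the original predicates."""
    derived = []
    for p in preds:
        if p.column != ts_col or p.op not in ("<", "<=", ">", ">="):
            continue
        # Truncating a timestamp to a date is monotonic, so a bound on the
        # timestamp implies an inclusive bound on the date (assumption: strict
        # bounds are relaxed to inclusive ones).
        inclusive_op = "<=" if p.op in ("<", "<=") else ">="
        bound = p.value.date() if isinstance(p.value, datetime) else p.value
        derived.append(Pred(date_col, inclusive_op, bound))
    return derived + preds


preds = [
    Pred("timestamp_col", ">=", datetime(2019, 1, 1)),
    Pred("timestamp_col", "<=", datetime(2020, 1, 1)),
]
result = propagate_ranges(preds, "date_col", "timestamp_col")
for p in result:
    print(p.column, p.op, p.value)
```

The derived `date_col` predicates are the ones a planner could use for partition pruning, since the table in the example is partitioned by the date column.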
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7077/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 16:54:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6391/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 7 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 16:52:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 7 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 16:52:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9636: Don't run retried query on the blacklisted nodes
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16369 ) Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes .. Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16369/2/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16369/2/be/src/scheduling/admission-controller.cc@1218
PS2, Line 1218: const ExecutorGroup* schedule_executor_group =
:     ExecutorGroup::GetOrCreateExecutorGroup(
:         executor_group, request.blacklisted_backend_ids);
> RemoveExecutors() use ip_address to find executor with map. For the new fun

Perhaps a simpler way would be for blacklisted_backend_ids to be an unordered_set of 'const NetworkAddressPB&' instead. Then you could use a combination of ExecutorGroup::LookUpBackendDesc and ExecutorGroup::RemoveExecutor to remove executors. This is probably a faster approach as well, since it's all hash-based lookups and avoids iterating over the full list of executors (it looks like GetOrCreateExecutorGroup already iterates over all executors twice). The other advantage is that it keeps the code simpler, since we can now rely on existing methods.

http://gerrit.cloudera.org:8080/#/c/16369/2/be/src/scheduling/admission-controller.cc@1232
PS2, Line 1232: output_schedules->emplace_back(std::move(group_state), *executor_group);

Is there any concern with using executor_group here vs. schedule_executor_group?
-- To view, visit http://gerrit.cloudera.org:8080/16369 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105 Gerrit-Change-Number: 16369 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 02 Sep 2020 16:39:11 + Gerrit-HasComments: Yes
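The hash-based removal suggested in the review comment above can be sketched roughly as follows. This is a Python illustration only, not Impala's C++ ExecutorGroup API: `Address` stands in for NetworkAddressPB, and `remove_blacklisted` is a hypothetical helper. The copy of the map is made here only so the original group is left untouched.

```python
from typing import Dict, Set, Tuple

# (hostname, port): a stand-in for NetworkAddressPB (assumption).
Address = Tuple[str, int]


def remove_blacklisted(executors: Dict[Address, str],
                       blacklisted: Set[Address]) -> Dict[Address, str]:
    """Return a copy of the executor map without the blacklisted addresses.

    Each removal is a single hash lookup keyed by address, so the removal
    step scales with the size of the blacklist rather than with a scan that
    compares every executor against every blacklisted entry.
    """
    remaining = dict(executors)  # copy so the original group is not mutated
    for addr in blacklisted:
        remaining.pop(addr, None)  # unknown addresses are ignored
    return remaining


executors = {
    ("host1", 27000): "backend-1",
    ("host2", 27000): "backend-2",
    ("host3", 27000): "backend-3",
}
remaining = remove_blacklisted(executors, {("host2", 27000), ("host9", 27000)})
print(sorted(remaining.values()))
```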
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 13:

> Patch Set 12:
>
> > Patch Set 12: Verified-1
> >
> > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/
>
> 1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation:
> (this contains new tests added for this CR): seems related to ordering of
> the predicates and stats difference. The test passes on my dev machine.
>
> Actual does not match expected result:
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [functional.alltypes_date_partition]
>    partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE
> '2009-02-01'
>    HDFS partitions=32/55 files=32 size=15.99KB
>    predicates: functional.alltypes_date_partition.int_col < 100,
> functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01
> 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP
> '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5,
> 10), date_col = CAST(timestamp_col AS DATE)
> ^^^
>    row-size=64B cardinality=26
>
> Expected:
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [functional.alltypes_date_partition]
>    partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE
> '2009-02-01'
>    HDFS partitions=32/55 files=32 size=15.99KB
>    predicates: functional.alltypes_date_partition.bigint_col IN (5, 10),
> functional.alltypes_date_partition.int_col < 100,
> functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01
> 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP
> '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
>    row-size=65B cardinality=13
>
> There's also 1 failure in
> org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't
> quite understand yet.

1st failure was due to stats not present. On my local machine I think I had run the compute stats manually. I have added the alltypes_date_partition to the compute-table-stats.sh script. Second failure needed a minor update to the FrontendTest.java.

-- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 16:38:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#13). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates ..

IMPALA-10064: Support constant propagation for eligible range predicates

This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example:

Before constant propagation:
  WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
  WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns.

Testing:
- Added new range constant propagation tests to PlannerTest.
- Added e2e test for range constant propagation based on a newly added date partitioned table.
- Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
M tests/query_test/test_queries.py
11 files changed, 410 insertions(+), 31 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/13

-- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 13 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9990: Support SET OWNER for Kudu tables
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/16273 ) Change subject: IMPALA-9990: Support SET OWNER for Kudu tables .. Patch Set 1: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/16273/1/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: http://gerrit.cloudera.org:8080/#/c/16273/1/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@51 PS1, Line 51: alter table simple set owner user non_owner > Does describe table list the owner? Thanks, Fang-Yu! http://gerrit.cloudera.org:8080/#/c/16273/1/testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test: http://gerrit.cloudera.org:8080/#/c/16273/1/testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test@53 PS1, Line 53: 'Owner has been altere.' > Thanks Attila! I see, thank you, Fang-Yu! -- To view, visit http://gerrit.cloudera.org:8080/16273 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29d641efc8db314964bc5ee9828a86d4a44ae95c Gerrit-Change-Number: 16273 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 02 Sep 2020 15:38:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12:

> Patch Set 12: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/

1 failure in org.apache.impala.planner.PlannerTest.testConstantPropagation: (this contains new tests added for this CR): seems related to ordering of the predicates and stats difference. The test passes on my dev machine.

Actual does not match expected result:
PLAN-ROOT SINK
|
00:SCAN HDFS [functional.alltypes_date_partition]
   partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE '2009-02-01'
   HDFS partitions=32/55 files=32 size=15.99KB
   predicates: functional.alltypes_date_partition.int_col < 100, functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', functional.alltypes_date_partition.bigint_col IN (5, 10), date_col = CAST(timestamp_col AS DATE)
   ^^^
   row-size=64B cardinality=26

Expected:
PLAN-ROOT SINK
|
00:SCAN HDFS [functional.alltypes_date_partition]
   partition predicates: date_col >= DATE '2009-01-01' AND date_col <= DATE '2009-02-01'
   HDFS partitions=32/55 files=32 size=15.99KB
   predicates: functional.alltypes_date_partition.bigint_col IN (5, 10), functional.alltypes_date_partition.int_col < 100, functional.alltypes_date_partition.timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', functional.alltypes_date_partition.timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE)
   row-size=65B cardinality=13

There's also 1 failure in org.apache.impala.service.FrontendTest.TestGetTablesTypeTable which I didn't quite understand yet.
-- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 15:33:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6388/ -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 6 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 15:03:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16396 ) Change subject: IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE .. Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16396/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16396/3//COMMIT_MSG@10
PS3, Line 10: Newer Hive versions
> Can you mention a Hive ticket with more information here?

Sorry, the patch is merged, but the Hive Jira is HIVE-24021.

-- To view, visit http://gerrit.cloudera.org:8080/16396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Gerrit-Change-Number: 16396 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 14:11:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 13: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295
PS12, Line 295: llvm::Value* const child_is_null = child_wrapped.GetIsNull("child_is_null");
: llvm::Value* const child_is_null_i8 = builder->CreateZExtOrTrunc(
:     child_is_null, codegen->i8_type(), "child_is_null_i8");
: builder->CreateCall(set_input_null_buff_elem_fn,
:     {jni_ctx, codegen->GetI32Constant(i), child_is_null_i8});
: builder->CreateCondBr(child_is_null, next_eval_child_block, child_not_null_block);
:
: // Child is not null.
: builder->SetInsertPoint(child_not_null_block);
: llvm::Value* const input_ptr = builder->CreateCall(get_input_val_buff_
> Done

Hmm, actually I didn't realize that the branch is still needed, so I am not sure which one is better. Feel free to keep whatever you prefer.

-- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 13:39:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16396 ) Change subject: IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE .. IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE When Impala TRUNCATEs an ACID table, it creates a new base directory with the hidden file "_empty" in it. Newer Hive versions ignore files starting with underscore, therefore they ignore the whole base directory. To resolve this issue we can simply rename the empty file to "empty". Testing: * update acid-truncate.test accordingly Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Reviewed-on: http://gerrit.cloudera.org:8080/16396 Reviewed-by: Csaba Ringhofer Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test 2 files changed, 10 insertions(+), 2 deletions(-) Approvals: Csaba Ringhofer: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Gerrit-Change-Number: 16396 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
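The effect described in the commit message above can be shown with a minimal sketch. It assumes only what the message states, namely that a reader skips files whose names start with an underscore; `is_hidden` and `visible_files` are hypothetical helpers, not Hive or Impala functions.

```python
from typing import List


def is_hidden(name: str) -> bool:
    """Assumed convention from the commit message: a leading underscore
    marks a file as hidden for the reader."""
    return name.startswith("_")


def visible_files(names: List[str]) -> List[str]:
    """Return the file names a reader applying the filter would still see."""
    return [n for n in names if not is_hidden(n)]


# Contents of the new base directory before and after the fix.
before_fix = visible_files(["_empty"])
after_fix = visible_files(["empty"])
print(before_fix, after_fix)
```

With "_empty" the directory appears to contain no files at all, which is consistent with the whole base directory being ignored; with "empty" one file remains visible.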
[Impala-ASF-CR] IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16396 ) Change subject: IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Gerrit-Change-Number: 16396 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 13:29:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6387/ -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 13:14:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 28: I did a dockerised build to get some insights why test_create_iceberg_tables fails in dockerised tests. The test fails at "DESCRIBE iceberg_test_external_empty_column;" To get more insight I switched to "DESCRIBE formatted". You can check the output at https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3091/testReport/query_test.test_iceberg/TestCreatingIcebergTable/test_create_iceberg_tables_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/ I don't know yet why it couldn't figure out the table schema. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 28 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 02 Sep 2020 12:47:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7076/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 19 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 02 Sep 2020 12:39:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 12:34:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. Patch Set 19: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6389/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 19 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 02 Sep 2020 12:19:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Xianqing He has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. IMPALA-5022 part 1/2: Outer join simplification Outer joins in SQL can return rows with certain columns filled with NULLs when a match can not be found. However, such rows can be rejected by null-rejecting predicates. The conditions in a null-rejecting predicate that are always evaluated to FALSE for NULLs are referred to as null-filtering conditions. In general, an outer join can be converted to an inner join if there exist null-filtering conditions on the inner tables. In a left outer join, the right table is the inner table, while in a right outer join it is the left table. In a full outer join, both tables are inner tables. For example, 1. A LEFT JOIN B ON A.id = B.id WHERE B.v > 10 = A INNER JOIN B ON A.id = B.id WHERE B.v > 10 2. A RIGHT JOIN B ON A.id = B.id WHERE A.v > 10 = A INNER JOIN B ON A.id = B.id WHERE A.v > 10 3. A FULL JOIN B ON A.id = B.id WHERE A.v > 10 = A LEFT JOIN B ON A.id = B.id WHERE A.v > 10 4. A FULL JOIN B ON A.id = B.id WHERE B.v > 10 = A RIGHT JOIN B ON A.id = B.id WHERE B.v > 10 5. A FULL JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10 = A INNER JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10 6. A LEFT JOIN B ON A.id = B.id INNER JOIN C ON B.id = C.id = A INNER JOIN B ON A.id = B.id INNER JOIN C ON B.id = C.id 7. A RIGHT JOIN B ON A.id = B.id INNER JOIN C ON A.id = C.id = A INNER JOIN B ON A.id = B.id INNER JOIN C ON A.id = C.id 8. A FULL JOIN B ON A.id = B.id INNER JOIN C ON A.id = C.id = A LEFT JOIN B ON A.id = B.id INNER JOIN C ON A.id = C.id 9. A FULL JOIN B ON A.id = B.id INNER JOIN C ON B.id = C.id = A RIGHT JOIN B ON A.id = B.id INNER JOIN C ON B.id = C.id 10. 
A FULL JOIN B ON A.id = B.id INNER JOIN C ON A.id + B.id = C.id = A INNER JOIN B ON A.id = B.id INNER JOIN C ON A.id + B.id = C.id In this commit, we have supported most of the cases that can convert an outer join to an inner join, except for converting the embedding inline view outer join by the join condition like "SELECT * FROM T1 JOIN (SELECT T3.A A FROM T2 LEFT JOIN T3 ON T3.B=T2.B) T4 ON T4.A=T1.A". We will support it in part 2. Tests: * Update the baseline plan Tests * Add new plan tests outer-to-inner-joins.test * Add new query tests to verify the correctness on transformation * Ran the full set of verifications in Impala Public Jenkins Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins-hdfs-num-rows-est-enabled.test M 
testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test A testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q93.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-p
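The first equivalence in the commit message above (a LEFT JOIN whose WHERE clause rejects NULLs from the inner table returns the same rows as an INNER JOIN) can be checked on a toy dataset. The following is only an illustrative sketch using Python's built-in sqlite3 module rather than Impala; the tables `a` and `b` and their contents are made up for the example.

```python
import sqlite3

# Toy tables (hypothetical data, not from the Impala test suite).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, v INTEGER);
    CREATE TABLE b (id INTEGER, v INTEGER);
    INSERT INTO a VALUES (1, 5), (2, 20), (3, 30);
    INSERT INTO b VALUES (1, 50), (2, 7);
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Without a filter, the outer join pads the unmatched row (id = 3) with NULL.
unfiltered_rows = run(
    "SELECT a.id, a.v, b.v FROM a LEFT JOIN b ON a.id = b.id ORDER BY a.id")

# b.v > 10 is null-rejecting: it is never true when b.v is NULL,
# so the NULL-padded row cannot survive the WHERE clause.
outer_rows = run(
    "SELECT a.id, a.v, b.v FROM a LEFT JOIN b ON a.id = b.id "
    "WHERE b.v > 10 ORDER BY a.id")
inner_rows = run(
    "SELECT a.id, a.v, b.v FROM a INNER JOIN b ON a.id = b.id "
    "WHERE b.v > 10 ORDER BY a.id")

print(unfiltered_rows)
print(outer_rows == inner_rows)
```

Only the LEFT JOIN case is shown because RIGHT and FULL joins are available only in newer SQLite releases; the same reasoning applies to the other examples with the roles of the tables swapped.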
[Impala-ASF-CR] IMPALA-10108: Implement ds_kll_stringify function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16370 ) Change subject: IMPALA-10108: Implement ds_kll_stringify function .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I97f654a4838bf91e3e0bed6a00d78b2c7aa96f75 Gerrit-Change-Number: 16370 Gerrit-PatchSet: 8 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 10:49:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10108: Implement ds_kll_stringify function
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16370 ) Change subject: IMPALA-10108: Implement ds_kll_stringify function ..

IMPALA-10108: Implement ds_kll_stringify function

This function receives a string that is a serialized Apache DataSketches KLL sketch and returns its stringified format. The stringified format looks like the following and contains this data:

select ds_kll_stringify(ds_kll_sketch(float_col)) from functional_parquet.alltypestiny;
+--------------------------------------------+
| ds_kll_stringify(ds_kll_sketch(float_col)) |
+--------------------------------------------+
| ### KLL sketch summary:                    |
|    K              : 200                    |
|    min K          : 200                    |
|    M              : 8                      |
|    N              : 8                      |
|    Epsilon        : 1.33%                  |
|    Epsilon PMF    : 1.65%                  |
|    Empty          : false                  |
|    Estimation mode: false                  |
|    Levels         : 1                      |
|    Sorted         : false                  |
|    Capacity items : 200                    |
|    Retained items : 8                      |
|    Storage bytes  : 64                     |
|    Min value      : 0                      |
|    Max value      : 1.1                    |
| ### End sketch summary                     |
|                                            |
+--------------------------------------------+

Change-Id: I97f654a4838bf91e3e0bed6a00d78b2c7aa96f75 Reviewed-on: http://gerrit.cloudera.org:8080/16370 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 4 files changed, 59 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I97f654a4838bf91e3e0bed6a00d78b2c7aa96f75 Gerrit-Change-Number: 16370 Gerrit-PatchSet: 9 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7075/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 10:28:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10090 Pull newest code of native-toolchain before build it
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16402 ) Change subject: IMPALA-10090 Pull newest code of native-toolchain before build it .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7074/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2da3ffce7abb88190be0a5ea0e2cf603f98ee15e Gerrit-Change-Number: 16402 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 10:20:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc File be/src/exprs/hive-udf-call.cc: http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266 PS10, Line 266: > I understand that you didn't add any new code to test, I'm saying you need I accept your point. I have added tests in testdata/workloads/functional-query/queries/QueryTest/java-udf.test with comments explaining them. Do you think they are ok? http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc File be/src/exprs/hive-udf-call.cc: http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295 PS12, Line 295: llvm::Value* const child_is_null = child_wrapped.GetIsNull("child_is_null"); : llvm::Value* const child_is_null_i8 = builder->CreateZExtOrTrunc( : child_is_null, codegen->i8_type(), "child_is_null_i8"); : builder->CreateCall(set_input_null_buff_elem_fn, : {jni_ctx, codegen->GetI32Constant(i), child_is_null_i8}); : builder->CreateCondBr(child_is_null, next_eval_child_block, child_not_null_block); : : // Child is not null. 
: builder->SetInsertPoint(child_not_null_block); : llvm::Value* const input_ptr = builder->CreateCall(get_input_val_buff_ > optional: couldn't it be faster if the value of child_wrapped.GetIsNull("ch Done http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@311 PS12, Line 311: > nit: extra ; Done http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@490 PS12, Line 490: > Do we need to do this here, can't CallJavaAndStoreResult handle it, as jni_ Done http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@508 PS12, Line 508: : BooleanVal HiveUdfCall::GetBooleanValInterpreted( : ScalarExprEvaluator* eval, const TupleRow* row) const { : DCHECK_EQ(type_.type, TYPE_BOOLEAN); > Can't we move these as members to JniContext instead of passing them during Done -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 10:06:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Daniel Becker has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. IMPALA-7658: Proper codegen for HiveUdfCall Implementing codegen for HiveUdfCall. Testing: Verified that java udf tests pass locally. Benchmarks: Used a UDF from TestUdf.java that adds three integers: create function tpch15_parquet.sum3(int, int, int) returns int location '/test-warehouse/impala-hive-udfs.jar' symbol='org.apache.impala.TestUdf'; Used the following query on the master branch and the change's branch: set num_nodes=1; set mt_dop=1; select min(tpch15_parquet.sum3(cast(l_orderkey as int), cast(l_partkey as int), cast(l_suppkey as int))) from tpch15_parquet.lineitem; Results averaged over 100 runs after warmup: Master: 20.6346s, stddev: 0.3132411856765332 This change: 19.0256s, stddev: 0.42039019873436 This is a ~7.8% improvement. Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 --- M be/src/codegen/codegen-util.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/hive-udf-call-ir.cc M be/src/exprs/hive-udf-call.cc M be/src/exprs/hive-udf-call.h A be/src/exprs/scalar-expr-evaluator-ir.cc M testdata/workloads/functional-query/queries/QueryTest/java-udf.test 11 files changed, 587 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/13 -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
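The "~7.8% improvement" figure in the benchmark above follows from the two reported mean runtimes. A minimal check of that arithmetic, using only the numbers quoted in the message (the variable names are illustrative):

```python
# Mean runtimes in seconds, as reported in the commit message above.
master_s = 20.6346
change_s = 19.0256

# Relative reduction in runtime compared with master, as a percentage.
improvement_pct = (master_s - change_s) / master_s * 100

print(f"{improvement_pct:.1f}%")
```

The difference between the means (about 1.6 s) is several times larger than either reported standard deviation, although the message does not state whether any significance test was run.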
[Impala-ASF-CR] IMPALA-10090 Pull newest code of native-toolchain before build it
huangtianhua...@gmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16402 ) Change subject: IMPALA-10090 Pull newest code of native-toolchain before build it .. IMPALA-10090 Pull newest code of native-toolchain before build it If native-toolchain already exists, we should pull the newest code before building it. Change-Id: I2da3ffce7abb88190be0a5ea0e2cf603f98ee15e --- M bin/bootstrap_system.sh 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/16402/1 -- To view, visit http://gerrit.cloudera.org:8080/16402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2da3ffce7abb88190be0a5ea0e2cf603f98ee15e Gerrit-Change-Number: 16402 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 6 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:47:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6388/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 6 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:47:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16396 ) Change subject: IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE .. Patch Set 3: Code-Review+2 (1 comment) lgtm, if this is what Hive wants http://gerrit.cloudera.org:8080/#/c/16396/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16396/3//COMMIT_MSG@10 PS3, Line 10: Newer Hive versions Can you mention a Hive ticket with more information here? -- To view, visit http://gerrit.cloudera.org:8080/16396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Gerrit-Change-Number: 16396 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:43:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10106: Upgrade DataSketches to version 2.1.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16360 ) Change subject: IMPALA-10106: Upgrade DataSketches to version 2.1.0 .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6384/ -- To view, visit http://gerrit.cloudera.org:8080/16360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4faa31c0b628a62c7e56a6c4b9549d0aaa8a02ff Gerrit-Change-Number: 16360 Gerrit-PatchSet: 5 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:43:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7073/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:18:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:10:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6387/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:10:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 09:09:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Adam Tamas has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. ..

IMPALA-10107:(1/3)Implement ds_hll_stringify function.

This function receives a string that is a serialized Apache DataSketches HLL sketch and returns its stringified format. The stringified format looks like the following and contains this data:

select ds_hll_stringify(ds_hll_sketch(float_col)) from functional_parquet.alltypestiny;
+--------------------------------------------+
| ds_hll_stringify(ds_hll_sketch(float_col)) |
+--------------------------------------------+
| ### HLL sketch summary:                    |
|   Log Config K   : 12                      |
|   Hll Target     : HLL_4                   |
|   Current Mode   : LIST                    |
|   LB             : 2                       |
|   Estimate       : 2                       |
|   UB             : 2.0001                  |
|   OutOfOrder flag: false                   |
|   Coupon count   : 2                       |
| ### End HLL sketch summary                 |
|                                            |
+--------------------------------------------+

Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 --- M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test 4 files changed, 59 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16382/2 -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h@37 PS1, Line 37: > nit: leave an empty line before starting the comment. Done http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h@39 PS1, Line 39: then the query fai > DataSketches HLL sketch Done -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 08:56:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10107:(1/3)Implement ds_hll_stringify function.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16382 ) Change subject: IMPALA-10107:(1/3)Implement ds_hll_stringify function. .. Patch Set 1: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h@37 PS1, Line 37: /// 'serialized_sketch' is expected as a serialized Apache DataSketches HLL sketch. If nit: leave an empty line before starting the comment. http://gerrit.cloudera.org:8080/#/c/16382/1/be/src/exprs/datasketches-functions.h@39 PS1, Line 39: DataSketches sketch DataSketches HLL sketch -- To view, visit http://gerrit.cloudera.org:8080/16382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0 Gerrit-Change-Number: 16382 Gerrit-PatchSet: 1 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 08:44:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16396 ) Change subject: IMPALA-10071: Impala shouldn't create filename starting with underscore during ACID TRUNCATE .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6386/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia0557b9944624bc123c540752bbe3877312a7ac9 Gerrit-Change-Number: 16396 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 02 Sep 2020 08:18:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 28:

Yeah, I was thinking about adding

  @classmethod
  def add_test_dimensions(cls):
    super(TestCreatingIcebergTable, cls).add_test_dimensions()
    cls.ImpalaTestMatrix.add_constraint(
        lambda v: v.get_value('table_format').file_format == 'parquet')

to class TestCreatingIcebergTable. Since Parquet is effectively hard-coded into the tests, there is no additional value in running them with the other file formats in the file format dimension.

Regarding the build error, it was caused by IMPALA-9923, which is an unrelated issue.

-- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 28 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 02 Sep 2020 08:15:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8304: Generate JUnitXML if a command run by CMake fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12668 ) Change subject: IMPALA-8304: Generate JUnitXML if a command run by CMake fails .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6382/ -- To view, visit http://gerrit.cloudera.org:8080/12668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Gerrit-Change-Number: 12668 Gerrit-PatchSet: 6 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 02 Sep 2020 08:12:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16266 ) Change subject: IMPALA-5022 part 1/2: Outer join simplification .. Patch Set 18: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6381/ -- To view, visit http://gerrit.cloudera.org:8080/16266 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e Gerrit-Change-Number: 16266 Gerrit-PatchSet: 18 Gerrit-Owner: Xianqing He Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Xianqing He Gerrit-Comment-Date: Wed, 02 Sep 2020 07:47:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6385/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 12 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:19:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: > Patch Set 10: > > Was able to find some time to look for local joins in TPCDS. There are not > many at all. > > query2.sql:WHERE d_week_seq1 = d_week_seq2 - 53 > query59.sql: AND d_week_seq1 = d_week_seq2 - 52 > query59.sql:WHERE s_store_id1 = s_store_id2 > > Sounds like they can be used for the min/max filtering at least. Thanks Qifan. Looking at query2, the d_week_seq1, d_week_seq2 come from different derived tables 'y' and 'z' even though the underlying base table is the same. From the planner perspective, they would be treated as a regular join. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:16:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7072/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:13:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16346 ) Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. Patch Set 11: (4 comments) > Patch Set 10: > > > Patch Set 10: > > > > Change looks good to me. I should wait for the e2e test you mentioned right? > > Thanks for the review. Yes, I plan to add at least 1 e2e test with the > modified dataset. I hope to do it later today after some other ongoing work. Made these changes. http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@95 PS10, Line 95: public static final com.google.common.base.Predicate > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16346/10/fe/src/main/java/org/apache/impala/analysis/Expr.java@1234 PS10, Line 1234: BitSet changed = propagateConstants(tmpConjuncts, candidates, keepConjuncts, > line too long (95 > 90) Done http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419 PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2009-02-01 00:00:00', timestamp_col >= TIMESTAMP '2009-01-01 00:00:00', date_col = CAST(timestamp_col AS DATE) > Made the code change to preserve the original conjunct date_col = cast(tim I changed the alltypes_date_partition table to have rows with odd values of 'id' to have matching timestamp_col and 
date_col values and even values of 'id' to have non-matching values (I added an INTERVAL to the timestamp). The same e2e test from the previous patch set is used. http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test File testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test: http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4 PS9, Line 4: alltypes_date_part > I would be ok with running with other data sets but I had some struggles in Removed the functional_parquet prefix. Also, since neither these tests nor the planner tests need Parquet data, I modified the data generation script to generate only the text version of alltypes_date_partition. I also reduced the size of the table so that it does not run into the default maximum partitions limit. -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 02 Sep 2020 07:09:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates
Hello Qifan Chen, Shant Hovsepian, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16346 to look at the new patch set (#11). Change subject: IMPALA-10064: Support constant propagation for eligible range predicates .. IMPALA-10064: Support constant propagation for eligible range predicates This patch adds support for constant propagation of range predicates involving date and timestamp constants. Previously, only equality predicates were considered for propagation. The new type of propagation is shown by the following example: Before constant propagation: WHERE date_col = CAST(timestamp_col as DATE) AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01' After constant propagation: WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01' AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01' AND date_col = CAST(timestamp_col as DATE) As a consequence, since Impala supports table partitioning by date columns but not timestamp columns, the above propagation enables partition pruning based on timestamp ranges. Existing code for equality based constant propagation was refactored and consolidated into a new class which handles both equality and range based constant propagation. Range based propagation is only applied to date and timestamp columns. Testing: - Added new range constant propagation tests to PlannerTest. - Added e2e test for range constant propagation based on a newly added date partitioned table. - Ran precommit tests. 
Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b --- M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java A fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test A testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test M tests/query_test/test_queries.py 9 files changed, 408 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16346/11 -- To view, visit http://gerrit.cloudera.org:8080/16346 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b Gerrit-Change-Number: 16346 Gerrit-PatchSet: 11 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
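The range propagation described in the commit message above can be illustrated with a small standalone sketch. This is not the code in ConstantPredicateHandler.java; the names RangePredicate, DateBound, and propagate_ranges are hypothetical and exist only for this illustration. It assumes the conjunct date_col = CAST(timestamp_col AS DATE) is present and relies on the fact that casting a timestamp to a date is monotonically non-decreasing, so each bound on the timestamp column implies a non-strict bound on the date column.

```python
from datetime import date, datetime
from typing import List, NamedTuple


class RangePredicate(NamedTuple):
    """Hypothetical representation of `column op TIMESTAMP value`."""
    column: str
    op: str  # one of "<", "<=", ">", ">="
    value: datetime


class DateBound(NamedTuple):
    """Hypothetical derived predicate `column op DATE value`."""
    column: str
    op: str  # "<=" or ">="
    value: date


def propagate_ranges(ts_col: str, date_col: str,
                     preds: List[RangePredicate]) -> List[DateBound]:
    """Derive bounds on date_col from range predicates on ts_col.

    Assumes date_col = CAST(ts_col AS DATE) holds. Because the cast never
    decreases as the timestamp increases, ts <= T (or ts < T) implies
    date_col <= DATE(T), and ts >= T (or ts > T) implies date_col >= DATE(T).
    Strict operators are relaxed to non-strict ones to stay safe.
    """
    derived = []
    for p in preds:
        if p.column != ts_col:
            continue  # only predicates on the timestamp column are propagated
        if p.op in ("<", "<="):
            derived.append(DateBound(date_col, "<=", p.value.date()))
        elif p.op in (">", ">="):
            derived.append(DateBound(date_col, ">=", p.value.date()))
    return derived


# Mirrors the BETWEEN '2019-01-01' AND '2020-01-01' example from the commit message.
preds = [
    RangePredicate("timestamp_col", ">=", datetime(2019, 1, 1)),
    RangePredicate("timestamp_col", "<=", datetime(2020, 1, 1)),
]
derived = propagate_ranges("timestamp_col", "date_col", preds)
```

In this sketch the original timestamp predicates and the equality conjunct are left untouched and the derived date bounds are added alongside them, which matches the "after constant propagation" form shown in the commit message. The derived bounds on date_col are what a planner could then use to prune date partitions.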