[Impala-ASF-CR] IMPALA-13505: Fix NPE in Calcite Planner
Hello Riza Suminto, Joe McDonnell, Steve Carlin, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22033 to look at the new patch set (#2). Change subject: IMPALA-13505: Fix NPE in Calcite Planner .. IMPALA-13505: Fix NPE in Calcite Planner Fixes the NullPointerException occurring when using the Calcite planner with test_tpcds_queries.py::TestTpcdsDecimalV2Query::test_tpcds_q8. The NPE was thrown from the Planner where it generates the list of columns in the query for use in the profile and workload management. Testing was accomplished by manually running the impacted the test and with a new custom cluster test that replicates the failing test. Change-Id: I4d282120e596fd39a569d1ce9b25024f4f174dd0 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M tests/custom_cluster/test_workload_mgmt_init.py M tests/custom_cluster/test_workload_mgmt_sql_details.py 3 files changed, 40 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/22033/2 -- To view, visit http://gerrit.cloudera.org:8080/22033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4d282120e596fd39a569d1ce9b25024f4f174dd0 Gerrit-Change-Number: 22033 Gerrit-PatchSet: 2 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Steve Carlin
[Impala-ASF-CR] IMPALA-13505: Fix NPE in Calcite Planner
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22033 ) Change subject: IMPALA-13505: Fix NPE in Calcite Planner .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/22033/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/22033/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4698 PS1, Line 4698: return; // Note, since this statement is within a lambda function, this return > I'd like to warn on these; I think they're issues in the Calcite parser we' The Calcite planner produces this join node: 05:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: substring(tpcds.customer_address.ca_zip, 1, 5) = EXPR$0 | fk/pk conjuncts: assumed fk/pk | runtime filters: RF002[bloom] <- EXPR$0 Since EXPR$0 is not a named column, it has a null resolved path and can be skipped. EXPR$0 refers to the list of zip codes use in the `WHERE SUBSTRING(ca_zip, 1, 5) IN` clause. This hash join node is produced by Calcite when the `IN` list grows beyond a certain point. http://gerrit.cloudera.org:8080/#/c/22033/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4705 PS1, Line 4705: .getCanonicalPath() : .subList(0, 3) > Is it a good place to address my previous comment? I'd like to keep this patch focused on the Calcite planner NPE. I did incorporate this suggestion on another branch I have locally. http://gerrit.cloudera.org:8080/#/c/22033/1/tests/custom_cluster/test_workload_mgmt_sql_details.py File tests/custom_cluster/test_workload_mgmt_sql_details.py: http://gerrit.cloudera.org:8080/#/c/22033/1/tests/custom_cluster/test_workload_mgmt_sql_details.py@461 PS1, Line 461: res = client.execute("SELECT s_store_name, sum(ss_net_profit) FROM store_sales," > Can we refine this query at all and still hit the issue? Would help with de I was able to eliminate most of the zip codes from the IN list, but that was all I could eliminate and still hit the issue. http://gerrit.cloudera.org:8080/#/c/22033/1/tests/custom_cluster/test_workload_mgmt_sql_details.py@521 PS1, Line 521: > Should we verify that some columns were identified? The Calcite planner does not set any of the workload management related fields on the TExecRequest object it returns to the backend and thus all these columns are blank in the sys.impala_query_log table. Opened IMPALA-13519 to address this issue. -- To view, visit http://gerrit.cloudera.org:8080/22033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4d282120e596fd39a569d1ce9b25024f4f174dd0 Gerrit-Change-Number: 22033 Gerrit-PatchSet: 2 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Wed, 06 Nov 2024 20:31:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-915: Support cancel queries during planning
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21803 ) Change subject: IMPALA-915: Support cancel queries during planning .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/21803/10/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/21803/10/be/src/runtime/query-driver.cc@496 PS10, Line 496: QueryHandle* query_handle, const Status* cause) { Please mention in the commit message why the removal of check_inflight will not re-introduce the issue fixed in https://github.com/apache/impala/commit/54642b2549c38625c868f09815b61170e568cdb9 http://gerrit.cloudera.org:8080/#/c/21803/10/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/21803/10/fe/src/main/java/org/apache/impala/service/Frontend.java@2050 PS10, Line 2050: // 2. This thread clears execRequestThreads_ and cancelledThreads_ I'm not seeing where this case happens. -- To view, visit http://gerrit.cloudera.org:8080/21803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d25d4c7fb0b8dcc7dad9510db1e8dca220eeb86 Gerrit-Change-Number: 21803 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 06 Nov 2024 16:20:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-915: Support cancel queries during planning
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21803 ) Change subject: IMPALA-915: Support cancel queries during planning .. Patch Set 10: (1 comment) This was my first pass through, thus I want to do another pass through the code tomorrow. http://gerrit.cloudera.org:8080/#/c/21803/10/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/21803/10/fe/src/main/java/org/apache/impala/service/JniFrontend.java@195 PS10, Line 195: frontend_.cancelExecRequest(queryId); cancelExecRequest is a static method and should be called statically: Frontend.cancelExecRequest(... -- To view, visit http://gerrit.cloudera.org:8080/21803 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d25d4c7fb0b8dcc7dad9510db1e8dca220eeb86 Gerrit-Change-Number: 21803 Gerrit-PatchSet: 10 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 05 Nov 2024 22:42:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13505: Fix NPE in Calcite Planner
Jason Fehr has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22033 Change subject: IMPALA-13505: Fix NPE in Calcite Planner .. IMPALA-13505: Fix NPE in Calcite Planner Fixes the NullPointerException occurring when using the Calcite planner with test_tpcds_queries.py::TestTpcdsDecimalV2Query::test_tpcds_q8. The NPE was thrown from the Planner where it generates the list of columns in the query for use in the profile and workload management. Testing was accomplished by manually running the impacted the test and with a new custom cluster test that replicates the failing test. Change-Id: I4d282120e596fd39a569d1ce9b25024f4f174dd0 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M tests/custom_cluster/test_workload_mgmt_init.py M tests/custom_cluster/test_workload_mgmt_sql_details.py 3 files changed, 87 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/22033/1 -- To view, visit http://gerrit.cloudera.org:8080/22033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4d282120e596fd39a569d1ce9b25024f4f174dd0 Gerrit-Change-Number: 22033 Gerrit-PatchSet: 1 Gerrit-Owner: Jason Fehr
[Impala-ASF-CR] IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22021 ) Change subject: IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/22021/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22021/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management > nit: Done -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Nov 2024 17:49:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22021 to look at the new patch set (#5). Change subject: IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests .. IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests Fixes the issue where custom cluster workload management tests do not disable glog log buffering in tests that wait for specific messages to be logged from the coordinators and catalogs. By default, logs are buffered up to 30 seconds. This buffering can cause unnecessary test slowness while the tests wait longer than needed for the expected log message to be flushed and can also cause flakiness where the tests do not find the expected log message before the timeout expires. Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 --- M tests/custom_cluster/test_workload_mgmt_init.py M tests/custom_cluster/test_workload_mgmt_sql_details.py 2 files changed, 33 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/22021/5 -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22021 to look at the new patch set (#4). Change subject: IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests .. IMPALA-12737: (addendum) Turn Off Log Buffering in Workload Management Init Tests Fixes the issue where custom cluster workload management init tests do not disable glog log buffering when restarting the test cluster. The custom cluster workload management init tests all wait for workload management to be initialized as indicated by a log message being written by the catalogd. By default, logs are buffered up to 30 seconds. This buffering can cause unnecessary test slowness while the tests wait for the expected log message and can also cause flakiness where the tests do not find the expected log message before the timeout expires. Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 --- M tests/custom_cluster/test_workload_mgmt_init.py M tests/custom_cluster/test_workload_mgmt_sql_details.py 2 files changed, 33 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/22021/4 -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 4 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22021 ) Change subject: IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/22021/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22021/2//COMMIT_MSG@9 PS2, Line 9: Fixes the issue where custom cluster workload management init tests > Can you explain more clearly why this is an issue? Does it make test startu Done -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Nov 2024 15:51:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22021 ) Change subject: IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests .. Patch Set 3: Verified+1 Moving forward verification +1 since the only change in patchset 3 was the commit message. -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Nov 2024 15:52:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/22021 to look at the new patch set (#3). Change subject: IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests .. IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests Fixes the issue where custom cluster workload management init tests do not disable glog log buffering when restarting the test cluster. The custom cluster workload management init tests all wait for workload management to be initialized as indicated by a log message being written by the catalogd. By default, logs are buffered up to 30 seconds. This buffering can cause unnecessary test slowness while the tests wait for the expected log message and can also cause flakiness where the tests do not find the expected log message before the timeout expires. Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 --- M tests/custom_cluster/test_workload_mgmt_init.py 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/22021/3 -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13507: Allow disabling glog buffering via with args fixture
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22015 ) Change subject: IMPALA-13507: Allow disabling glog buffering via with_args fixture .. Patch Set 8: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/custom_cluster_test_suite.py@175 PS5, Line 175: if reset_ranger: > I search git log and found no history of None assignment. Thanks for checking and finding we both came to the same conclusion. -- To view, visit http://gerrit.cloudera.org:8080/22015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61 Gerrit-Change-Number: 22015 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 04 Nov 2024 20:05:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests
Jason Fehr has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22021 Change subject: IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests .. IMPALA-12737: Turn Off Log Buffering in Workload Management Init Tests Fixes the issue where custom cluster workload management init tests do not disable glog log buffering when restarting the test cluster. Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 --- M tests/custom_cluster/test_workload_mgmt_init.py 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/22021/1 -- To view, visit http://gerrit.cloudera.org:8080/22021 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I03ac0f0f00c93fe785db131278a706e3f5e975c2 Gerrit-Change-Number: 22021 Gerrit-PatchSet: 1 Gerrit-Owner: Jason Fehr
[Impala-ASF-CR] IMPALA-13507: Allow disabling glog buffering via with args fixture
Jason Fehr has uploaded a new patch set (#7) to the change originally created by Riza Suminto. ( http://gerrit.cloudera.org:8080/22015 ) Change subject: IMPALA-13507: Allow disabling glog buffering via with_args fixture .. IMPALA-13507: Allow disabling glog buffering via with_args fixture We have plenty of custom_cluster tests that assert against content of Impala daemon log files while the process is still running using assert_log_contains() and it's wrappers. The method specifically mention about disabling glog buffering ('-logbuflevel=-1'), but not all custom_cluster tests do that. This often result in flaky test that hard to triage and often neglected if it does not frequently run in core exploration. This patch adds boolean param 'disable_log_buffering' into CustomClusterTestSuite.with_args for test to declare intention to inspect log files in live minicluster. If it is True, start minicluster with '-logbuflevel=-1' for all daemons. If it is False, log WARNING on any calls to assert_log_contains(). There are several complex custom_cluster tests that left unchanged and print out such WARNING logs, such as: - TestQueryLive - TestQueryLogTableBeeswax - TestQueryLogOtherTable - TestQueryLogTableHS2 - TestQueryLogTableAll - TestQueryLogTableBufferPool - TestStatestoreRpcErrors - TestWorkloadManagementInitWait - TestWorkloadManagementSQLDetails This patch also fixed some small flake8 issues on modified tests. test_query_live.py is modified to wait for few seconds until sys.impala_query_live is queryable. Testing: - Pass custom_cluster tests in exhaustive exploration. Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61 --- M tests/authorization/test_authorization.py M tests/authorization/test_ranger.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_catalog_hms_failures.py M tests/custom_cluster/test_catalogd_ha.py M tests/custom_cluster/test_codegen_cache.py M tests/custom_cluster/test_custom_statestore.py M tests/custom_cluster/test_data_cache.py M tests/custom_cluster/test_disable_catalog_data_ops.py M tests/custom_cluster/test_events_custom_configs.py M tests/custom_cluster/test_local_catalog.py M tests/custom_cluster/test_logging.py M tests/custom_cluster/test_partition.py M tests/custom_cluster/test_pause_monitor.py M tests/custom_cluster/test_query_event_hooks.py M tests/custom_cluster/test_query_expiration.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py M tests/custom_cluster/test_query_retries.py M tests/custom_cluster/test_re2_max_mem.py M tests/custom_cluster/test_restart_services.py M tests/custom_cluster/test_runtime_filter_aggregation.py M tests/custom_cluster/test_services_rpc_errors.py M tests/custom_cluster/test_shell_jwt_auth.py M tests/custom_cluster/test_thrift_socket.py M tests/custom_cluster/test_workload_mgmt_init.py 28 files changed, 162 insertions(+), 80 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/22015/7 -- To view, visit http://gerrit.cloudera.org:8080/22015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61 Gerrit-Change-Number: 22015 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13507: Allow disabling glog buffering via with args fixture
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/22015 ) Change subject: IMPALA-13507: Allow disabling glog buffering via with_args fixture .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/22015/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22015/5//COMMIT_MSG@36 PS5, Line 36: test_query_live.py is modified to wait for few seconds until Please explain the reason behind this modification. http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/custom_cluster_test_suite.py@175 PS5, Line 175: if reset_ranger: This change (and also for impalad_graceful_shutdown) is a backwards-compatibility breaking change because it switches the handling of `None`. Previously, if either parameter was `None`, then the code used `True` (since None is not False). Now, the uses `False` since the if statements evaluate `None` as `False`. I wanted to call this change out for discussion, but I do not see any problem with it because: 1. No test sets `reset_ranger` or `impalad_graceful_shutdown` to `None`. Thus, all tests will continue working as is. 2. This is test code. 3. Using a value of `False` for `None` makes more sense. http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/22015/5/tests/common/impala_test_suite.py@1370 PS5, Line 1370: LOG.warning( Please add details on setting `disable_log_buffering=True` in custom cluster tests. -- To view, visit http://gerrit.cloudera.org:8080/22015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61 Gerrit-Change-Number: 22015 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 04 Nov 2024 19:18:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 64: (3 comments) http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4661 PS67, Line 4661: recursively walking > Maybe change into BFS instead of DFS. To me, recursion is easier to understand than a loop in non-tail recursion cases. I don't foresee any issues with stack overflow exceptions. Do you think we need to be concerned about those exceptions happening? What advantages would a BFS have over depth first? http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Path.java File fe/src/main/java/org/apache/impala/analysis/Path.java: http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Path.java@513 PS67, Line 513: currentType = rootDesc_.getType(); > There are many nullable fields around SlotRef, SlotDesc, Path, TupleDesc, e That would definitely be helpful. http://gerrit.cloudera.org:8080/#/c/21142/64/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21142/64/tests/common/impala_test_suite.py@1419 PS64, Line 1419: wait_for_log_exists > Or disable it entirely, I tried turning off log buffering, and it did not speed up the tests at all. Thus, I am not going to tackle this change in this patch. Another patch for sure though. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 64 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 31 Oct 2024 16:02:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Reviewed-on: http://gerrit.cloudera.org:8080/21142 Reviewed-by: Michael Smith Tested-by: Impala Public Jenkins Reviewed-by: Riza Suminto --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/t
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 67: (1 comment) http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Path.java File fe/src/main/java/org/apache/impala/analysis/Path.java: http://gerrit.cloudera.org:8080/#/c/21142/67/fe/src/main/java/org/apache/impala/analysis/Path.java@513 PS67, Line 513: // Unnest expressions of columns from views have a root description with a > This is to address test_zipping_unnest_from_view failing? Yes. That test was failing because the call to rootDesc_.getPath().getCanonicalPath(result) threw a NullPointerException since the rootDesc_ path was null. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 67 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 30 Oct 2024 21:58:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#67) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 66: (1 comment) Patch 66 was a rebase of master branch. http://gerrit.cloudera.org:8080/#/c/21142/64/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21142/64/tests/common/impala_test_suite.py@1437 PS64, Line 1437: "come into existence in the allowed timeframe".format(actual_log_path) > nit: mention max_attempts & sleep_time_s here. Say, Done -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 66 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 30 Oct 2024 15:48:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#66) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-13340: Fix missing partitions in COPY TESTCASE of LocalCatalog mode
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21864 ) Change subject: IMPALA-13340: Fix missing partitions in COPY TESTCASE of LocalCatalog mode .. Patch Set 5: Code-Review+2 (4 comments) http://gerrit.cloudera.org:8080/#/c/21864/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21864/4//COMMIT_MSG@39 PS4, Line 39: uses Nit: used http://gerrit.cloudera.org:8080/#/c/21864/4//COMMIT_MSG@45 PS4, Line 45: thus not exist Not: should be "thus do not exist" http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1109 PS2, Line 1109: %s", part) > This is helpful to print what the partition in the response looks like. So Done http://gerrit.cloudera.org:8080/#/c/21864/2/tests/metadata/test_testcase_builder.py File tests/metadata/test_testcase_builder.py: http://gerrit.cloudera.org:8080/#/c/21864/2/tests/metadata/test_testcase_builder.py@70 PS2, Line 70: {"PLANNER_TESTCASE_MODE": True}) > Partition ids are not exposed to the clients. But we can get it from the "/ Done -- To view, visit http://gerrit.cloudera.org:8080/21864 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc2e8b71564ad37973ddfca92801afea8e26ff73 Gerrit-Change-Number: 21864 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 29 Oct 2024 21:45:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 64: (1 comment) http://gerrit.cloudera.org:8080/#/c/21142/63/be/src/workload_mgmt/workload-management.h File be/src/workload_mgmt/workload-management.h: http://gerrit.cloudera.org:8080/#/c/21142/63/be/src/workload_mgmt/workload-management.h@46 PS63, Line 46: la::Status Startup > This comment need an update. It does not parse FLAG anymore. Done. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 64 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 29 Oct 2024 18:49:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#64) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#63) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 62: (4 comments) Eliminated potential race condition http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.h File be/src/workload_mgmt/workload-management.h: http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.h@41 PS62, Line 41: impala::Status ParseSchemaVersionFlag(kudu::Version& target_schema_version); > nit: As a matter of convention, Impala code passes values that will be modi Ah, I thought I had seen where we switched to passing byref instead of passing pointers. Will switch back to pointers. http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.cc File be/src/workload_mgmt/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.cc@181 PS62, Line 181: /// associated mutex. > Associated mutex no longer exists. Let's instead say it should not be modif Done http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.cc@181 PS62, Line 181: /// associated mutex. > I think we might need to make this atomic. StartupChecks happens on the wor Yes, I'm puzzling through that case. It's highly unlikely but could happen. http://gerrit.cloudera.org:8080/#/c/21142/62/be/src/workload_mgmt/workload-management.cc@205 PS62, Line 205: // specified, parses the flag value into the parsed_target_schema_version variable. > This comments not correct anymore. Done -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 62 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 29 Oct 2024 18:00:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#62) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#61) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 60: (1 comment) http://gerrit.cloudera.org:8080/#/c/21142/60/be/src/catalog/workload-management-init.cc File be/src/catalog/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/60/be/src/catalog/workload-management-init.cc@517 PS60, Line 517: : // Set the default hostname. The user can override this with the hostname flag. : RETURN_IF_ERROR(GetHostname(&FLAGS_hostname)); > It is interesting that only webserver.cc check if FLAGS_hostname is empty o Hmm, good question. I need to do more digging into this question. Setting the --hostname flag does something because I can see in the info logs and the coordinator backends UI where the value of the hostname flag is used. I'm wondering if most situations where hostname is referenced actually need an IP address, and thus it does not matter if using the value of the --hostname flag or the system default hostname since they will both resolve to the same IP address, unless running on a server with multiple IPs. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 60 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 25 Oct 2024 22:45:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 58: (1 comment) http://gerrit.cloudera.org:8080/#/c/21142/58/be/src/catalog/workload-management-init.cc File be/src/catalog/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/58/be/src/catalog/workload-management-init.cc@524 PS58, Line 524: Status ip_status = HostnameToIpAddr(FLAGS_hostname, &ip_addr); > Looks like FLAGS_hostname can be empty. Can you check the implication from Good catch! I copied code from init.cc that sets the hostname if FLAGS_hostname is empty. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 58 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 25 Oct 2024 16:52:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#60) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-13477: Set request pool in QueryStateRecord for CTAS query
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21975 ) Change subject: IMPALA-13477: Set request_pool in QueryStateRecord for CTAS query .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21975 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6885192f1c2d563e58670f142b3c0df528032a6e Gerrit-Change-Number: 21975 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 25 Oct 2024 16:37:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#58) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 57: (4 comments) http://gerrit.cloudera.org:8080/#/c/21142/56/be/src/workload_mgmt/workload-management-fields-test.cc File be/src/workload_mgmt/workload-management-fields-test.cc: http://gerrit.cloudera.org:8080/#/c/21142/56/be/src/workload_mgmt/workload-management-fields-test.cc@103 PS56, Line 103: : vector expected_field_defs = { : _createV100String(TQueryTableColumn::CLUSTER_ID), : _cr > What I meant before is to contain all tests cases in vector and loop them t The reason I went with incrementing an iterator is that the [] operator on maps takes a key, not an index. I re-wrote the test using an iterator for the map and an index variable for the list of expected values. I like the new code much better because it reduces duplication. http://gerrit.cloudera.org:8080/#/c/21142/56/be/src/workload_mgmt/workload-management-test.cc File be/src/workload_mgmt/workload-management-test.cc: http://gerrit.cloudera.org:8080/#/c/21142/56/be/src/workload_mgmt/workload-management-test.cc@26 PS56, Line 26: #include "kudu/util/version_util.h" > This is a weird pattern. I'd rather keep a public definition for GetTargetS I agree it's strange. It's also an anti-pattern to unit test private/internal methods. I refactored the code to remove the _getTargetSchemaVersion() function. http://gerrit.cloudera.org:8080/#/c/21142/56/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21142/56/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4602 PS56, Line 4602: Path}s ignoring any {@li > nit: "to dot delimited strings" Done. Re-wrote this description since it did not exactly document what the function did. http://gerrit.cloudera.org:8080/#/c/21142/55/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21142/55/tests/common/custom_cluster_test_suite.py@286 PS55, Line 286: assert not method.__dict__.get(EXPECT_STARTUP_FAIL, False), \ : "Expected cluster startup to fail, but startup succeeded." > It just my personal preference, I suppose. Python does not have semicolon t In situations like this, I seek to match existing patterns in the code base. The pattern is to use backslashes instead of parentheses, thus I prefer to stick with the code as-is. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 57 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 24 Oct 2024 20:35:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13477: Set request pool in QueryStateRecord for CTAS query
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21975 ) Change subject: IMPALA-13477: Set request_pool in QueryStateRecord for CTAS query .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21975/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21975/1//COMMIT_MSG@16 PS1, Line 16: - Pass core tests. Please also note that existing tests were updated to assert CTAS queries are now showing in the /queries data. -- To view, visit http://gerrit.cloudera.org:8080/21975 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6885192f1c2d563e58670f142b3c0df528032a6e Gerrit-Change-Number: 21975 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 24 Oct 2024 20:41:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#57) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. To enable these upgrades, the workload management initialization that manages the structure of the tables has been moved to the catalogd. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc A be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.cc D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_te
[Impala-ASF-CR] IMPALA-12648: Add KILL QUERY statement
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21930 ) Change subject: IMPALA-12648: Add KILL QUERY statement .. Patch Set 4: (27 comments) http://gerrit.cloudera.org:8080/#/c/21930/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21930/2//COMMIT_MSG@15 PS2, Line 15: KILL QUERY '123:456'; > Yes. I think that is possible and makes sense. Looking at this again, the Jira description has a good outline of the desired syntax. Please add comments there with an specific questions about the syntax. http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.h@960 PS2, Line 960: > Thanks! Removed. Done http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc@336 PS2, Line 336: case TStmtType::UNKNOWN: > Thanks! Changed. Done http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc@2454 PS2, Line 2454: // The current impalad is the coordinator of the query. > Thanks! Added code to try killing the query locally. Done http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2446 PS4, Line 2446: string_view Is a new import needed for this object? http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2446 PS4, Line 2446: optional Missing #import http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2446 PS4, Line 2446: std:: Coding standard is to not use namespaces in .cc files unless required (due to name collisions across namespaces imported with using namespace). http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2476 PS4, Line 2476: if (exec_request().kill_query_request.__isset.requesting_user) { Can the optional variable requesting_user be reused here? http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2483 PS4, Line 2483: all_coordinators.GetAllExecutorDescriptors()) { May be able to simplify the code by leveraging the GetFilteredExecutorGroup() function from executor-group.h combined with ExecEnv::GetInstance()->krpc_address() http://gerrit.cloudera.org:8080/#/c/21930/4/be/src/service/client-request-state.cc@2527 PS4, Line 2527: return results.back(); Can we return a slightly more descriptive response here, something like "could not find query on any coordinator"? http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.h@81 PS2, Line 81: ::kudu::rp > Thanks! Changed. Done http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.cc@276 PS2, Line 276: Respond > Thanks! This is what I want to discuss with reviewers. Done http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@467 PS2, Line 467: message KillQueryRequestPB { > Thanks! I changed them to `repeated` so that we can pass multiple query ids Done http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@467 PS2, Line 467: message KillQueryRequestPB { > Please consider making this object more future proof by modifying the data Done http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@469 PS2, Line 469: repeated UniqueIdPB query_ids = 1; > Thanks! Changed to `repeated` as explained above. Done http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@474 PS2, Line 474: repeated > Should be required. Done http://gerrit.cloudera.org:8080/#/c/21930/2/common/thrift/Types.thrift File common/thrift/Types.thrift: http://gerrit.cloudera.org:8080/#/c/21930/2/common/thrift/Types.thrift@113 PS2, Line 113: UNKNOWN = 9 > Thanks! Changed. Done http://gerrit.cloudera.org:8080/#/c/21930/2/fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java File fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java: http://gerrit.cloudera.org:8080/#/c/21930/2/fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java@34 PS2, Line 34: requestedByAdmin_ = true > Thanks! I think that makes sense. But currently in KillQueryStmt we will no Ok, I better understand what is happening now. That variabl
[Impala-ASF-CR](asf-site) BLOG: Add blog post for Impala talks at Community Over Code NA 2024
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21929 ) Change subject: BLOG: Add blog post for Impala talks at Community Over Code NA 2024 .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21929 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I3beb67c6c209443703cbf3e9521c8b9f662ef733 Gerrit-Change-Number: 21929 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 18 Oct 2024 15:46:53 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) BLOG: Add blog post for Impala talks at Community Over Code NA 2024
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21929 ) Change subject: BLOG: Add blog post for Impala talks at Community Over Code NA 2024 .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21929 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I3beb67c6c209443703cbf3e9521c8b9f662ef733 Gerrit-Change-Number: 21929 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 18 Oct 2024 15:42:46 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) BLOG: Add blog post for Impala talks at Community Over Code NA 2024
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21929 ) Change subject: BLOG: Add blog post for Impala talks at Community Over Code NA 2024 .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21929 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I3beb67c6c209443703cbf3e9521c8b9f662ef733 Gerrit-Change-Number: 21929 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 17 Oct 2024 19:26:03 + Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Add blog post for Impala talks at Community Over Code NA 2024
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21929 ) Change subject: Add blog post for Impala talks at Community Over Code NA 2024 .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/21929/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21929/2//COMMIT_MSG@7 PS2, Line 7: Add blog post for Impala talks at Community Over Code NA 2024 Can we prefix these commits with wording that makes it easy to identify blog related changes (similar to how we prefix code changes with the Jira id)? Possible ideas are "BLOG: " or "SITE: " http://gerrit.cloudera.org:8080/#/c/21929/2/nikola_site_generator/posts/healing-iceberg-tables-with-impala.md File nikola_site_generator/posts/healing-iceberg-tables-with-impala.md: http://gerrit.cloudera.org:8080/#/c/21929/2/nikola_site_generator/posts/healing-iceberg-tables-with-impala.md@2 PS2, Line 2: t The 'T' should be capitalized. http://gerrit.cloudera.org:8080/#/c/21929/2/nikola_site_generator/posts/impalas-living-on-iceberg.md File nikola_site_generator/posts/impalas-living-on-iceberg.md: http://gerrit.cloudera.org:8080/#/c/21929/2/nikola_site_generator/posts/impalas-living-on-iceberg.md@2 PS2, Line 2: l The 'L' needs to be capitalized. http://gerrit.cloudera.org:8080/#/c/21929/2/nikola_site_generator/posts/impalas-living-on-iceberg.md@2 PS2, Line 2: Impalas The word "Impalas" is plural. I think this word should be "Impala's" unless the title is intended to refer to multiple Impala instances. -- To view, visit http://gerrit.cloudera.org:8080/21929 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I3beb67c6c209443703cbf3e9521c8b9f662ef733 Gerrit-Change-Number: 21929 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 17 Oct 2024 18:58:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13340: Fix missing partitions in COPY TESTCASE of LocalCatalog mode
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21864 ) Change subject: IMPALA-13340: Fix missing partitions in COPY TESTCASE of LocalCatalog mode .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/21864/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21864/2//COMMIT_MSG@15 PS2, Line 15: to Nit: remove word http://gerrit.cloudera.org:8080/#/c/21864/2//COMMIT_MSG@46 PS2, Line 46: unpartitiond Nit: unpartitioned http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/Table.java@610 PS2, Line 610: <> Nit: was the removal of TColumn needed? If not, I favor leaving this line unchanged even though the empty angle brackets <> is newer syntax. http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/21864/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1109 PS2, Line 1109: %s", part) Was this a temporary debugging change or is it needed? http://gerrit.cloudera.org:8080/#/c/21864/2/tests/metadata/test_testcase_builder.py File tests/metadata/test_testcase_builder.py: http://gerrit.cloudera.org:8080/#/c/21864/2/tests/metadata/test_testcase_builder.py@70 PS2, Line 70: assert len(res.data) == 24 Is there any way to assert the partition ids changed? Possibly by exporting the table again and checking partition ids? -- To view, visit http://gerrit.cloudera.org:8080/21864 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icc2e8b71564ad37973ddfca92801afea8e26ff73 Gerrit-Change-Number: 21864 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 16 Oct 2024 22:24:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13395: Adds USE APACHE COMPONENTS=true in all-build-options job
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21842 ) Change subject: IMPALA-13395: Adds USE_APACHE_COMPONENTS=true in all-build-options job .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica516a7554bfe9fa0710b5a437c302934a13c08d Gerrit-Change-Number: 21842 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 16 Oct 2024 21:56:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12648: Add KILL QUERY statement
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21930 ) Change subject: IMPALA-12648: Add KILL QUERY statement .. Patch Set 2: (16 comments) Good start! http://gerrit.cloudera.org:8080/#/c/21930/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21930/2//COMMIT_MSG@15 PS2, Line 15: KILL QUERY '123:456'; Will it be possible to specify multiple query ids to kill or even specifying a subquery to build the list of query ids to kill? For example: kill query where query_id in (select query_id from sys.impala_query_live where query_State='RUNNING') http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.h@960 PS2, Line 960: WARN_UNUSED_RESULT Shouldn't need WARN_UNUSED_RESULT since Status is already marked [[nodiscard]] http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc@336 PS2, Line 336: case TStmtType::KILL: Nit: reorder below UNKNOWN to match Thrift enum. http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/client-request-state.cc@2454 PS2, Line 2454: ExecutorGroup all_coordinators = There is a good chance the kill query commands will be run on the coordinator that owns the query. The code should try killing the query locally first before moving on to the RPCs to the other coordinators. http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.h@81 PS2, Line 81: RpcContext Nit: fully qualify to match existing code conventions in this file. http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/21930/2/be/src/service/control-service.cc@276 PS2, Line 276: // TODO: Note: unaddressed TODO FWIW, I favor not waiting for FinishUnregisterQuery() since I do not see any benefits to the end user of waiting additional time for that processing to run. http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@467 PS2, Line 467: message KillQueryRequestPB { Please consider making this object more future proof by modifying the data structure to enable multiple queries owned by multiple users to be killed with one RPC. The KillQueryResponsePB will need modifying too. http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@469 PS2, Line 469: optional UniqueIdPB query_id = 1; Should be required. http://gerrit.cloudera.org:8080/#/c/21930/2/common/protobuf/control_service.proto@474 PS2, Line 474: optional Should be required. http://gerrit.cloudera.org:8080/#/c/21930/2/common/thrift/Types.thrift File common/thrift/Types.thrift: http://gerrit.cloudera.org:8080/#/c/21930/2/common/thrift/Types.thrift@113 PS2, Line 113: KILL = 9 Don't modify the values of existing Thrift enums. KILL must be listed last in the enum and be equal to 10. http://gerrit.cloudera.org:8080/#/c/21930/2/fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java File fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java: http://gerrit.cloudera.org:8080/#/c/21930/2/fe/src/main/java/org/apache/impala/analysis/KillQueryStmt.java@34 PS2, Line 34: requestedByAdmin_ = true Defaulting to true makes me nervous. A better security posture would be to default to false and explicitly set to true after admin privileges are confirmed. http://gerrit.cloudera.org:8080/#/c/21930/2/tests/authorization/__init__.py File tests/authorization/__init__.py: http://gerrit.cloudera.org:8080/#/c/21930/2/tests/authorization/__init__.py@1 PS2, Line 1: This file appears to have been accidentally committed. http://gerrit.cloudera.org:8080/#/c/21930/2/tests/custom_cluster/test_kill_query_custom_cluster.py File tests/custom_cluster/test_kill_query_custom_cluster.py: http://gerrit.cloudera.org:8080/#/c/21930/2/tests/custom_cluster/test_kill_query_custom_cluster.py@39 PS2, Line 39: def get_workload(cls): No need to define get_workload function. http://gerrit.cloudera.org:8080/#/c/21930/2/tests/custom_cluster/test_kill_query_custom_cluster.py@71 PS2, Line 71: def test_kill_as_non_admin(self): Please also add a test where a non-admin user successfully kills its own query. http://gerrit.cloudera.org:8080/#/c/21930/2/tests/query_test/__init__.py File tests/query_test/__init__.py: http://gerrit.cloudera.org:8080/#/c/21930/2/tests/query_test/__init__.py@1 PS2, Line 1: File appears to have been
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 52: (1 comment) Patch is ready for review. The catalod is now used to create/upgrade the workload management database tables. The coordinator has no involvement with this initialization work. http://gerrit.cloudera.org:8080/#/c/21142/50/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/21142/50/be/CMakeLists.txt@641 PS50, Line 641: WorkloadMgmt Not sure if this lib needs to be listed here too. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 52 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 09 Oct 2024 22:10:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#52) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc R be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 58 files changed, 4,113 insertions(+), 1,198 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/52 -- To view, vi
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#51) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active/completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Additionally, workload management can be set up to run under a schema version that is not the latest. This ability will be useful during troubleshooting. The changes in this patch must be backwards compatible so that Impala clusters running previous workload management code can co-exist with Impala clusters running this workload management code. To enable that backwards compatibility, a new table property named 'wm_schema_version' is now used to track the schema version of the workload management tables. Thus, the old property 'schema_version' will always be set to '1.0.0' since modifying that property value causes Impala running previous workload management code to error at startup. Testing accomplished by * Adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. * JUnit tests added to verify the new workload management columns are being correctly parsed. * GTests added to ensure the workload management columns are correctly defined and in the correct order. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.h M be/src/catalog/catalogd-main.cc A be/src/catalog/workload-management-init.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc R be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 58 files changed, 4,097 insertions(+), 1,198 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/51 -- To view, vi
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 44: (5 comments) http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/impala-server.h@1661 PS41, Line 1661: boost::scoped_ptr hs2_http_server_; > It's still there in the latest patchset. It has been removed for real this time. http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@354 PS40, Line 354: // concurrently on all coordinators. Running the same create and alter table statements > This is something we'll want to think more about in how we deploy Impala. All initialization of the workload management 'sys' db and live/log tables is now done in the catalog. http://gerrit.cloudera.org:8080/#/c/21142/49/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/49/be/src/service/workload-management-init.cc@105 PS49, Line 105: static void _setupTable(InternalServer* internal_server, > Was there a specific reason to rename this variable? I'm ok with modifying variable names now since workload management is not a well established feature, but after this change, I don't want there to be any more variable name modifications for precisely the reason of git history readability. http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/workload-management.cc@65 PS41, Line 65: static list _completed_queries; > There's a difference between static functions (which have no side effects) This use of the "static" keyword when declaring variables came out of my misunderstanding what that meant when they keyword was applied to variables. My intention was to limit the scope of the variables to just this file. I removed the "static" keyword on the variables. http://gerrit.cloudera.org:8080/#/c/21142/44/be/src/util/string-join.h File be/src/util/string-join.h: http://gerrit.cloudera.org:8080/#/c/21142/44/be/src/util/string-join.h@39 PS44, Line 39: string JoinToString(const Container& container, const string delimiter, > This could probably just go in string-util.h. I completely removed this file since the JoinToString method was only used in 1 place to generate an error message in a very unlikely situation. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 44 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 08 Oct 2024 22:51:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#50) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h A be/src/catalog/catalog-workload-mgmt-init.cc M be/src/catalog/catalogd-main.cc M be/src/exec/system-table-scanner.cc M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h D be/src/service/workload-management-fields.cc D be/src/service/workload-management-init.cc D be/src/service/workload-management-test.cc A be/src/service/workload-management-worker-test.cc R be/src/service/workload-management-worker.cc A be/src/service/workload-management-worker.h D be/src/service/workload-management.h M be/src/util/version-util-test.cc M be/src/util/version-util.cc M be/src/util/version-util.h A be/src/workload_mgmt/CMakeLists.txt A be/src/workload_mgmt/workload-management-fields-test.cc R be/src/workload_mgmt/workload-management-flags.cc A be/src/workload_mgmt/workload-management-test.cc A be/src/workload_mgmt/workload-management.cc A be/src/workload_mgmt/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java A testdata/workload_mgmt/create_impala_query_live_table.sql A testdata/workload_mgmt/create_impala_query_log_table.sql R testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-live-v1.1.0.test R testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.0.0.test A testdata/workloads/functional-query/queries/QueryTest/workload-management-log-v1.1.0.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 59 files changed, 4,037 insertions(+), 1,198 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/50 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 50 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13408: use a specific flag for the topic prefix cluster identifier.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21867 ) Change subject: IMPALA-13408: use a specific flag for the topic prefix cluster identifier. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21867/5/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21867/5/tests/custom_cluster/test_query_live.py@281 PS5, Line 281: @CustomClusterTestSuite.with_args( > But why does this test need cluster_membership_topic_id? It doesn't seem re Also, the cluster_membership_topic_id flag is supposed to be optional. -- To view, visit http://gerrit.cloudera.org:8080/21867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd3f7e1c73c00a7aaeee79ecb461209e3939c422 Gerrit-Change-Number: 21867 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 04 Oct 2024 20:31:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13408: use a specific flag for the topic prefix cluster identifier.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21867 ) Change subject: IMPALA-13408: use a specific flag for the topic prefix cluster identifier. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21867/5/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21867/5/tests/custom_cluster/test_query_live.py@281 PS5, Line 281: @CustomClusterTestSuite.with_args( Why did this one particular test require the new flag to be set? -- To view, visit http://gerrit.cloudera.org:8080/21867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd3f7e1c73c00a7aaeee79ecb461209e3939c422 Gerrit-Change-Number: 21867 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 04 Oct 2024 19:24:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13408: use a specific flag for the topic prefix cluster identifier.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21867 ) Change subject: IMPALA-13408: use a specific flag for the topic prefix cluster identifier. .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/21867/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21867/2//COMMIT_MSG@16 PS2, Line 16: An important difference is that the query_log cluster_id must be set > I guess I don't understand how "cluster_id must be set only on coordinators There is nothing stopping any arbitrary flag from being set on any daemon. If that particular daemon does not use a flag, then it is ignored. The issue we are trying to prevent is a split brain problem when cluster_id does not get specified everywhere. It's a difficult situation to detect and diagnose. Another potential issue is the users specifying the same cluster_id across multiple clusters which would result in cross-communication between two clusters. TL;DR -- 'cluster_id' flag is meaningful to humans while 'cluster_membership_topic_id' flag is meaningful to cluster coordination. http://gerrit.cloudera.org:8080/#/c/21867/2/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/21867/2/be/src/scheduling/admission-controller.cc@82 PS2, Line 82: cluster_id > It still exists, defined in workload-management-flags.cc 'cluster_id' flag is meaningful to humans and used in workload management while 'cluster_membership_topic_id' flag is meaningful to cluster coordination. -- To view, visit http://gerrit.cloudera.org:8080/21867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd3f7e1c73c00a7aaeee79ecb461209e3939c422 Gerrit-Change-Number: 21867 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 02 Oct 2024 22:20:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12937: Deflake test admission controller.py
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21813 ) Change subject: IMPALA-12937: Deflake test_admission_controller.py .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cf2bddcbfcd63d3a6bbc0a2014774a910f6e730 Gerrit-Change-Number: 21813 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 26 Sep 2024 14:41:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12937: Deflake test admission controller.py
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21813 ) Change subject: IMPALA-12937: Deflake test_admission_controller.py .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/21813/2/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/21813/2/tests/custom_cluster/test_admission_controller.py@815 PS2, Line 815: assert re.search(r"Queued reason: Not enough memory available on host \S+.Needed " : "2.00 GB but only 1.00 GB out of 2.00 GB was available.", str(e)), str(e) Best practice is to also put the 'r' marker on line 816: r"2.00 GB but only 1.00 GB out of 2.00 GB was available.", str(e)), str(e) -- To view, visit http://gerrit.cloudera.org:8080/21813 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9cf2bddcbfcd63d3a6bbc0a2014774a910f6e730 Gerrit-Change-Number: 21813 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 25 Sep 2024 21:33:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13312: Use client address from X-Forwarded-For Header in Ranger Audit Logs
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21780 ) Change subject: IMPALA-13312: Use client address from X-Forwarded-For Header in Ranger Audit Logs .. Patch Set 3: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/21780/2/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/21780/2/be/src/rpc/authentication.cc@151 PS2, Line 151: DEFINE_bool(use_xff_address_as_origin, false, > Done Done http://gerrit.cloudera.org:8080/#/c/21780/2/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/21780/2/tests/authorization/test_ranger.py@95 PS2, Line 95: args = ['--protocol=hs2-http', > Done. Note that we don't validate the IP address but just read it as is fro Done http://gerrit.cloudera.org:8080/#/c/21780/2/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21780/2/tests/common/custom_cluster_test_suite.py@388 PS2, Line 388: > Done Done -- To view, visit http://gerrit.cloudera.org:8080/21780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib784ad805c649e9576ef34f125509c904b7773ab Gerrit-Change-Number: 21780 Gerrit-PatchSet: 3 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Comment-Date: Tue, 24 Sep 2024 21:06:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13312: Use client address from X-Forwarded-For Header in Ranger Audit Logs
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21780 ) Change subject: IMPALA-13312: Use client address from X-Forwarded-For Header in Ranger Audit Logs .. Patch Set 2: (3 comments) Overall, very solid. A few questions/comments to look at. http://gerrit.cloudera.org:8080/#/c/21780/2/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/21780/2/be/src/rpc/authentication.cc@151 PS2, Line 151: DEFINE_bool(use_xff_address_as_origin, false, Since the X-Forwarded-For header is non-standard, please add some wording about the value of this header must be a comma separated list of IP addresses. http://gerrit.cloudera.org:8080/#/c/21780/2/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/21780/2/tests/authorization/test_ranger.py@95 PS2, Line 95: '--hs2_x_forward=10.20.30.40, 1.1.2.3, 127.0.0.6', Please add a couple more test cases to ensure string parsing works as expected: * space before the first ip address * no space between IP addresses (e.g. 10.20.30.40,1.1.2.3,127.0.0.6) * Empty X-Forward-For header * Invalid X-Forward-For header (e.g. header value of "foobar") http://gerrit.cloudera.org:8080/#/c/21780/2/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/21780/2/tests/common/custom_cluster_test_suite.py@388 PS2, Line 388: def assert_log_file(log_dir, log_file, expected_strings, log_keyword=""): Nit: A better place for this method (if it is being kept) is in impala_test_suite.py where the other assert_log methods are located. -- To view, visit http://gerrit.cloudera.org:8080/21780 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib784ad805c649e9576ef34f125509c904b7773ab Gerrit-Change-Number: 21780 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Comment-Date: Thu, 19 Sep 2024 19:16:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 16 Sep 2024 16:30:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 4: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java: http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@170 PS3, Line 170: > Done. Moved these to VirtualTable. They are created by the Analyzer and don Ah, I misunderstood the purpose of the VirtualTable class. Makes sense to me now to move these functions up to the VirtualTable class. http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2585 PS2, Line 2585: // Add the catalog versions and loaded timestamps. > A table like ExecSummary adds additional padding. Keeps the current format Sounds good to me. Done -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 13 Sep 2024 15:14:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 3: -Code-Review -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 12 Sep 2024 23:21:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java: http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@349 PS3, Line 349: public long getCatalogVersion() { return 0; } > Should this be done in CtasTargetTable? I agree with this suggestion since the ctas target should always be a table of some sort that needs to be versioned by the catalog. Adding these functions in the parent class should result in new subclasses automatically getting this functionality. -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 12 Sep 2024 23:21:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java: http://gerrit.cloudera.org:8080/#/c/21782/3/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@170 PS3, Line 170: @Override > Could these be done in VirtualTable to cover several subclasses? I considered making the same suggestion, but I ended up deciding that I prefer not implementing these functions in VirtualTable. That way, if someone adds a new subclass, they are forced by the compiler to consider how their subclass should handle this functionality. -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 12 Sep 2024 23:16:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13322: Fix alter on SystemTables
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21792 ) Change subject: IMPALA-13322: Fix alter on SystemTables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21792/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21792/3//COMMIT_MSG@12 PS3, Line 12: > Done Done -- To view, visit http://gerrit.cloudera.org:8080/21792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a59e58c086e659941e0db8a2b893ac6dcc5143a Gerrit-Change-Number: 21792 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 12 Sep 2024 23:14:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13322: Fix alter on SystemTables
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21792 ) Change subject: IMPALA-13322: Fix alter on SystemTables .. Patch Set 6: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21792/3/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21792/3/tests/custom_cluster/test_query_live.py@216 PS3, Line 216: def test_alter(self): > That's what happens now, see the latest version. Done -- To view, visit http://gerrit.cloudera.org:8080/21792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a59e58c086e659941e0db8a2b893ac6dcc5143a Gerrit-Change-Number: 21792 Gerrit-PatchSet: 6 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 12 Sep 2024 23:13:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13322: Fix alter on SystemTables
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21792 ) Change subject: IMPALA-13322: Fix alter on SystemTables .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21792/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21792/3//COMMIT_MSG@12 PS3, Line 12: Please add info on how this patch was tested. http://gerrit.cloudera.org:8080/#/c/21792/3/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/21792/3/tests/custom_cluster/test_query_live.py@216 PS3, Line 216: def test_alter(self): I would prefer if this entire test was wrapped in a try..finally code block where the finally either executes alter table sys.impala_query_live drop column test_alter or drops the entire sys.impala_query_live table to ensure this table is not left in an invalid state. -- To view, visit http://gerrit.cloudera.org:8080/21792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a59e58c086e659941e0db8a2b893ac6dcc5143a Gerrit-Change-Number: 21792 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 12 Sep 2024 20:47:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 3: Code-Review+1 (7 comments) http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@46 PS2, Line 46: public void setLastLoadedTimeMs(long timeMs) { > Yes, we need it in ImpaladCatalog. When impalad applies the update from cat thanks for the explanation! Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java@70 PS2, Line 70: > Good point! Done. Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@66 PS2, Line 66: > Done Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@346 PS2, Line 346: catalogObject.getCatalog_version(), catalogObject.getLast_modified_time_ms()); > They are currently not used in impalad yet, e.g. not shown in the query pro Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@807 PS2, Line 807: resp.object_loaded_time_ms, new SqlConstraints(primaryKeys, foreignKeys), > Same as above, just set it for tables to be simple. Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2584 PS2, Line 2584: > Done Done http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2585 PS2, Line 2585: // Add the catalog versions and loaded timestamps. > I hope it can be parsed by tools like Workload Management. Maybe the curren I'm not sure about the Thrift/JSON profile. If we cannot use different formats, it's definitely better to be parseable in Thrift/JSON. Workload management reads from the underlying ClientRequestState object that was built for each query. As long as this information is accessible there (easy to add if not), then it can be read by workload management. -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 12 Sep 2024 19:15:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13182: Support uploading additional jars
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21556 ) Change subject: IMPALA-13182: Support uploading additional jars .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21556 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica5fa4c0cd1a5c938f331f3a4bba85d4910db90e Gerrit-Change-Number: 21556 Gerrit-PatchSet: 5 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Thu, 12 Sep 2024 18:32:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13182: Support uploading additional jars
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21556 ) Change subject: IMPALA-13182: Support uploading additional jars .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21556 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica5fa4c0cd1a5c938f331f3a4bba85d4910db90e Gerrit-Change-Number: 21556 Gerrit-PatchSet: 5 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Thu, 12 Sep 2024 17:54:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21782 ) Change subject: IMPALA-12876: Add catalogVersion and loaded timestamp in query profiles .. Patch Set 2: (7 comments) The Catalog.java file sets the last modified time on db, table, view, function, data source, and hdfs cache pool. However, only the table and view are actually reported in the profile. Do we need to set last modified time on the other object types? http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@46 PS2, Line 46: public void setLastLoadedTimeMs(long timeMs) { Is this setter needed since mTimeMs_ is set in the "setCatalogVersion" function? http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java@70 PS2, Line 70: public long getCatalogVersion() { return 0; } Nit: these two functions could be overridden in the parent IcebergDeleteTable. http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@66 PS2, Line 66: public long getCatalogVersion() { return 0; } Nit: these two functions could be overridden in the parent IcebergDeleteTable. http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@346 PS2, Line 346: catalogObject.getCatalog_version(), catalogObject.getLast_modified_time_ms()); Does last modified time also need to be set on db, function, data source, and hdfs cache pool like it is set on Catalog.getTCatalogObject? http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@807 PS2, Line 807: resp.object_loaded_time_ms, new SqlConstraints(primaryKeys, foreignKeys), Does last modified time also need to be set on db, function, data source, and hdfs cache pool? http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2584 PS2, Line 2584: // Also adds the catalog versions and loaded timestamps. Nit: add a blank line before this line and start the comment with // Add the catalog... http://gerrit.cloudera.org:8080/#/c/21782/2/fe/src/main/java/org/apache/impala/service/Frontend.java@2585 PS2, Line 2585: FrontendProfile.getCurrent().addInfoString("Table Versions", Can this profile text be a formatted table like ExecSummary is? -- To view, visit http://gerrit.cloudera.org:8080/21782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98 Gerrit-Change-Number: 21782 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 11 Sep 2024 15:37:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#49) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/string-join-test.cc A be/src/util/string-join.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 42 files changed, 2,134 insertions(+), 294 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/49 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 49 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#48) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/string-join-test.cc A be/src/util/string-join.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 42 files changed, 2,096 insertions(+), 253 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/48 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 48 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#46) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/string-join-test.cc A be/src/util/string-join.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 42 files changed, 2,111 insertions(+), 253 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/46 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 46 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13182: Support uploading additional jars
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21556 ) Change subject: IMPALA-13182: Support uploading additional jars .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/21556/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21556/4//COMMIT_MSG@17 PS4, Line 17: * Tested manually Please add details about how the manual tests were run. For example, how did you assert that Impala can see the additional JARs on the classpath and use classes from those JARs? http://gerrit.cloudera.org:8080/#/c/21556/4/docker/daemon_entrypoint.sh File docker/daemon_entrypoint.sh: http://gerrit.cloudera.org:8080/#/c/21556/4/docker/daemon_entrypoint.sh@117 PS4, Line 117: /opt/impala/aux-jars Impala has a bias towards high configurability with sensible defaults. It would be preferable to make this path configurable with '/opt/impala/aux-jars' being the default. http://gerrit.cloudera.org:8080/#/c/21556/4/docker/daemon_entrypoint.sh@120 PS4, Line 120: CLASSPATH+=:$jar > Good point. If the PATH variable already has a ':' at the end, then this co Ah, yes, the previous code ensures the CLASSPATH does not end with a colon. -- To view, visit http://gerrit.cloudera.org:8080/21556 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica5fa4c0cd1a5c938f331f3a4bba85d4910db90e Gerrit-Change-Number: 21556 Gerrit-PatchSet: 4 Gerrit-Owner: gaurav singh Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: gaurav singh Gerrit-Comment-Date: Fri, 06 Sep 2024 17:34:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Jason Fehr has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21744 ) Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is initiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Info level logging was added to indicate which completed queries queue drain situation occurred - successful or timed out. A new custom cluster test was added to test the situation where the completed queries queue drain process times out. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Reviewed-on: http://gerrit.cloudera.org:8080/21744 Reviewed-by: Riza Suminto Reviewed-by: Michael Smith Tested-by: Impala Public Jenkins --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M tests/custom_cluster/test_query_log.py 6 files changed, 108 insertions(+), 17 deletions(-) Approvals: Riza Suminto: Looks good to me, approved Michael Smith: Looks good to me, but someone else must approve Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 10 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21744 ) Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/21744/8/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21744/8/be/src/service/workload-management.cc@134 PS8, Line 134: case NOT_STARTED: : // Thread was never started, joining or detaching will cause error. : break; > This can be removed? yes -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 8 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 04 Sep 2024 23:56:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#9). Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is initiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Info level logging was added to indicate which completed queries queue drain situation occurred - successful or timed out. A new custom cluster test was added to test the situation where the completed queries queue drain process times out. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M tests/custom_cluster/test_query_log.py 6 files changed, 108 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/9 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 9 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#8). Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is initiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Info level logging was added to indicate which completed queries queue drain situation occurred - successful or timed out. A new custom cluster test was added to test the situation where the completed queries queue drain process times out. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M tests/custom_cluster/test_query_log.py 6 files changed, 111 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/8 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 8 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13350: Fix Workload Management flush on interval Test
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21750 ) Change subject: IMPALA-13350: Fix Workload Management flush_on_interval Test .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21750/1/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/21750/1/tests/custom_cluster/test_query_log.py@688 PS1, Line 688: 20 > nit: I think double of query_log_write_interval_s for timeout value should I considered allowing for double the value of query_log_write_interval_s (which would be 30 seconds instead of 20) and ended up deciding that the extra time did not add any value considering the query execution will have already resulted in 10 seconds passing (10 queries with each one sleeping 1 second). -- To view, visit http://gerrit.cloudera.org:8080/21750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2fb1034ca63e170d5e57a6ece9b47da5dafebff4 Gerrit-Change-Number: 21750 Gerrit-PatchSet: 1 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 04 Sep 2024 16:30:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21744 ) Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/21744/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21744/2//COMMIT_MSG@10 PS2, Line 10: in impala-server.h. This thread runs until a graceful shutdown is > nit: unnecessary double space. Same on the next line. Done http://gerrit.cloudera.org:8080/#/c/21744/2//COMMIT_MSG@11 PS2, Line 11: initiate > nit: initiated Done http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.h@101 PS2, Line 101: kload management in > nit: workload management setup Done http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.h@108 PS2, Line 108: finished o > nit: finished or Done http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.cc@118 PS2, Line 118: completed_queries_cv_.notify_all(); > I'm not quite sure how this works. It holds workload_mgmt_threadstate_mu_ a The condition variable releases the unique_lock while it's waiting. It retakes the unique_lock before the predicate function runs and releases it after the predicate completes. Once the wait finishes, it takes the unique_lock before return ing from the wait_for() function. Thus, the workload_mgmt_thread_state_ can be updated only while the condition variable is waiting. http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.cc@127 PS2, Line 127: se NOT_STARTED: > Add DCHECK not nullptr. Done http://gerrit.cloudera.org:8080/#/c/21744/2/be/src/service/workload-management.cc@130 PS2, Line 130: se SHUTDOWN: > Will we miss logging some queries into query log table if this happen? Yes. I added logging of the completed queries queued metric which will give a maximum number of queries lost. I don't want to check the in-memory queue size because taking its lock may cause a hang at the code that timed out could still be running and holding the lock. -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 04 Sep 2024 00:23:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#5). Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is initiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M tests/custom_cluster/test_query_log.py 6 files changed, 97 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/5 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 5 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#4). Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is intiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h 5 files changed, 41 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/4 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 4 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#3). Change subject: IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak of Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is intiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h 5 files changed, 41 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/3 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13347: Fixes TSAN Thread Leak Caused Workload Management Thread
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21744 to look at the new patch set (#2). Change subject: IMPALA-13347: Fixes TSAN Thread Leak Caused Workload Management Thread .. IMPALA-13347: Fixes TSAN Thread Leak Caused Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is intiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h 5 files changed, 41 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/2 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 2 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13350: Fixes TSAN Thread Leak Caused Workload Management Thread
Jason Fehr has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21744 Change subject: IMPALA-13350: Fixes TSAN Thread Leak Caused Workload Management Thread .. IMPALA-13350: Fixes TSAN Thread Leak Caused Workload Management Thread The workload management processing runs in a separate thread declared in impala-server.h. This thread runs until a graceful shutdown is intiated. The last step of the Impala coordinator shutdown process is to drain the completed queries queue to the query log table thus ensuring completed queries do not get lost. This thread has to run to completion, but the coordinator shutdown process never joins that thread. This patch adds the joining of that thread during the coordinator shutdown process. If the workload management shutdown process exceedes the allotted time, the thread is detached. Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e --- M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-init.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h 5 files changed, 41 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/21744/1 -- To view, visit http://gerrit.cloudera.org:8080/21744 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1e95967bb6e04470a8900c9ba69080eea8aaa25e Gerrit-Change-Number: 21744 Gerrit-PatchSet: 1 Gerrit-Owner: Jason Fehr
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#45) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/string-join-test.cc A be/src/util/string-join.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 42 files changed, 2,101 insertions(+), 247 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/45 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 45 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#44) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/string-join-test.cc A be/src/util/string-join.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 42 files changed, 2,101 insertions(+), 246 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/44 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 44 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#43) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 40 files changed, 1,956 insertions(+), 246 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/43 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 43 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 40: (3 comments) http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/impala-server.h@1661 PS41, Line 1661: /// Note that these hold a shared pointer to 'this', and so need to be reset() > What maintains the lifetime of the InternalServer? I guess it's always a re Good point. It's not needed at all since ImpalaServer implements the InternalServer interface. I removed the internal_server_ variable. http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@354 PS40, Line 354: // errors when running the alter table statements. > What happens when it does fail? Do you have to restart the coordinator? Is If it fails, the coordinator exits with an error via the ABORT_IF_ERROR function. An external process will need to restart the coordinator. The potential for failure only exists when upgrading from one version of workload management to another. I will test moving the jitter to the upgrade code path. http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21142/41/be/src/service/workload-management.cc@65 PS41, Line 65: /// > What was the argument for moving these out of ImpalaServer and making them These variables are only used in this file, thus no need to make them available to in the ImpalaServer private scope. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 40 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 29 Aug 2024 19:05:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 42: (13 comments) http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-flags.cc File be/src/service/workload-management-flags.cc: http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-flags.cc@174 PS40, Line 174: workload_mgmt_schema_version, "1.1.0" > Question: if there is a bug exist in a specific version (mistake in collect No, it is not possible to downgrade. Upgrades are one-way only. I'm going to investigate adding a flag to override the latest schema version though specifically for this case where a bug exists in the workload management code. http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@70 PS40, Line 70: > This shouldn't need to be a shared_ptr. It's not held beyond the lifetime o Very good suggestion. I switched the class level internal_server_ variable from a shared_ptr to InternalServer* since ownership is never shared or transferred. http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@103 PS40, Line 103: > Doesn't need to be a shared_ptr. Done http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@152 PS40, Line 152: > Doesn't need to be a shared_ptr. Done http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@354 PS40, Line 354: // concurrently on all coordinators. Running the same create and alter table statements > Could we do all this setup in catalogd instead? Trying to avoid errors by j It's not my favorite approach either. I've looked some into using the catalog, but that route requires implementing a completely new concept in the catalog which I hesitate to do in the catalog. I will look some more into that and will also think more if there is a deterministic way we can determine how long each coordinator sleeps. http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management-init.cc@387 PS40, Line 387: if (auto v = KNOWN_VERSIONS.find(target_schema_version); v == KNOWN_VERSIONS.end()) { > Might be nice to list the known versions. Done http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/21142/40/be/src/service/workload-management.cc@210 PS40, Line 210: } > It's not clear to me if this needs to be held through the whole function. C Good question, the need for this mutex is important for coordinating the shutdown process to ensure all queued completed queries are inserted into the query log table before the coordinator process ends. The ImpalaServer::ShutdownWorkloadManagement function in workload-management.cc uses a condition_variable to block the coordinator shutdown process while the completed queries queue is drained. Additionally, thus mutex ensures no invalid state transitions and no data corruption occurs during the situation where coordinator startup is interrupted by a shutdown. I did some refactoring on the namings to clarify their purposes and also found a bug where the shutdown process was not releasing its lock before notifying the condition_variable. http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/main/java/org/apache/impala/analysis/Expr.java@2041 PS40, Line 2041: recordChildrenInWork > 'isCandidateForQueryLog' might fit better. Agreed this function name is not clear enough. Renamed. http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@380 PS40, Line 380: registerReferencedColumns(); > Why guard it? They're also added to the profile. Good point, I had forgotten about the profile. http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@387 PS40, Line 387: selectList_.getItems().stream().filter(elem -> !elem.isStar()) > Why are column aliases not resolved? Good catch, that is an out-of-date comment and has been removed. http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/test/java/org/apache/impala/planner/ColumnsTest.java File fe/src/test/java/org/apache/impala/planner/ColumnsTest.java: http://gerrit.cloudera.org:8080/#/c/21142/40/fe/src/test/java/org/apache/impala/planner/Colu
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#41) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 40 files changed, 1,958 insertions(+), 243 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/41 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 41 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-13328: Fix missing krb5-config in building impala quickstart client docker image
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21725 ) Change subject: IMPALA-13328: Fix missing krb5-config in building impala_quickstart_client docker image .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieaa9878fa9cd9902ac883866c82e224889940615 Gerrit-Change-Number: 21725 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 28 Aug 2024 18:13:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 ) Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. Patch Set 37: Type in the previous comment: should be "nits" instead of "nots" -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 37 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 17:45:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. Patch Set 38: (5 comments) http://gerrit.cloudera.org:8080/#/c/21142/38/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21142/38/be/src/service/workload-management-init.cc@166 PS38, Line 166: cols_to_add << "ALTER TABLE " << table_name << " ADD IF NOT EXISTS COLUMNS("; : _appendCols(cols_to_add, [current_version, target_version](const FieldDefinition& f){ : return f.schema_version > current_version && f.schema_version <= target_version; }); : cols_to_add << ")"; > This seems OK if version upgrade is only adding new columns. But what if in My original implementation had the upgrade broken out by version. I moved away from that implementation because I prefer this more generic approach that will work for any version upgrade provided each upgrade is only adding columns. I want to keep this code generic for now but, in the future, will break out the upgrades into their own function per version should that be needed. http://gerrit.cloudera.org:8080/#/c/21142/38/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/21142/38/be/src/statestore/statestore.h@242 PS38, Line 242: /// Topic coordinating the initialization of workload management on all coordinators. : static const char* IMPALA_WORKLOAD_MANAGEMENT_TOPIC; > This is unused now? Good catch. Not sure how this snuck back in, most likely due to a conflict when rebasing onto https://gerrit.cloudera.org/#/c/21653/ Removed here and in statestore.cc http://gerrit.cloudera.org:8080/#/c/21142/38/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21142/38/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4662 PS38, Line 4662: sourceExpr instanceof FunctionCallExpr : || sourceExpr instanceof CaseExpr : || sourceExpr instanceof ArithmeticExpr : || sourceExpr instanceof CastExpr > Why this check is limited to these four Expr subclass only? I tested this by adding an else clause and running the entire tpcds suite of queries through. These four were the only Expr subclasses that warranted special handling. The reason these four are called out is they have child slot refs in the form of arguments that need to be included. For example, give the query 'select min(mycol)', min is an instance of the FunctionCallExpr class with child slot refs of 'mycol' which we need to record as being used in the select columns. I like the idea of adding a method that can be called instead of having an arbitrary list here. I have added such a method. http://gerrit.cloudera.org:8080/#/c/21142/38/fe/src/main/java/org/apache/impala/analysis/Path.java File fe/src/main/java/org/apache/impala/analysis/Path.java: http://gerrit.cloudera.org:8080/#/c/21142/38/fe/src/main/java/org/apache/impala/analysis/Path.java@442 PS38, Line 442: public List getFullyQualifiedRawPath(boolean preferAlias) > Please add documentation and describe what preferAlias means. Done http://gerrit.cloudera.org:8080/#/c/21142/38/fe/src/main/java/org/apache/impala/analysis/Path.java@450 PS38, Line 450: rootTable_ instanceof IcebergMetadataTable > Why is this changed from VirtualTable? The ColumnsTest junits revealed that Iceberg delete tables caused an error here because the IcebergDeleteTable class is an instanceof VirtualTable but not IcebergMetadataTable, thus the cast threw an exception. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 38 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 17:49:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#40) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 37 files changed, 1,880 insertions(+), 165 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/40 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 40 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 ) Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. Patch Set 37: (1 comment) For the includes nots, the Impala code standards say to include what you use and not rely on transitive includes: https://google.github.io/styleguide/cppguide.html#Include_What_You_Use http://gerrit.cloudera.org:8080/#/c/21653/37/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21653/37/be/src/service/workload-management-init.cc@29 PS37, Line 29: #include > nit: unused This does get used on a later patch. -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 37 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 17:44:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 ) Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. Patch Set 34: (1 comment) http://gerrit.cloudera.org:8080/#/c/21653/36/be/src/service/workload-management-fields.cc File be/src/service/workload-management-fields.cc: http://gerrit.cloudera.org:8080/#/c/21653/36/be/src/service/workload-management-fields.cc@35 PS36, Line 35: #include "runtime/exec-env.h" > Why was this needed? That is where the "TPrimitiveType" struct is declared. -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 34 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 16:58:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 ) Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. Patch Set 36: Verified-1 -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 36 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 16:55:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 ) Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. Patch Set 36: (1 comment) http://gerrit.cloudera.org:8080/#/c/21653/34/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/21653/34/be/src/service/workload-management.h@81 PS34, Line 81: > Or if we're going to cache it, do the lookup ourselves and save it as a con I like the idea of completely removing the function and leveraging the existing << operator. Going that route. -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 36 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 28 Aug 2024 16:55:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21653 to look at the new patch set (#36). Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. IMPALA-12737: Refactor the Workload Management Initialization Process. The workload management initialization process creates the two tables "sys.impala_query_log" and "sys.impala_query_live" during coordinator startup. The current design for this init process is to create both tables on each coordinator at every startup by running create database and create table if not exists DDLs. This design causes unnecessary DDLs to execute which delays coordinator startup and introduces the potential for unnecessary startup failures should the DDLs fail. This patch splits the initialization code into its own file and adds version tracking to the individual fields in the workload management tables. This patch also adds schema version checks on the workload management tables and only runs DDLs for the db tables if necessary. Additionally, versioning of workload management table schemas is introduced. The only allowed schema version in this patch is 1.0.0. Future patches that need to modify the workload management table schema will expand this list of allowed versions. Since this patch is a refactor and does not change functionality, testing was accomplished by running existing workload management unit and python tests. Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 --- M be/src/service/CMakeLists.txt M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc A be/src/service/workload-management-init.cc A be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/version-util-test.cc A be/src/util/version-util.cc A be/src/util/version-util.h M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py 15 files changed, 797 insertions(+), 259 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/21653/36 -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 36 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Refactor the Workload Management Initialization Process.
Hello Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21653 to look at the new patch set (#35). Change subject: IMPALA-12737: Refactor the Workload Management Initialization Process. .. IMPALA-12737: Refactor the Workload Management Initialization Process. The workload management initialization process creates the two tables "sys.impala_query_log" and "sys.impala_query_live" during coordinator startup. The current design for this init process is to create both tables on each coordinator at every startup by running create database and create table if not exists DDLs. This design causes unnecessary DDLs to execute which delays coordinator startup and introduces the potential for unnecessary startup failures should the DDLs fail. This patch splits the initialization code into its own file and adds version tracking to the individual fields in the workload management tables. This patch also adds schema version checks on the workload management tables and only runs DDLs for the db tables if necessary. Additionally, versioning of workload management table schemas is introduced. The only allowed schema version in this patch is 1.0.0. Future patches that need to modify the workload management table schema will expand this list of allowed versions. Since this patch is a refactor and does not change functionality, testing was accomplished by running existing workload management unit and python tests. Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 --- M be/src/service/CMakeLists.txt M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc A be/src/service/workload-management-init.cc A be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M be/src/util/CMakeLists.txt A be/src/util/version-util-test.cc A be/src/util/version-util.cc A be/src/util/version-util.h M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py 15 files changed, 797 insertions(+), 259 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/21653/35 -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 35 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12737: Query columns in workload management tables.
Jason Fehr has uploaded a new patch set (#39) to the change originally created by Michael Smith. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: Query columns in workload management tables. .. IMPALA-12737: Query columns in workload management tables. Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate Columns", and "OrderBy Columns" to the query profile and the workload management active and completed queries tables. These fields are presented as comma separate strings containing the fully qualified column name in the format database.table_name.column_name. Aggregate columns include all columns in the order by and having clauses. Since new columns are being added, the workload management init process is also being modified to allow for one-way upgrades of the table schemas if necessary. Testing was accomplished by adding/updating workload and custom cluster tests to assert the new columns and the workload management upgrade process. JUnit tests were also added to verify the new workload management columns are being correctly parsed. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management-fields.cc M be/src/service/workload-management-flags.cc M be/src/service/workload-management-init.cc M be/src/service/workload-management-test.cc M be/src/service/workload-management.cc M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/test/java/org/apache/impala/planner/ColumnsTest.java M testdata/workloads/functional-query/queries/QueryTest/workload-management-live.test M testdata/workloads/functional-query/queries/QueryTest/workload-management-log.test M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/common/impala_test_suite.py M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py A tests/custom_cluster/test_workload_mgmt_init.py A tests/custom_cluster/test_workload_mgmt_sql_details.py M tests/query_test/test_observability.py M tests/util/workload_management.py 37 files changed, 1,835 insertions(+), 118 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/39 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 39 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto