[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15496/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 05:23:46 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21143 Change subject: [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values .. [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a hidden boolean config is_return_empty_partition_values to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 7 files changed, 60 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/1 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI
k.venureddy2...@gmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/20986 ) Change subject: IMPALA-12782: Show info of the event processing in /events webUI .. Patch Set 6: Code-Review+1 (1 comment) Looks good to me. Just have one comment. http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192 PS3, Line 1192: NotificationEvent e = events.get(events.size() - 1); > Nice catch! Sorry i didn't realise that resetProgress() was called in finally. It is not required to call resetProgress() here. -- To view, visit http://gerrit.cloudera.org:8080/20986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c Gerrit-Change-Number: 20986 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 04:57:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10374/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 13 Mar 2024 03:09:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15495/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 13 Mar 2024 03:05:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 3: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java: http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36 PS1, Line 36: // Max number of rows processed across all instances of a plan node. : private long maxRowsProcessed_ = 0; : : // Max number of rows processed per backend impala daemon for a plan node. : private long maxRowsProcessedPerNode_ > fixed Done http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49 PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator. : if (fragment == null) { : numNodes = ((DataSourceScanNode)caller).getNumNodes(); > fixed as suggested Done http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py File tests/custom_cluster/test_ext_data_sources.py: http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29 PS2, Line 29: : : class TestExtDataSources(CustomClusterTestSuite): > Fixed as suggested. Thanks Done -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 13 Mar 2024 02:55:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809 PS1, Line 809: Exception > nit: I'd suggest to narrow it down to subclass of Exception that is > actually thrown, but I guess this is fine too to catch all. I will keep the code to catch all exception. http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py File tests/custom_cluster/test_ext_data_sources.py: http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29 PS2, Line 29: : : class TestExtDataSources(CustomClusterTestSuite): > You can add_test_dimension to apply exec_single_node_rows_threshold = 100 f Fixed as suggested. Thanks -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 13 Mar 2024 02:42:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. IMPALA-12896: Avoid JDBC table to be set as transactional table In some deployment environment, JDBC tables are set as transactional tables by default. This causes catalogd failed to load the metadata for JDBC tables. This patch explicitly add table properties with "transactional=false" for JDBC table to avoid the JDBC to be set as transactional table. The operations on JDBC table are processed only on coordinator. The processed rows should be estimated as 0 for DataSourceScanNode by planner so that coordinator-only query plans are generated for simple queries on JDBC tables and queries could be executed without invoking executor nodes. Also adds Preconditions.check to make sure numNodes equals 1 for DataSourceScanNode. Updates FileSystemUtil.copyFileFromUriToLocal() function to write log message for all types of exceptions. Testing: - Fixed planer tests for data source tables. - Ran end-to-end tests of JDBC tables with query option 'exec_single_node_rows_threshold' as default value 100. - Passed core-tests. Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 --- M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test M tests/custom_cluster/test_ext_data_sources.py M tests/query_test/test_ext_data_sources.py 7 files changed, 42 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/3 -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-12264: Add limit on number of HS2 sessions per user.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/21128 ) Change subject: IMPALA-12264: Add limit on number of HS2 sessions per user. .. Patch Set 1: CDPD-58529 / IMPALA-12264 requests that we should “prevent a rogue application from repeatedly connecting to and monopolizing Impala”, and suggests “we can add configurations to limit concurrent connections”. The structure of thrift is such that it is hard to limit actual connections. For now the plan is to limit the number of HS2 sessions per user. This is a fairly straightforward change that I believe implements the main idea which is to prevent rogue applications as the jira says. As with some other features (e.g. -disconnected_session_timeout) this is just implemented for HiveServer2. As Beeswax is now deprecated it is suggested that customers disable beeswax access for clients by setting the --beeswax_port (Impala Daemon Beeswax Port) flag to 0. -- To view, visit http://gerrit.cloudera.org:8080/21128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd28edc352102d89774f6ece5376e7c79ae41aa8 Gerrit-Change-Number: 21128 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 13 Mar 2024 01:10:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15494/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:52:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785 PS4, Line 1785: skipFileMetadata > nit: "skipFileMetadataReload" sounds better Done http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792 PS4, Line 1792: if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd( { : skipFileMetadata = true; : } else { : skipFileMetadata = false; : } > This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of w I'm reverting back to patchset 2's logic of skipping file metadata. Reason being, it is not common to change two different things in the same alter table query. I have tried different combinations of tblproperties/Sd properties/schema change and owner change in the same alter table query and I cannot seem to find the right logic that always works. So I think users would have to set file_metadata_reload="" if two options are changing in single alter table statement. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860 PS4, Line 1860: file metadata reload can be skipped > Shouldn't this be "file metadata should be reloaded"? Sorry my bad. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873 PS4, Line 1873: table schema reload can be skipped > nit: "file metadata should be reloaded" Ack http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035 PS4, Line 3035: > Why do we remove existing tests in testAlterTableNoFileMetadataReload()? Good catch. This is not intentional. I'll revert it back. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:29:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#6). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 149 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/6 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20986 ) Change subject: IMPALA-12782: Show info of the event processing in /events webUI .. Patch Set 6: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1234 PS6, Line 1234: logEventMetrics(eventProcessingTime, elapsedNs); Nit: we could probably remove L#1195 and L#1226 and add that here. http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl File www/events.tmpl: http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@24 PS6, Line 24: {{event_processor_error_msg}} nit: will we have event-id and decompressed event message in the event message. http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@43 PS6, Line 43: Latest Event nit: should we say "Latest Event in Metastore"? -- To view, visit http://gerrit.cloudera.org:8080/20986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c Gerrit-Change-Number: 20986 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:27:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20886 ) Change subject: IMPALA-12626: Capture tables in query for log .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15492/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 14 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 13 Mar 2024 00:13:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: List key columns in query history
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 ) Change subject: IMPALA-12737: List key columns in query history .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15493/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 13 Mar 2024 00:13:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12737: List key columns in query history
Michael Smith has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21142 Change subject: IMPALA-12737: List key columns in query history .. IMPALA-12737: List key columns in query history Adds a new column - key_columns - to impala_query_log and impala_query_live collecting columns that are important for estimating cardinality of the query. These include columns referenced in 'where', 'group by', and 'having' clauses of a 'select' query; and the 'where' clause of an 'update' or 'delete' statement. Column names are presented in a comma-separated list of fully qualified paths. Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/service/Frontend.java 14 files changed, 79 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/1 -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Smith
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20886 to look at the new patch set (#14). Change subject: IMPALA-12626: Capture tables in query for log .. IMPALA-12626: Capture tables in query for log Currently requires 'drop table sys.impala_query_log' to recreate it with the new column. Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_query_live.py M tests/util/workload_management.py 10 files changed, 55 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/14 -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 14 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 23:35:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15491/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 12 Mar 2024 23:23:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809 PS1, Line 809: Exception > IOException is subclass of Exception. We found the exception was not catche nit: I'd suggest to narrow it down to subclass of Exception that is actually thrown, but I guess this is fine too to catch all. http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py File tests/custom_cluster/test_ext_data_sources.py: http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29 PS2, Line 29: : class TestExtDataSources(CustomClusterTestSuite): : """Impala query tests for external data sources.""" You can add_test_dimension to apply exec_single_node_rows_threshold = 100 for all tests here. from tests.common.test_dimensions import create_exec_option_dimension ... @classmethod def add_test_dimensions(cls): super(TestExtDataSources, cls).add_test_dimensions() cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension( exec_single_node_option=[100])) Similarly for TestImpalaExtJdbcTables. L52 and L254 then can be removed. -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 12 Mar 2024 23:12:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table
Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/21141 ) Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table .. IMPALA-12896: Avoid JDBC table to be set as transactional table In some deployment environment, JDBC tables are set as transactional tables by default. This causes catalogd failed to load the metadata for JDBC tables. This patch explicitly add table properties with "transactional=false" for JDBC table to avoid the JDBC to be set as transactional table. The operations on JDBC table are processed only on coordinator. The processed rows should be estimated as 0 for DataSourceScanNode by planner so that coordinator-only query plans are generated for simple queries on JDBC tables and queries could be executed without invoking executor nodes. Also adds Preconditions.check to make sure numNodes equals 1 for DataSourceScanNode. Updates FileSystemUtil.copyFileFromUriToLocal() function to write log message for all types of exceptions. Testing: - Fixed planer tests for data source tables. - Passed core-tests. Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 --- M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test M tests/custom_cluster/test_ext_data_sources.py M tests/query_test/test_ext_data_sources.py 7 files changed, 33 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/2 -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-12426: Query History Table
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 43: (1 comment) http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@212 PS43, Line 212: /// completed queries table. See the initialization code in workload-management.cc for nit: we're already in workload-management.cc, might make sense to reference the function name instead. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 43 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 20:48:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 43: (7 comments) http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h@1285 PS43, Line 1285: typedef std::pair, uint8_t> completed_query_entry; nit: can this be a struct instead of pair? http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@79 PS41, Line 79: return false; : } : : if (val.find_first_of('"') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain double quotes."; : return false; : } : : if (val.find_first_of('\n') != string::npos) { : LOG(ERROR) << "Invalid value for --" << name << ": must not contain newlines."; : return false; : } : : return true; : }); : > This parameter gets passed directly to the "LOCATION" portion of the create Done http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@479 PS41, Line 479: : DCHECK(ImpaladMetrics::COMPLETED_QUERIES_QUEUED->GetValue() >= > Yes, each completed query is tried three times to be inserted into the comp Thank you for the explanation. http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@470 PS43, Line 470: iter->second += 1; nit: Add comment mentioning that this is an attempt increment. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@87 PS41, Line 87: == self.MA > There is not a well defined pattern for custom cluster tests. I like this Done http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@168 PS41, Line 168: "--shutdown_grace_period_s=10 " : "--shutdown_deadline_s=60", : catalogd_args="--enable_workload_mgmt", : > It possibly could. The test_query_log_table_ddl test is specifically to te In that case, please give distinctive comment for each of them. http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@835 PS41, Line 835: > I didn't use named arguments anywhere else, thus I don't want to include it Done -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 43 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 20:15:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 43: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15490/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 43 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 19:56:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 42: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15489/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 42 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 19:52:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20770 to look at the new patch set (#43). Change subject: IMPALA-12426: Query History Table .. IMPALA-12426: Query History Table Adds the ability for users to specify that Impala will create and maintain an internal Iceberg table that contains data about all completed queries. This table is automatically created at startup by each coordinator if it does not exist. Then, most completed queries are queued in memory and flushed to the query history table at a set interval (either minutes or number of records). Set, use, and show queries are not written to this table. This commit leverages the InternalServer class to maintain the query history table. Ctest unit tests have been added to assert the various pieces of code. New custom cluster tests have been added to assert the query history table is properly populated with completed queries. Negative testing consists of attempting sql injection attacks and syntactically incorrect queries. Impala built-in string functions benchmarks have been updated to include the new built-in functions. Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/CMakeLists.txt M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/internal-server-test.cc M be/src/service/internal-server.cc M be/src/service/internal-server.h M be/src/service/query-state-record.cc A be/src/service/workload-management.cc A be/src/service/workload-management.h M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h A be/src/util/sql-util-test.cc A be/src/util/sql-util.cc A be/src/util/sql-util.h M common/thrift/BackendGflags.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java M tests/authorization/test_provider.py M tests/beeswax/impala_beeswax.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_connection.py A tests/custom_cluster/test_query_log.py A tests/util/assert_time.py A tests/util/memory.py A tests/util/retry.py A tests/util/workload_management.py 33 files changed, 3,578 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/43 -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 43 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12426: Query History Table
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 42: (34 comments) http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc File be/src/service/query-state-record.cc: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc@208 PS39, Line 208: base_state->num_rows_fetched = exec_state.num_rows_fetched_counter(); > Nice, I'd noticed base could be null and fixed it in my patch. I only fixed this because I saw in your patch that you found and fixed it there! http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@40 PS41, Line 40: > Can this renamed with something more explanatory? Perhaps "management" or " Yup, renamed to "workload_management" http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@102 PS41, Line 102: the comple > Looks like most columns are primitive types. So this can be replaced with T Done. http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@248 PS41, Line 248: : // Query Options set by Configuration : FieldDefinition("query_opts_config", TPrimitiveType::STRING, : [](FieldParserContext& ctx){ : const std::string opts_str = DebugQueryOptions(ctx.record->query_options); : ctx.sql << "'" << EscapeSql(opts_str) << "'"; : }), : : // Resource Pool : FieldDefinition("resource_pool", TPrimitiveType::STRING, : [](FieldParserContext& ctx){ : ctx.sql << "'" << EscapeSql(ctx.record->base_state->resource_pool) << "'"; : }), : : // Per-host Memory Estimate : FieldDefinition("per_host_mem_estimate", TPrimitiveType::BIGINT, : [](FieldParserContext& ctx){ : ctx.sql << ctx.record->per_host_mem_estimate; : }), : : // Dedi > This can be feed directly to StringStream sql. This the how generated Types Done. That's much nicer! http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h File be/src/service/workload-management.h: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@42 PS39, Line 42: > This is going to allocate storage every place the header is included. Do we Nope, fixed. I left the constant declaration here but declared it extern. http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@51 PS39, Line 51: /// object that is passed to each parser. > This should still be extern, or it won't use the right value in all cases. Done http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@105 PS39, Line 105: /// `ctx` The field parse context object. > These should be `std::move`'d in to place. Right now it's making 2 copies o Done http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc@61 PS36, Line 61: > Decided this didn't need to be configurable? Yeah, after reviewing your active queries patch where the database was hardcoded to "sys", I decided to make the completed queries also be hardcoded to "sys" to match. http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@206 PS39, Line 206: namespace workload_management { > This is still evaluated during static initialization, before query_log_writ Done http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@354 PS39, Line 354: completed_queries_.emplace_back(make_pair(move(exp_rec), 0)); > Does this actually need to be a global? Looks like it's only used in this f nope, it does not. fixed http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@63 PS41, Line 63: "the query log table where completed queries will be stored."); : : DEFINE_validator(query_log_table_name, [](const char* name, const string& val) { : if (regex_match(val, alphanum_underscore_dash)) return true; : : LOG(ERROR) << "Invalid value for --" << name << ": must only contain
[Impala-ASF-CR] IMPALA-12426: Query History Table
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20770 to look at the new patch set (#42). Change subject: IMPALA-12426: Query History Table .. IMPALA-12426: Query History Table Adds the ability for users to specify that Impala will create and maintain an internal Iceberg table that contains data about all completed queries. This table is automatically created at startup by each coordinator if it does not exist. Then, most completed queries are queued in memory and flushed to the query history table at a set interval (either minutes or number of records). Set, use, and show queries are not written to this table. This commit leverages the InternalServer class to maintain the query history table. Ctest unit tests have been added to assert the various pieces of code. New custom cluster tests have been added to assert the query history table is properly populated with completed queries. Negative testing consists of attempting sql injection attacks and syntactically incorrect queries. Impala built-in string functions benchmarks have been updated to include the new built-in functions. Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/CMakeLists.txt M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/internal-server-test.cc M be/src/service/internal-server.cc M be/src/service/internal-server.h M be/src/service/query-state-record.cc A be/src/service/workload-management.cc A be/src/service/workload-management.h M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h A be/src/util/sql-util-test.cc A be/src/util/sql-util.cc A be/src/util/sql-util.h M be/src/util/thread.h M common/thrift/BackendGflags.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java M tests/authorization/test_provider.py M tests/beeswax/impala_beeswax.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_connection.py A tests/custom_cluster/test_query_log.py A tests/util/assert_time.py A tests/util/memory.py A tests/util/retry.py A tests/util/workload_management.py 34 files changed, 3,579 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/42 -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 42 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[native-toolchain-CR] IMPALA-12807: Add mold linker
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21043 ) Change subject: IMPALA-12807: Add mold linker .. IMPALA-12807: Add mold linker This adds the mold linker to the toolchain. Mold is a new linker that claims to be significantly faster than gold/lld. Even though it won't be linked into the Impala executable, it uses C++20 and needs to be build with the toolchain GCC 10+ compiler. Testing: - Ran a local build and used mold to build Impala Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf Reviewed-on: http://gerrit.cloudera.org:8080/21043 Reviewed-by: Michael Smith Tested-by: Joe McDonnell --- M buildall.sh A source/mold/build.sh 2 files changed, 44 insertions(+), 0 deletions(-) Approvals: Michael Smith: Looks good to me, approved Joe McDonnell: Verified -- To view, visit http://gerrit.cloudera.org:8080/21043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf Gerrit-Change-Number: 21043 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith
[native-toolchain-CR] IMPALA-12807: Add mold linker
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21043 ) Change subject: IMPALA-12807: Add mold linker .. Patch Set 2: Verified+1 Thanks for the review, going ahead with this (builds pass on x86_64 and ARM) -- To view, visit http://gerrit.cloudera.org:8080/21043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf Gerrit-Change-Number: 21043 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 19:28:35 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809 PS1, Line 809: Exception Why this is widen from only catching IOException to all Exception? http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java: http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36 PS1, Line 36: // Max number of rows processed across all instances of a plan node. : private long maxRowsProcessed_ ; : : // Max number of rows processed per backend impala daemon for a plan node. : private long maxRowsProcessedPerNode_; This a good chance to explicitly init these to 0. http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49 PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator. : maxRowsProcessed_ = Math.max(maxRowsProcessed_, 0); : maxRowsProcessedPerNode_ = Math.max(maxRowsProcessedPerNode_, 0); Is this basically intended to be a no-op? If yes, probably better to leave it empty and just add Preconditions that numNodes == 1. Just like the very last else branch, this should check if caller is valid and return early when it is invalid. if ((in == -1) || (out == -1)) { valid_ = false; return; } At the end of this method, you can add instead add Preconditions that verify both maxRowsProcessed_ and maxRowsProcessedPerNode_ are always >= 0 after each visit. http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py File tests/custom_cluster/test_ext_data_sources.py: http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@51 PS1, Line 51: vector.get_value('exec_option') Use deepcopy http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@254 PS1, Line 254: vector.get_value('exec_option') Use deepcopy -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 19:25:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21141 ) Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15488/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 19:23:12 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table
Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21141 Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional table .. WIP IMPALA-12896: Avoid JDBC table to be set as transactional table In some deployment environment, JDBC tables are set as transactional tables by default. This causes catalogd failed to load the metadata for JDBC tables. This patch explicitly add table properties with "transactional=false" for JDBC table to avoid the JDBC to be set as transactional table. The operations on JDBC table are processed only on coordinator. The processed rows should be estimated as 0 for DataSourceScanNode by planner so that coordinator-only query plans are generated for simple queries on JDBC tables and queries could be executed without invoking executor nodes. Updates FileSystemUtil.copyFileFromUriToLocal() function to write log message for all types of exceptions. Testing: - TODO: update planer tests for data source tables. - TODO: pass core -tests. Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 --- M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java M tests/custom_cluster/test_ext_data_sources.py M tests/query_test/test_ext_data_sources.py 5 files changed, 22 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/1 -- To view, visit http://gerrit.cloudera.org:8080/21141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6 Gerrit-Change-Number: 21141 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10373/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 18:57:52 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/2 ) Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS .. IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS The TPC-H and TPC-DS builds currently do not respect the CFLAGS environment variable, so they don't incorporate the values that we set in init-compiler.sh. This modifies the build scripts for TPC-H and TPC-DS to patch their makefiles to add our CFLAGS. This has the side effect of turning on -O3 optimization, resulting in faster binaries used to generate the TPC-H and TPC-DS datasets: TPC-H's dbgen at scale 42: Unoptimized: 4m46.269s Optimized: 3m46.379s TPC-DS's dsdgen at scale 20: Unoptimized: 9m41.441s Optimized: 7m25.017s Testing: - Ran a build and verified that the flags include our CFLAGS value Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Reviewed-on: http://gerrit.cloudera.org:8080/2 Reviewed-by: Michael Smith Tested-by: Joe McDonnell --- M source/tpc-ds/build.sh M source/tpc-h/build.sh 2 files changed, 14 insertions(+), 0 deletions(-) Approvals: Michael Smith: Looks good to me, approved Joe McDonnell: Verified -- To view, visit http://gerrit.cloudera.org:8080/2 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Gerrit-Change-Number: 2 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith
[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/2 ) Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS .. Patch Set 2: Verified+1 Thanks for the review, going ahead with this. -- To view, visit http://gerrit.cloudera.org:8080/2 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Gerrit-Change-Number: 2 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 18:42:28 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/2 ) Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh File source/tpc-ds/build.sh: http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh@38 PS1, Line 38: sed -i -r "s/^(LINUX_CFLAGS\s*=.*)/\1 ${CFLAGS}/" makefile > Yes, I think this is going to accumulate if we run it multiple times. It's Ack http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh File source/tpc-h/build.sh: http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh@50 PS1, Line 50: # The build makefile doesn't respect the CFLAGS environment variable directly, > Synced these up Done -- To view, visit http://gerrit.cloudera.org:8080/2 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Gerrit-Change-Number: 2 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 18:41:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc File be/src/util/coding-util.cc: http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@84 PS1, Line 84: // Encode forward slashes to prevent misinterpretation in paths or URLs > Yes, that's correct. The UrlEncode function in the provided code is designe As Daniel said, please add a clearer description of the problem and solution to the commit message. What you just said here is a good basis for that. -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 12 Mar 2024 18:40:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 5: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc File be/src/util/cache/cache-test.cc: http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc@180 PS3, Line 180: // Updating the charge for the cache entry evicts something > Oops, fixed Ack -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 18:38:04 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-12807: Add mold linker
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21043 ) Change subject: IMPALA-12807: Add mold linker .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf Gerrit-Change-Number: 21043 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 18:30:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15487/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 18:12:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15486/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 18:08:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Hello Kurt Deschler, Yida Wu, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21122 to look at the new patch set (#5). Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. IMPALA-12883: Support updating the charge on an entry in the cache The cache implementation currently assumes that a cache entry's charge will remain constant over time and does not support updating the charge. This works well for existing cache users like the data cache and codegen cache. However, it doesn't work as well for use cases where the final size is not known up front. Being able to update the charge also allows for avoiding concurrency issues where two different threads are trying to insert the same entry. This adds the ability to update an entry's charge after it has been inserted into the cache. This can trigger evictions if the size increases. This also adds a way to retrieve the maximum charge supported by the cache. This allows a cache user to tune its behavior to avoid generating entries that would exceed the maximum charge. Testing: - Added tests cases to the caching backend tests Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 --- M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc M be/src/util/cache/cache-test.h M be/src/util/cache/cache.h M be/src/util/cache/lirs-cache-test.cc M be/src/util/cache/lirs-cache.cc M be/src/util/cache/rl-cache-test.cc M be/src/util/cache/rl-cache.cc 8 files changed, 439 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/21122/5 -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc@865 PS4, Line 865: HandleBase* LIRSCacheShard::Allocate(Slice key, uint32_t hash, int val_len, size_t charge) { > line too long (92 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 17:47:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Hello Kurt Deschler, Yida Wu, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21122 to look at the new patch set (#4). Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. IMPALA-12883: Support updating the charge on an entry in the cache The cache implementation currently assumes that a cache entry's charge will remain constant over time and does not support updating the charge. This works well for existing cache users like the data cache and codegen cache. However, it doesn't work as well for use cases where the final size is not known up front. Being able to update the charge also allows for avoiding concurrency issues where two different threads are trying to insert the same entry. This adds the ability to update an entry's charge after it has been inserted into the cache. This can trigger evictions if the size increases. This also adds a way to retrieve the maximum charge supported by the cache. This allows a cache user to tune its behavior to avoid generating entries that would exceed the maximum charge. Testing: - Added tests cases to the caching backend tests Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 --- M be/src/util/cache/cache-internal.h M be/src/util/cache/cache-test.cc M be/src/util/cache/cache-test.h M be/src/util/cache/cache.h M be/src/util/cache/lirs-cache-test.cc M be/src/util/cache/lirs-cache.cc M be/src/util/cache/rl-cache-test.cc M be/src/util/cache/rl-cache.cc 8 files changed, 438 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/21122/4 -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 3: (2 comments) I did one other change in the new upload: charge becomes a size_t in all interfaces (which means kAutomaticCharge is a size_t and set to the max val). That avoids issues about charge being negative, etc. http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc File be/src/util/cache/cache-test.cc: http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc@180 PS3, Line 180: // Updating the charge for the erased element evicts something > Why do you call h1 the erased element? Oops, fixed http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/lirs-cache-test.cc File be/src/util/cache/lirs-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/lirs-cache-test.cc@290 PS3, Line 290: cache_->UpdateCharge(handle, cache_->MaxCharge()); > Do you want to validate the handle's charge doesn't change too? Added the ability to get the charge for a handle and checked that it doesn't change. -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 17:44:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21122 ) Change subject: IMPALA-12883: Support updating the charge on an entry in the cache .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc File be/src/util/cache/lirs-cache.cc: http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc@865 PS4, Line 865: HandleBase* LIRSCacheShard::Allocate(Slice key, uint32_t hash, int val_len, size_t charge) { line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/21122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025 Gerrit-Change-Number: 21122 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 17:45:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21139 ) Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables .. Patch Set 1: Code-Review+1 (2 comments) You mentioned offline you're about to upload a test table for this. Otherwise LGTM! http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG@12 PS1, Line 12: login > What does "login" mean here? logic? http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1473 PS1, Line 1473: o nit: missing space -- To view, visit http://gerrit.cloudera.org:8080/21139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237 Gerrit-Change-Number: 21139 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 17:20:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/21035 ) Change subject: IMPALA-12818: Intermediate Result Caching plan node framework .. Patch Set 5: Code-Review+1 (3 comments) Looks good for an initial patch. Could be +2 if no other major issues http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@530 PS3, Line 530: have not been calcula > Refactored this to be a ThriftSerializationCtx that is passed in. This code Done http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1237 PS3, Line 1237: > Reworked this comment to describe how this is a bottom-up tree traversal. Done http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java File fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java: http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java@59 PS3, Line 59: > Changed this to use StringBuilder. I also added an explicit finalize() call Done -- To view, visit http://gerrit.cloudera.org:8080/21035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961 Gerrit-Change-Number: 21035 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Tue, 12 Mar 2024 17:14:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21139 ) Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG@12 PS1, Line 12: login What does "login" mean here? -- To view, visit http://gerrit.cloudera.org:8080/21139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237 Gerrit-Change-Number: 21139 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 12 Mar 2024 17:12:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21139 ) Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15485/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237 Gerrit-Change-Number: 21139 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 12 Mar 2024 17:08:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 3: (9 comments) Thanks, Pranav! http://gerrit.cloudera.org:8080/#/c/21131/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21131/1//COMMIT_MSG@9 PS1, Line 9: In this commit, a comprehensive mapping of special characters, including You should explain what the error in the previous version was and what failures it lead to. http://gerrit.cloudera.org:8080/#/c/21131/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21131/3//COMMIT_MSG@9 PS3, Line 9: In this commit, a comprehensive mapping of special characters, including The commit message should explain what errors the code before the change caused, what was wrong with the code and how it is fixed by this change. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc File be/src/util/coding-util.cc: http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@48 PS3, Line 48: std::unordered_map specialCharacterMap = { Should be const. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@71 PS3, Line 71: (*out).reserve(in_len * 3); // Maximum expansion per character is 3 I'm not sure why we have to reserve this. 'input_str' is modified (grown) in place. At the end of the function we could move 'input_str' into '*out' like this: (*out) = std::move(input_str); You should include for this overload of std::move. Or, alternatively, you could discard 'input_str' completely and start by copying the input into '*out' and use that variable in the string replacements. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@78 PS3, Line 78: for (const auto& entry : specialCharacterMap) { Won't this replace existing % signs twice? First on L75, then in the loop because '%' is present as a key in specialCharacterMap. We could either remove '%' from the map or 1. Replace the map with an ordered std::array>. 2. Put '%' first in that std::array. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@83 PS3, Line 83: if (!hive_compat) { Are these additional encodings the same that were done before this change? It doesn't seem like that. For example, I think "$" was encoded before but not now. If not, won't it cause problems? The commit message should at least mention how the behaviour changes. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@85 PS3, Line 85: boost::replace_all(input_str, "/", "%2F"); Forward slashes have already been encoded because '/' is contained in specialCharacterMap. If it are not to be encoded in Hive-compat mode, it should be removed from the map. http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@90 PS3, Line 90: (*out) This only makes sense after assigning 'input_str' to 'out'. Also, could use out->shrink_to_fit(). http://gerrit.cloudera.org:8080/#/c/21131/3/testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test File testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test: http://gerrit.cloudera.org:8080/#/c/21131/3/testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test@298 PS3, Line 298: my_part_tbl A more descriptive name would be better, for example "unicode_partition_values". -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 12 Mar 2024 16:53:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables
Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21139 Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables .. IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables This is a part 1 change that turns off the count(*) optimisations for V2 tables as there is a correctness issue with it. The reason is that Spark compaction may leave some dangling delete files that messes up the login in Impala. Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237 --- M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test M tests/query_test/test_iceberg.py 6 files changed, 408 insertions(+), 244 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21139/1 -- To view, visit http://gerrit.cloudera.org:8080/21139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237 Gerrit-Change-Number: 21139 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab
[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/2 ) Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh File source/tpc-ds/build.sh: http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh@38 PS1, Line 38: sed -i -r "s/^(LINUX_CFLAGS\s*=.*)/\1 ${CFLAGS}/" makefile > Could this be applied repeatedly on repeated builds? I don't think it would Yes, I think this is going to accumulate if we run it multiple times. It's messy, but I want our flags at the end so they override anything else already specified. Added a comment for this. http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh File source/tpc-h/build.sh: http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh@50 PS1, Line 50: # The build makefile doesn't respect the CFLAGS environment variable directly, > Why does line wrapping differ between this and the other file? Synced these up -- To view, visit http://gerrit.cloudera.org:8080/2 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Gerrit-Change-Number: 2 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 16:32:15 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
Hello Michael Smith, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/2 to look at the new patch set (#2). Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS .. IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS The TPC-H and TPC-DS builds currently do not respect the CFLAGS environment variable, so they don't incorporate the values that we set in init-compiler.sh. This modifies the build scripts for TPC-H and TPC-DS to patch their makefiles to add our CFLAGS. This has the side effect of turning on -O3 optimization, resulting in faster binaries used to generate the TPC-H and TPC-DS datasets: TPC-H's dbgen at scale 42: Unoptimized: 4m46.269s Optimized: 3m46.379s TPC-DS's dsdgen at scale 20: Unoptimized: 9m41.441s Optimized: 7m25.017s Testing: - Ran a build and verified that the flags include our CFLAGS value Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a --- M source/tpc-ds/build.sh M source/tpc-h/build.sh 2 files changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/11/2/2 -- To view, visit http://gerrit.cloudera.org:8080/2 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a Gerrit-Change-Number: 2 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21138 ) Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15484/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e Gerrit-Change-Number: 21138 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 16:12:52 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12807: Add mold linker
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21043 ) Change subject: IMPALA-12807: Add mold linker .. Patch Set 2: Tested this with a toolchain build and with Impala using this upstream change: https://gerrit.cloudera.org/#/c/21121/ -- To view, visit http://gerrit.cloudera.org:8080/21043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf Gerrit-Change-Number: 21043 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 12 Mar 2024 16:02:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21118 ) Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality .. Patch Set 3: Code-Review+1 +1 from my part but I'm not familiar with this part of the code, so someone else should also see it. -- To view, visit http://gerrit.cloudera.org:8080/21118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32 Gerrit-Change-Number: 21118 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 15:52:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21138 ) Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@709 PS1, Line 709: [%s] If 'mustRunOnCoord_' is true, do you think we should mention it here? It may be misleading because a fragment can be coord-only for other reasons as well, but the lack of the info could be taken to mean the fragment is not coord-only. -- To view, visit http://gerrit.cloudera.org:8080/21138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e Gerrit-Change-Number: 21138 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Comment-Date: Tue, 12 Mar 2024 15:49:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors
Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21138 Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors .. IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors On clusters with dedicated coordinators and executors the Iceberg metadata scanner fragment(s) can be scheduled to executors, for example during a join. The fragment in this case will fail a precondition check, because either the 'frontend_' object or the table will not be present. This change forces Iceberg metadata scanner fragments to be scheduled on the coordinator. It is not enough to set the DataPartition type to UNPARTITIONED, because unpartitioned fragments can still be scheduled on executors. This change introduces a new flag in the TPlanFragment thrift struct - if it is true, the fragment is always scheduled on the coordinator. Testing: - Added a regression test in test_coordinators.py. Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e --- M be/src/scheduling/scheduler.cc M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M tests/custom_cluster/test_coordinators.py 5 files changed, 45 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/1 -- To view, visit http://gerrit.cloudera.org:8080/21138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e Gerrit-Change-Number: 21138 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker
[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21138 ) Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py File tests/custom_cluster/test_coordinators.py: http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py@304 PS1, Line 304: ; flake8: E703 statement ends with a semicolon -- To view, visit http://gerrit.cloudera.org:8080/21138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e Gerrit-Change-Number: 21138 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 15:50:24 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-12886: Bump GoogleTest version to 1.14.0
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/21133 ) Change subject: IMPALA-12886: Bump GoogleTest version to 1.14.0 .. Patch Set 1: > Patch Set 1: > > Thanks Laszlo for working on this! > > On my Ubuntu 20.04.6 LTS I had to add -fPIC to CXXFLAGS otherwise I got > strange linker errors. Though I switched to GoogleTest 1.14.0 in a quite > hacky way, so maybe it was just because of that. Did you try this out on > Ubuntu 20? Not yet (although I definitely want to run it through some upstream-like testing), but it's a good point, thanks for calling it out. -- To view, visit http://gerrit.cloudera.org:8080/21133 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa52b4809a7c969e863518b182d2df63256f4c43 Gerrit-Change-Number: 21133 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 14:42:44 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12697: Set FAIL ON PUBLISH to true by default
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21134 ) Change subject: IMPALA-12697: Set FAIL_ON_PUBLISH to true by default .. Patch Set 1: Code-Review+2 LGTM! -- To view, visit http://gerrit.cloudera.org:8080/21134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I156269e8b3b1fa5d743a8ab5a83810001f7dd648 Gerrit-Change-Number: 21134 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 14:37:28 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12886: Bump GoogleTest version to 1.14.0
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21133 ) Change subject: IMPALA-12886: Bump GoogleTest version to 1.14.0 .. Patch Set 1: Thanks Laszlo for working on this! On my Ubuntu 20.04.6 LTS I had to add -fPIC to CXXFLAGS otherwise I got strange linker errors. Though I switched to GoogleTest 1.14.0 in a quite hacky way, so maybe it was just because of that. Did you try this out on Ubuntu 20? -- To view, visit http://gerrit.cloudera.org:8080/21133 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa52b4809a7c969e863518b182d2df63256f4c43 Gerrit-Change-Number: 21133 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Mar 2024 14:36:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21118 ) Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@16 PS2, Line 16: alls > Nit: calls. Done http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@16 PS2, Line 16: that > Nit: that? Done http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@18 PS2, Line 18: evaluated > Nit: is evaluated? Done http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@19 PS2, Line 19: filter > Nit: filters. Done -- To view, visit http://gerrit.cloudera.org:8080/21118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32 Gerrit-Change-Number: 21118 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 12 Mar 2024 13:53:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21118 to look at the new patch set (#3). Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality .. IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality estimation when runtime filter is involved. It calls JoinNode.computeGenericJoinCardinality(). However, if the originating join node has FK-PK conjunct, it should be possible to obtain lower cardinality estimate by calling JoinNode.getFkPkJoinCardinality() instead. This patch adds that analysis and calls JoinNode.getFkPkJoinCardinality() when possible. It is, however, only limited for runtime filter that evaluated on storage layer to avoid severe underestimation. Runtime filters that evaluate at scan node at row level will keep using JoinNode.computeGenericJoinCardinality(). Testing: - Update TpcdsCpuCostPlannerTest. Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32 --- M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q25.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q27.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q42.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q49.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q52.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q53.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q55.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q56.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q60.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q61.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q63.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q66.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q68.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q72.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q73.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q79.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q89.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q91.test 39 files changed, 1,122 insertions(+), 1,095 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21118/3 -- To view, visit http://gerrit.cloudera.org:8080/21118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType:
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21029 ) Change subject: IMPALA-12543: Detect self-events before finishing DDL .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15483/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 16 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 13:10:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21029 ) Change subject: IMPALA-12543: Detect self-events before finishing DDL .. Patch Set 16: (12 comments) http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG@48 PS14, Line 48: Testing: > Currently, we just have flaky tests like test_iceberg_self_events that can Added TestEventProcessingWithImpala. http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1203 PS14, Line 1203: : /** > Let's fix this as well, i.e. only log this when the version is added for th Done http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1054 PS14, Line 1054: CatalogServiceCatalog catalog, Table table, long newVersionNumber) { > 'isInsertEvent' is always used as false. I think we can't track the infligh Done http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1080 PS14, Line 1080: * Cancel the new version number registration out of table's in-flight events. > nit: please add an error message. Done http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1221 PS14, Line 1221: : responseSum > nit: "Update the table object with the new catalog version" Done http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1248 PS14, Line 1248: : case DROP_PARTITION: : TAlterTableDropPartitionParams dropPartPa > nit: this comment is stale Removed. http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1376 PS14, Line 1376: > nit: maybe 'modification.isInProgress()' sounds better Done http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1592 PS14, Line 1592: if (!needsToUpdateHms) { > Can we remove this if the Iceberg operations won't update HMS? It also seem This is still needed, because ALTER_TABLE operation via Iceberg's HiveCatalog can produce event as well. What needsToUpdateHms means is whether caller of alterIcebergTable() need to follow up with separate HMS alter table call that is not through Iceberg library. http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2650 PS14, Line 2650: if (partition.getPartitionStatsCompressed() != null) { > This is unchanged yet. Changed to use InProgressTableModification. http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3239 PS14, Line 3239: try { > This is also unchanged but it's a bit complex - we are not always triggerin Changed all three truncate* function to return InProgressTableModification. This truncateTable() table only need to call markInflightEventRegistrationComplete(). http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5512 PS14, Line 5512: String.format(HMS_RPC_ERROR_FORMAT_STR, "alter_table"), e); > This is also unchanged for RENAME but I think we can't do it before the HMS Add TODO for future investigation. http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7242 PS14, Line 7242: > This is also unchanged for inserting Iceberg tables. Changed to use InProgressTableModification. -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 16 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 12:58:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Hello Quanlong Huang, Jason Fehr, Sai Hemanth Gantasala, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21029 to look at the new patch set (#16). Change subject: IMPALA-12543: Detect self-events before finishing DDL .. IMPALA-12543: Detect self-events before finishing DDL test_iceberg_self_events has been flaky for not having tbls_refreshed_before equal to tbls_refreshed_after in-between query executions. Further investigation reveals concurrency bug due to db/table level lock is not taken during db/table self-events check (IMPALA-12461 part1). The order of ALTER TABLE operation is as follow: 1. alter table starts in CatalogOpExecutor 2. table level lock is taken 3. HMS RPC starts (CatalogOpExecutor.applyAlterTable()) 4. HMS generates the event 5. HMS RPC returns 6. table is reloaded 7. catalog version is added to inflight event list 8. table level lock is releases Meanwhile the event processor thread fetches the new event after 4 and before 7. Because of IMPALA-12461 (part 1), it can also finish self-events checking before reaching 7. Before IMPALA-12461, self-events would have needed to wait for 8. Note that this issue is only relevant for table level events, as self-events checking for partition level events still takes table lock. This patch fix the issue by adding newCatalogVersion to the table's inflight event list before updating HMS. If HMS update does not complete (ie., an exception is thrown), the new newCatalogVersion that was added is then removed. This patch also fix few smaller issues, including: - Avoid incrementing EVENTS_SKIPPED_METRIC if numFilteredEvents == 0 in MetastoreEventFactory.getFilteredEvents(). - Increment EVENTS_SKIPPED_METRIC in MetastoreTableEvent.reloadTableFromCatalog() if table is already in the middle of reloading (revealed through flaky test_skipping_older_events). - Rephrase misleading log message in MetastoreEventProcessor.getNextMetastoreEvents(). Testing: - Add TestEventProcessingWithImpala, run it with debug_action and sync_ddl dimensions. - Pass exhaustive tests. Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M tests/custom_cluster/test_events_custom_configs.py 9 files changed, 757 insertions(+), 510 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/21029/16 -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 16 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21072 ) Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification .. IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification HdfsTable.toMinimalTCatalogObject() is not always invoked with holding the table lock, e.g. in invalidating a table, we could replace an HdfsTable instance with an IncompleteTable instance. We then invoke HdfsTable.toMinimalTCatalogObject() to get the removed catalog object. However, the HdfsTable instance could be modified in the meantime by a concurrent DDL/DML that would reload it, e.g. a REFRESH statement. This causes HdfsTable.toMinimalTCatalogObject() failed by ConcurrentModificationException on the column/partition list. This patch fixes the issue by try acquiring the table read lock in HdfsTable.toMinimalTCatalogObject(). If it fails, the partition ids and names won't be returned. Also updates the method to not collecting the column list since it's unused. Tests - Added e2e test - Ran CORE tests Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Reviewed-on: http://gerrit.cloudera.org:8080/21072 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/custom_cluster/test_concurrent_ddls.py 3 files changed, 70 insertions(+), 15 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/21072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Gerrit-Change-Number: 21072 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21072 ) Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Gerrit-Change-Number: 21072 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 12:06:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785 PS4, Line 1785: skipFileMetadata nit: "skipFileMetadataReload" sounds better http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792 PS4, Line 1792: if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd( { : skipFileMetadata = true; : } else { : skipFileMetadata = false; : } This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of what its previous value is. When the table owner changed and there are only trivial SD props changed, shouldn't we have skipFileMetadata = true? The current code set skipFileMetadata to false finally. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860 PS4, Line 1860: file metadata reload can be skipped Shouldn't this be "file metadata should be reloaded"? http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873 PS4, Line 1873: table schema reload can be skipped nit: "file metadata should be reloaded" http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035 PS4, Line 3035: Why do we remove existing tests in testAlterTableNoFileMetadataReload()? -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 08:09:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21072 ) Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10372/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/21072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Gerrit-Change-Number: 21072 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 07:19:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21072 ) Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification .. Patch Set 4: > Patch Set 4: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10371/ The failure is unrelated: IMPALA-11396 Rerun the job. -- To view, visit http://gerrit.cloudera.org:8080/21072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Gerrit-Change-Number: 21072 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 12 Mar 2024 07:19:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15482/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 12 Mar 2024 06:22:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. Patch Set 3: (4 comments) > Uploaded patch set 3. http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc File be/src/util/coding-util.cc: http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@47 PS1, Line 47: // Mapping of special characters to their URL-encoded forms > With this change it doesn't look like we need this list or ShouldNotEscape Done http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@75 PS1, Line 75: boost::replace_all(input_str, "%", "%25"); > line too long (94 > 90) Done http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@84 PS1, Line 84: // Encode forward slashes to prevent misinterpretation in paths or URLs > So this works because it only replaces specific chars - that won't appear i Yes, that's correct. The UrlEncode function in the provided code is designed to work with UTF-8 encoded strings, where each character can be represented by one to four bytes. The key is that it replaces specific characters with their URL-encoded forms, and these replacements are carefully chosen to avoid interfering with multi-byte characters.The replacements are safe because they don't disrupt the structure of multi-byte characters in UTF-8 encoding. The % sign followed by two hexadecimal digits, used in URL encoding, ensures that the replacements won't be confused with the actual encoding of multi-byte characters. On the other hand, using a generic isalnum check to determine whether a character should be replaced might not be suitable for UTF-8 encoded strings. It could inadvertently affect multi-byte characters, potentially leading to incorrect or corrupted results. http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@95 PS1, Line 95: void UrlEncode(const vector& in, string* out, bool hive_compat) { > We could return unused capacity with https://en.cppreference.com/w/cpp/stri Done -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 12 Mar 2024 05:56:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters
pranav.lo...@cloudera.com has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/21131 ) Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters .. IMPALA-11499: Refactor UrlEncode function to handle special characters In this commit, a comprehensive mapping of special characters, including Unicode, to their respective URL-encoded forms has been introduced. The UrlEncode function has been modified to dynamically iterate through this mapping, providing a more generic and scalable approach to special character encoding. This change also enhances code readability and maintainability by incorporating detailed comments explaining each section of the updated implementation. Testing: Some basic tests are provided in unicode-column-name.test. Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb --- M be/src/util/coding-util.cc M testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test 2 files changed, 78 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/21131/3 -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zihao Ye