[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: [WIP] IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15496/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 05:23:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21143


Change subject: [WIP] IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

[WIP] IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a hidden boolean config is_return_empty_partition_values to
add malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
7 files changed, 60 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/1
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-12 Thread Anonymous Coward (Code Review)
k.venureddy2...@gmail.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 6: Code-Review+1

(1 comment)

Looks good to me. Just have one comment.

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192
PS3, Line 1192: NotificationEvent e = events.get(events.size() - 1);
> Nice catch!
Sorry i didn't realise that resetProgress() was called in finally. It is not 
required to call resetProgress() here.



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 04:57:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10374/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 13 Mar 2024 03:09:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15495/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 13 Mar 2024 03:05:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 3: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36
PS1, Line 36: // Max number of rows processed across all instances of a plan 
node.
:   private long maxRowsProcessed_ = 0;
:
:   // Max number of rows processed per backend impala daemon for a 
plan node.
:   private long maxRowsProcessedPerNode_
> fixed
Done


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49
PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator.
:   if (fragment == null) {
: numNodes = ((DataSourceScanNode)caller).getNumNodes();
> fixed as suggested
Done


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29:
:
: class TestExtDataSources(CustomClusterTestSuite):
> Fixed as suggested. Thanks
Done



--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 13 Mar 2024 02:55:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
> nit: I'd suggest to narrow it down to subclass of Exception that is
 > actually thrown, but I guess this is fine too to catch all.

I will keep the code to catch all exception.


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29:
:
: class TestExtDataSources(CustomClusterTestSuite):
> You can add_test_dimension to apply exec_single_node_rows_threshold = 100 f
Fixed as suggested. Thanks



--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 13 Mar 2024 02:42:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - Fixed planer tests for data source tables.
 - Ran end-to-end tests of JDBC tables with query option
   'exec_single_node_rows_threshold' as default value 100.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 42 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/3
--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12264: Add limit on number of HS2 sessions per user.

2024-03-12 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21128 )

Change subject: IMPALA-12264: Add limit on number of HS2 sessions per user.
..


Patch Set 1:

CDPD-58529 / IMPALA-12264 requests that we should “prevent a rogue application 
from repeatedly connecting to and monopolizing Impala”, and suggests “we can 
add configurations to limit concurrent connections”. The structure of thrift is 
such that it is hard to limit actual connections. For now the plan is to limit 
the number of HS2 sessions per user. This is a fairly straightforward change 
that I believe implements the main idea which is to prevent rogue applications 
as the jira says. As with some other features (e.g. 
-disconnected_session_timeout) this is just implemented for HiveServer2. As 
Beeswax is now deprecated it is suggested that customers disable beeswax access 
 for clients by setting the  --beeswax_port (Impala Daemon Beeswax Port) flag 
to 0.


--
To view, visit http://gerrit.cloudera.org:8080/21128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd28edc352102d89774f6ece5376e7c79ae41aa8
Gerrit-Change-Number: 21128
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 13 Mar 2024 01:10:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15494/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:52:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785
PS4, Line 1785: skipFileMetadata
> nit: "skipFileMetadataReload" sounds better
Done


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792
PS4, Line 1792:   if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), 
afterTable.getSd( {
  : skipFileMetadata = true;
  :   } else {
  : skipFileMetadata = false;
  :   }
> This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of w
I'm reverting back to patchset 2's logic of skipping file metadata. 
Reason being, it is not common to change two different things in the same alter 
table query. I have tried different combinations of tblproperties/Sd 
properties/schema change and owner change in the same alter table query and I 
cannot seem to find the right logic that always works.
So I think users would have to set file_metadata_reload="" if two options are 
changing in single alter table statement.


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860
PS4, Line 1860: file metadata reload can be skipped
> Shouldn't this be "file metadata should be reloaded"?
Sorry my bad.


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873
PS4, Line 1873: table schema reload can be skipped
> nit: "file metadata should be reloaded"
Ack


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035
PS4, Line 3035:
> Why do we remove existing tests in testAlterTableNoFileMetadataReload()?
Good catch. This is not intentional. I'll revert it back.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:29:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#6).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories=true. The file metadata reload can be
skipped for all other changes in SD. Also introduced a small
optimization to skip reloading of table schema when ALTER_TABLE changes
location, rowformat, fileformat, and serde in the Storage Descriptor.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 149 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/6
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 6: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1234
PS6, Line 1234:   logEventMetrics(eventProcessingTime, elapsedNs);
Nit: we could probably remove L#1195 and L#1226 and add that here.


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl
File www/events.tmpl:

http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@24
PS6, Line 24: {{event_processor_error_msg}}
nit: will we have event-id and decompressed event message in the event message.


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@43
PS6, Line 43: Latest Event
nit: should we say "Latest Event in Metastore"?



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:27:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Capture tables in query for log
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15492/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 14
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:13:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21142 )

Change subject: IMPALA-12737: List key columns in query history
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15493/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:13:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21142


Change subject: IMPALA-12737: List key columns in query history
..

IMPALA-12737: List key columns in query history

Adds a new column - key_columns - to impala_query_log and
impala_query_live collecting columns that are important for estimating
cardinality of the query. These include columns referenced in
'where', 'group by', and 'having' clauses of a 'select' query; and the
'where' clause of an 'update' or 'delete' statement.

Column names are presented in a comma-separated list of fully qualified
paths.

Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
14 files changed, 79 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/1
--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-12 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#14).

Change subject: IMPALA-12626: Capture tables in query for log
..

IMPALA-12626: Capture tables in query for log

Currently requires 'drop table sys.impala_query_log' to recreate it with
the new column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/util/workload_management.py
10 files changed, 55 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/14
--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 14
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 23:35:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15491/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 12 Mar 2024 23:23:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
> IOException is subclass of Exception. We found the exception was not catche
nit: I'd suggest to narrow it down to subclass of Exception that is actually 
thrown, but I guess this is fine too to catch all.


http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/2/tests/custom_cluster/test_ext_data_sources.py@29
PS2, Line 29:
: class TestExtDataSources(CustomClusterTestSuite):
:   """Impala query tests for external data sources."""
You can add_test_dimension to apply exec_single_node_rows_threshold = 100 for 
all tests here.

from tests.common.test_dimensions import create_exec_option_dimension

...

  @classmethod
  def add_test_dimensions(cls):
super(TestExtDataSources, cls).add_test_dimensions()
cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension(
  exec_single_node_option=[100]))

Similarly for TestImpalaExtJdbcTables.
L52 and L254 then can be removed.



--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 12 Mar 2024 23:12:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: IMPALA-12896: Avoid JDBC table to be set as transactional table
..

IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes. Also adds Preconditions.check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - Fixed planer tests for data source tables.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
7 files changed, 33 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/2
--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 43:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@212
PS43, Line 212: /// completed queries table. See the initialization code in 
workload-management.cc for
nit: we're already in workload-management.cc, might make sense to reference the 
function name instead.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 43
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 20:48:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 43:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/impala-server.h@1285
PS43, Line 1285: typedef std::pair, 
uint8_t> completed_query_entry;
nit: can this be a struct instead of pair?


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@79
PS41, Line 79: return false;
 :   }
 :
 :   if (val.find_first_of('"') != string::npos) {
 : LOG(ERROR) << "Invalid value for --" << name << ": must not 
contain double quotes.";
 : return false;
 :   }
 :
 :   if (val.find_first_of('\n') != string::npos) {
 : LOG(ERROR) << "Invalid value for --" << name << ": must not 
contain newlines.";
 : return false;
 :   }
 :
 :   return true;
 : });
 :
> This parameter gets passed directly to the "LOCATION" portion of the create
Done


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@479
PS41, Line 479:
  :   
DCHECK(ImpaladMetrics::COMPLETED_QUERIES_QUEUED->GetValue() >=
> Yes, each completed query is tried three times to be inserted into the comp
Thank you for the explanation.


http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/43/be/src/service/workload-management.cc@470
PS43, Line 470: iter->second += 1;
nit: Add comment mentioning that this is an attempt increment.


http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@87
PS41, Line 87: == self.MA
> There is not a well defined pattern for custom cluster tests.  I like this
Done


http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@168
PS41, Line 168:
"--shutdown_grace_period_s=10 "
  :  
"--shutdown_deadline_s=60",
  : 
catalogd_args="--enable_workload_mgmt",
  :
> It possibly could.  The test_query_log_table_ddl test is specifically to te
In that case, please give distinctive comment for each of them.


http://gerrit.cloudera.org:8080/#/c/20770/41/tests/custom_cluster/test_query_log.py@835
PS41, Line 835:
> I didn't use named arguments anywhere else, thus I don't want to include it
Done



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 43
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 20:15:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 43:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15490/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 43
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:56:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 42:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15489/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 42
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:52:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#43).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
33 files changed, 3,578 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/43
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 43
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 42:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc
File be/src/service/query-state-record.cc:

http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/query-state-record.cc@208
PS39, Line 208:   base_state->num_rows_fetched = 
exec_state.num_rows_fetched_counter();
> Nice, I'd noticed base could be null and fixed it in my patch.
I only fixed this because I saw in your patch that you found and fixed it there!


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@40
PS41, Line 40:
> Can this renamed with something more explanatory? Perhaps "management" or "
Yup, renamed to "workload_management"


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@102
PS41, Line 102:  the comple
> Looks like most columns are primitive types. So this can be replaced with T
Done.


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.h@248
PS41, Line 248:
  : // Query Options set by Configuration
  : FieldDefinition("query_opts_config", TPrimitiveType::STRING,
  : [](FieldParserContext& ctx){
  :   const std::string opts_str = 
DebugQueryOptions(ctx.record->query_options);
  :   ctx.sql << "'" << EscapeSql(opts_str) << "'";
  : }),
  :
  : // Resource Pool
  : FieldDefinition("resource_pool", TPrimitiveType::STRING,
  : [](FieldParserContext& ctx){
  :   ctx.sql << "'" << 
EscapeSql(ctx.record->base_state->resource_pool) << "'";
  : }),
  :
  : // Per-host Memory Estimate
  : FieldDefinition("per_host_mem_estimate", 
TPrimitiveType::BIGINT,
  : [](FieldParserContext& ctx){
  :   ctx.sql << ctx.record->per_host_mem_estimate;
  : }),
  :
  : // Dedi
> This can be feed directly to StringStream sql. This the how generated Types
Done.  That's much nicer!


http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@42
PS39, Line 42:
> This is going to allocate storage every place the header is included. Do we
Nope, fixed.  I left the constant declaration here but declared it extern.


http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@51
PS39, Line 51: /// object that is passed to each parser.
> This should still be extern, or it won't use the right value in all cases.
Done


http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.h@105
PS39, Line 105: ///   `ctx`  The field parse context object.
> These should be `std::move`'d in to place. Right now it's making 2 copies o
Done


http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/36/be/src/service/workload-management.cc@61
PS36, Line 61:
> Decided this didn't need to be configurable?
Yeah, after reviewing your active queries patch where the database was 
hardcoded to "sys", I decided to make the completed queries also be hardcoded 
to "sys" to match.


http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@206
PS39, Line 206: namespace workload_management {
> This is still evaluated during static initialization, before query_log_writ
Done


http://gerrit.cloudera.org:8080/#/c/20770/39/be/src/service/workload-management.cc@354
PS39, Line 354: completed_queries_.emplace_back(make_pair(move(exp_rec), 
0));
> Does this actually need to be a global? Looks like it's only used in this f
nope, it does not.  fixed


http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/41/be/src/service/workload-management.cc@63
PS41, Line 63:   "the query log table where completed queries will be stored.");
 :
 : DEFINE_validator(query_log_table_name, [](const char* name, 
const string& val) {
 :   if (regex_match(val, alphanum_underscore_dash)) return true;
 :
 :   LOG(ERROR) << "Invalid value for --" << name << ": must only 
contain 

[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-12 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#42).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M be/src/util/thread.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
34 files changed, 3,579 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/42
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 42
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[native-toolchain-CR] IMPALA-12807: Add mold linker

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21043 )

Change subject: IMPALA-12807: Add mold linker
..

IMPALA-12807: Add mold linker

This adds the mold linker to the toolchain. Mold is a new
linker that claims to be significantly faster than gold/lld.
Even though it won't be linked into the Impala executable, it
uses C++20 and needs to be build with the toolchain GCC 10+
compiler.

Testing:
 - Ran a local build and used mold to build Impala

Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf
Reviewed-on: http://gerrit.cloudera.org:8080/21043
Reviewed-by: Michael Smith 
Tested-by: Joe McDonnell 
---
M buildall.sh
A source/mold/build.sh
2 files changed, 44 insertions(+), 0 deletions(-)

Approvals:
  Michael Smith: Looks good to me, approved
  Joe McDonnell: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf
Gerrit-Change-Number: 21043
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-12807: Add mold linker

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21043 )

Change subject: IMPALA-12807: Add mold linker
..


Patch Set 2: Verified+1

Thanks for the review, going ahead with this (builds pass on x86_64 and ARM)


--
To view, visit http://gerrit.cloudera.org:8080/21043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf
Gerrit-Change-Number: 21043
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:28:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional 
table
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@809
PS1, Line 809: Exception
Why this is widen from only catching IOException to all Exception?


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@36
PS1, Line 36: // Max number of rows processed across all instances of a plan 
node.
:   private long maxRowsProcessed_ ;
:
:   // Max number of rows processed per backend impala daemon for a 
plan node.
:   private long maxRowsProcessedPerNode_;
This a good chance to explicitly init these to 0.


http://gerrit.cloudera.org:8080/#/c/21141/1/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@49
PS1, Line 49: // Operations on DataSourceScanNode are processed on coordinator.
:   maxRowsProcessed_ = Math.max(maxRowsProcessed_, 0);
:   maxRowsProcessedPerNode_ = 
Math.max(maxRowsProcessedPerNode_, 0);
Is this basically intended to be a no-op?
If yes, probably better to leave it empty and just add Preconditions that 
numNodes == 1.

Just like the very last else branch, this should check if caller is valid and 
return early when it is invalid.

  if ((in == -1) || (out == -1)) {
valid_ = false;
return;
  }

At the end of this method, you can add instead add Preconditions that verify 
both maxRowsProcessed_ and maxRowsProcessedPerNode_ are always >= 0 after each 
visit.


http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py
File tests/custom_cluster/test_ext_data_sources.py:

http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@51
PS1, Line 51: vector.get_value('exec_option')
Use deepcopy


http://gerrit.cloudera.org:8080/#/c/21141/1/tests/custom_cluster/test_ext_data_sources.py@254
PS1, Line 254: vector.get_value('exec_option')
Use deepcopy



--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:25:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21141 )

Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional 
table
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15488/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 19:23:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

2024-03-12 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21141


Change subject: WIP IMPALA-12896: Avoid JDBC table to be set as transactional 
table
..

WIP IMPALA-12896: Avoid JDBC table to be set as transactional table

In some deployment environment, JDBC tables are set as transactional
tables by default. This causes catalogd failed to load the metadata for
JDBC tables. This patch explicitly add table properties with
"transactional=false" for JDBC table to avoid the JDBC to be set as
transactional table.

The operations on JDBC table are processed only on coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries could be executed without invoking
executor nodes.

Updates FileSystemUtil.copyFileFromUriToLocal() function to write log
message for all types of exceptions.

Testing:
 - TODO: update planer tests for data source tables.
 - TODO: pass core -tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
---
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
M tests/custom_cluster/test_ext_data_sources.py
M tests/query_test/test_ext_data_sources.py
5 files changed, 22 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21141/1
--
To view, visit http://gerrit.cloudera.org:8080/21141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Gerrit-Change-Number: 21141
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10373/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:57:52 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/2 )

Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
..

IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

The TPC-H and TPC-DS builds currently do not respect the
CFLAGS environment variable, so they don't incorporate the
values that we set in init-compiler.sh.

This modifies the build scripts for TPC-H and TPC-DS to
patch their makefiles to add our CFLAGS. This has the
side effect of turning on -O3 optimization, resulting
in faster binaries used to generate the TPC-H and
TPC-DS datasets:

TPC-H's dbgen at scale 42:
Unoptimized: 4m46.269s
Optimized: 3m46.379s

TPC-DS's dsdgen at scale 20:
Unoptimized: 9m41.441s
Optimized: 7m25.017s

Testing:
 - Ran a build and verified that the flags include our
   CFLAGS value

Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Reviewed-on: http://gerrit.cloudera.org:8080/2
Reviewed-by: Michael Smith 
Tested-by: Joe McDonnell 
---
M source/tpc-ds/build.sh
M source/tpc-h/build.sh
2 files changed, 14 insertions(+), 0 deletions(-)

Approvals:
  Michael Smith: Looks good to me, approved
  Joe McDonnell: Verified

--
To view, visit http://gerrit.cloudera.org:8080/2
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Gerrit-Change-Number: 2
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/2 )

Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
..


Patch Set 2: Verified+1

Thanks for the review, going ahead with this.


--
To view, visit http://gerrit.cloudera.org:8080/2
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Gerrit-Change-Number: 2
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:42:28 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/2 )

Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh
File source/tpc-ds/build.sh:

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh@38
PS1, Line 38:   sed -i -r "s/^(LINUX_CFLAGS\s*=.*)/\1 ${CFLAGS}/" makefile
> Yes, I think this is going to accumulate if we run it multiple times. It's
Ack


http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh
File source/tpc-h/build.sh:

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh@50
PS1, Line 50:   # The build makefile doesn't respect the CFLAGS environment 
variable directly,
> Synced these up
Done



--
To view, visit http://gerrit.cloudera.org:8080/2
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Gerrit-Change-Number: 2
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:41:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@84
PS1, Line 84: // Encode forward slashes to prevent misinterpretation in 
paths or URLs
> Yes, that's correct. The UrlEncode function in the provided code is designe
As Daniel said, please add a clearer description of the problem and solution to 
the commit message. What you just said here is a good basis for that.



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:40:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc
File be/src/util/cache/cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc@180
PS3, Line 180:   // Updating the charge for the cache entry evicts something
> Oops, fixed
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:38:04 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-12807: Add mold linker

2024-03-12 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21043 )

Change subject: IMPALA-12807: Add mold linker
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf
Gerrit-Change-Number: 21043
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:30:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15487/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:12:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15486/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 18:08:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Joe McDonnell (Code Review)
Hello Kurt Deschler, Yida Wu, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21122

to look at the new patch set (#5).

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..

IMPALA-12883: Support updating the charge on an entry in the cache

The cache implementation currently assumes that a cache
entry's charge will remain constant over time and does
not support updating the charge. This works well for
existing cache users like the data cache and codegen
cache. However, it doesn't work as well for use cases
where the final size is not known up front. Being able
to update the charge also allows for avoiding
concurrency issues where two different threads are
trying to insert the same entry.

This adds the ability to update an entry's charge after
it has been inserted into the cache. This can trigger
evictions if the size increases. This also adds a way
to retrieve the maximum charge supported by the cache.
This allows a cache user to tune its behavior to avoid
generating entries that would exceed the maximum charge.

Testing:
 - Added tests cases to the caching backend tests

Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
---
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache-test.cc
M be/src/util/cache/cache-test.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache-test.cc
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache-test.cc
M be/src/util/cache/rl-cache.cc
8 files changed, 439 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/21122/5
--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc@865
PS4, Line 865: HandleBase* LIRSCacheShard::Allocate(Slice key, uint32_t hash, 
int val_len, size_t charge) {
> line too long (92 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:47:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Joe McDonnell (Code Review)
Hello Kurt Deschler, Yida Wu, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21122

to look at the new patch set (#4).

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..

IMPALA-12883: Support updating the charge on an entry in the cache

The cache implementation currently assumes that a cache
entry's charge will remain constant over time and does
not support updating the charge. This works well for
existing cache users like the data cache and codegen
cache. However, it doesn't work as well for use cases
where the final size is not known up front. Being able
to update the charge also allows for avoiding
concurrency issues where two different threads are
trying to insert the same entry.

This adds the ability to update an entry's charge after
it has been inserted into the cache. This can trigger
evictions if the size increases. This also adds a way
to retrieve the maximum charge supported by the cache.
This allows a cache user to tune its behavior to avoid
generating entries that would exceed the maximum charge.

Testing:
 - Added tests cases to the caching backend tests

Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
---
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache-test.cc
M be/src/util/cache/cache-test.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache-test.cc
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache-test.cc
M be/src/util/cache/rl-cache.cc
8 files changed, 438 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/21122/4
--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 3:

(2 comments)

I did one other change in the new upload: charge becomes a size_t in all 
interfaces (which means kAutomaticCharge is a size_t and set to the max val). 
That avoids issues about charge being negative, etc.

http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc
File be/src/util/cache/cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/cache-test.cc@180
PS3, Line 180:   // Updating the charge for the erased element evicts something
> Why do you call h1 the erased element?
Oops, fixed


http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/lirs-cache-test.cc
File be/src/util/cache/lirs-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/21122/3/be/src/util/cache/lirs-cache-test.cc@290
PS3, Line 290:   cache_->UpdateCharge(handle, cache_->MaxCharge());
> Do you want to validate the handle's charge doesn't change too?
Added the ability to get the charge for a handle and checked that it doesn't 
change.



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:44:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/4/be/src/util/cache/lirs-cache.cc@865
PS4, Line 865: HandleBase* LIRSCacheShard::Allocate(Slice key, uint32_t hash, 
int val_len, size_t charge) {
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:45:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 1: Code-Review+1

(2 comments)

You mentioned offline you're about to upload a test table for this. Otherwise 
LGTM!

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG@12
PS1, Line 12: login
> What does "login" mean here?
logic?


http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/21139/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1473
PS1, Line 1473: o
nit: missing space



--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:20:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-03-12 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 5: Code-Review+1

(3 comments)

Looks good for an initial patch. Could be +2 if no other major issues

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@530
PS3, Line 530:  have not been calcula
> Refactored this to be a ThriftSerializationCtx that is passed in. This code
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1237
PS3, Line 1237:
> Reworked this comment to describe how this is a bottom-up tree traversal.
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
File fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java@59
PS3, Line 59:
> Changed this to use StringBuilder. I also added an explicit finalize() call
Done



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:14:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21139/1//COMMIT_MSG@12
PS1, Line 12: login
What does "login" mean here?



--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:12:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21139 )

Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15485/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:08:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
..


Patch Set 3:

(9 comments)

Thanks, Pranav!

http://gerrit.cloudera.org:8080/#/c/21131/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21131/1//COMMIT_MSG@9
PS1, Line 9: In this commit, a comprehensive mapping of special characters, 
including
You should explain what the error in the previous version was and what failures 
it lead to.


http://gerrit.cloudera.org:8080/#/c/21131/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21131/3//COMMIT_MSG@9
PS3, Line 9: In this commit, a comprehensive mapping of special characters, 
including
The commit message should explain what errors the code before the change 
caused, what was wrong with the code and how it is fixed by this change.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@48
PS3, Line 48: std::unordered_map specialCharacterMap = {
Should be const.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@71
PS3, Line 71:   (*out).reserve(in_len * 3);  // Maximum expansion per character 
is 3
I'm not sure why we have to reserve this. 'input_str' is modified (grown) in 
place. At the end of the function we could move 'input_str' into '*out' like 
this:

(*out) = std::move(input_str);

You should include  for this overload of std::move.

Or, alternatively, you could discard 'input_str' completely and start by 
copying the input into '*out' and use that variable in the string replacements.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@78
PS3, Line 78:   for (const auto& entry : specialCharacterMap) {
Won't this replace existing % signs twice? First on L75, then in the loop 
because '%' is present as a key in specialCharacterMap.

We could either remove '%' from the map or
 1. Replace the map with an ordered std::array>.
 2. Put '%' first in that std::array.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@83
PS3, Line 83:   if (!hive_compat) {
Are these additional encodings the same that were done before this change? It 
doesn't seem like that. For example, I think "$" was encoded before but not now.
If not, won't it cause problems? The commit message should at least mention how 
the behaviour changes.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@85
PS3, Line 85: boost::replace_all(input_str, "/", "%2F");
Forward slashes have already been encoded because '/' is contained in 
specialCharacterMap. If it are not to be encoded in Hive-compat mode, it should 
be removed from the map.


http://gerrit.cloudera.org:8080/#/c/21131/3/be/src/util/coding-util.cc@90
PS3, Line 90: (*out)
This only makes sense after assigning 'input_str' to 'out'.
Also, could use out->shrink_to_fit().


http://gerrit.cloudera.org:8080/#/c/21131/3/testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test
File 
testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test:

http://gerrit.cloudera.org:8080/#/c/21131/3/testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test@298
PS3, Line 298: my_part_tbl
A more descriptive name would be better, for example "unicode_partition_values".



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 12 Mar 2024 16:53:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

2024-03-12 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21139


Change subject: IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg 
tables
..

IMPALA-12894: Turn off the count(*) optimisation for V2 Iceberg tables

This is a part 1 change that turns off the count(*) optimisations for
V2 tables as there is a correctness issue with it. The reason is that
Spark compaction may leave some dangling delete files that messes up
the login in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
6 files changed, 408 insertions(+), 244 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21139/1
--
To view, visit http://gerrit.cloudera.org:8080/21139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Gerrit-Change-Number: 21139
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 


[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/2 )

Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh
File source/tpc-ds/build.sh:

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-ds/build.sh@38
PS1, Line 38:   sed -i -r "s/^(LINUX_CFLAGS\s*=.*)/\1 ${CFLAGS}/" makefile
> Could this be applied repeatedly on repeated builds? I don't think it would
Yes, I think this is going to accumulate if we run it multiple times. It's 
messy, but I want our flags at the end so they override anything else already 
specified.

Added a comment for this.


http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh
File source/tpc-h/build.sh:

http://gerrit.cloudera.org:8080/#/c/2/1/source/tpc-h/build.sh@50
PS1, Line 50:   # The build makefile doesn't respect the CFLAGS environment 
variable directly,
> Why does line wrapping differ between this and the other file?
Synced these up



--
To view, visit http://gerrit.cloudera.org:8080/2
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Gerrit-Change-Number: 2
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 16:32:15 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

2024-03-12 Thread Joe McDonnell (Code Review)
Hello Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/2

to look at the new patch set (#2).

Change subject: IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS
..

IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

The TPC-H and TPC-DS builds currently do not respect the
CFLAGS environment variable, so they don't incorporate the
values that we set in init-compiler.sh.

This modifies the build scripts for TPC-H and TPC-DS to
patch their makefiles to add our CFLAGS. This has the
side effect of turning on -O3 optimization, resulting
in faster binaries used to generate the TPC-H and
TPC-DS datasets:

TPC-H's dbgen at scale 42:
Unoptimized: 4m46.269s
Optimized: 3m46.379s

TPC-DS's dsdgen at scale 20:
Unoptimized: 9m41.441s
Optimized: 7m25.017s

Testing:
 - Ran a build and verified that the flags include our
   CFLAGS value

Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
---
M source/tpc-ds/build.sh
M source/tpc-h/build.sh
2 files changed, 14 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/11/2/2
--
To view, visit http://gerrit.cloudera.org:8080/2
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
Gerrit-Change-Number: 2
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled 
to executors
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15484/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 16:12:52 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12807: Add mold linker

2024-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21043 )

Change subject: IMPALA-12807: Add mold linker
..


Patch Set 2:

Tested this with a toolchain build and with Impala using this upstream change: 
https://gerrit.cloudera.org/#/c/21121/


--
To view, visit http://gerrit.cloudera.org:8080/21043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc1ad3744ea9d0ff7bf93652a21542d2b2cf
Gerrit-Change-Number: 21043
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 12 Mar 2024 16:02:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 3: Code-Review+1

+1 from my part but I'm not familiar with this part of the code, so someone 
else should also see it.


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 15:52:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

2024-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled 
to executors
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/21138/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@709
PS1, Line 709: [%s]
If 'mustRunOnCoord_' is true, do you think we should mention it here? It may be 
misleading because a fragment can be coord-only for other reasons as well, but 
the lack of the info could be taken to mean the fragment is not coord-only.



--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Comment-Date: Tue, 12 Mar 2024 15:49:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

2024-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21138


Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled 
to executors
..

IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
---
M be/src/scheduling/scheduler.cc
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M tests/custom_cluster/test_coordinators.py
5 files changed, 45 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21138/1
-- 
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 


[Impala-ASF-CR] IMPALA-12809: Iceberg metadata table scanner can be scheduled to executors

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21138 )

Change subject: IMPALA-12809: Iceberg metadata table scanner can be scheduled 
to executors
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py
File tests/custom_cluster/test_coordinators.py:

http://gerrit.cloudera.org:8080/#/c/21138/1/tests/custom_cluster/test_coordinators.py@304
PS1, Line 304: ;
flake8: E703 statement ends with a semicolon



--
To view, visit http://gerrit.cloudera.org:8080/21138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Gerrit-Change-Number: 21138
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 15:50:24 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-12886: Bump GoogleTest version to 1.14.0

2024-03-12 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21133 )

Change subject: IMPALA-12886: Bump GoogleTest version to 1.14.0
..


Patch Set 1:

> Patch Set 1:
>
> Thanks Laszlo for working on this!
>
> On my Ubuntu 20.04.6 LTS I had to add -fPIC to CXXFLAGS otherwise I got 
> strange linker errors. Though I switched to GoogleTest 1.14.0 in a quite 
> hacky way, so maybe it was just because of that. Did you try this out on 
> Ubuntu 20?

Not yet (although I definitely want to run it through some upstream-like 
testing), but it's a good point, thanks for calling it out.


--
To view, visit http://gerrit.cloudera.org:8080/21133
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa52b4809a7c969e863518b182d2df63256f4c43
Gerrit-Change-Number: 21133
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 14:42:44 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12697: Set FAIL ON PUBLISH to true by default

2024-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21134 )

Change subject: IMPALA-12697: Set FAIL_ON_PUBLISH to true by default
..


Patch Set 1: Code-Review+2

LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/21134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I156269e8b3b1fa5d743a8ab5a83810001f7dd648
Gerrit-Change-Number: 21134
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 14:37:28 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12886: Bump GoogleTest version to 1.14.0

2024-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21133 )

Change subject: IMPALA-12886: Bump GoogleTest version to 1.14.0
..


Patch Set 1:

Thanks Laszlo for working on this!

On my Ubuntu 20.04.6 LTS I had to add -fPIC to CXXFLAGS otherwise I got strange 
linker errors. Though I switched to GoogleTest 1.14.0 in a quite hacky way, so 
maybe it was just because of that. Did you try this out on Ubuntu 20?


--
To view, visit http://gerrit.cloudera.org:8080/21133
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa52b4809a7c969e863518b182d2df63256f4c43
Gerrit-Change-Number: 21133
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Mar 2024 14:36:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@16
PS2, Line 16: alls
> Nit: calls.
Done


http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@16
PS2, Line 16: that
> Nit: that?
Done


http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@18
PS2, Line 18: evaluated
> Nit: is evaluated?
Done


http://gerrit.cloudera.org:8080/#/c/21118/2//COMMIT_MSG@19
PS2, Line 19: filter
> Nit: filters.
Done



--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 12 Mar 2024 13:53:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-12 Thread Riza Suminto (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21118

to look at the new patch set (#3).

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality
estimation when runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has FK-PK conjunct, it should be possible to obtain lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. It is, however, only
limited for runtime filter that evaluated on storage layer to avoid
severe underestimation. Runtime filters that evaluate at scan node at
row level will keep using JoinNode.computeGenericJoinCardinality().

Testing:
- Update TpcdsCpuCostPlannerTest.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q18.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q19.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q25.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q26.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q27.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q33.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q34.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q42.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q45.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q46.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q48.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q49.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q52.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q53.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q55.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q56.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q60.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q61.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q63.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q64.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q66.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q68.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q72.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q73.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q79.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q91.test
39 files changed, 1,122 insertions(+), 1,095 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21118/3
--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: 

[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15483/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 16
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 13:10:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 16:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG@48
PS14, Line 48: Testing:
> Currently, we just have flaky tests like test_iceberg_self_events that can
Added TestEventProcessingWithImpala.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1203
PS14, Line 1203:
   :   /**
> Let's fix this as well, i.e. only log this when the version is added for th
Done


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1054
PS14, Line 1054: CatalogServiceCatalog catalog, Table table, long 
newVersionNumber) {
> 'isInsertEvent' is always used as false. I think we can't track the infligh
Done


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1080
PS14, Line 1080:  * Cancel the new version number registration out of 
table's in-flight events.
> nit: please add an error message.
Done


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1221
PS14, Line 1221:
   :   responseSum
> nit: "Update the table object with the new catalog version"
Done


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1248
PS14, Line 1248:
   : case DROP_PARTITION:
   :   TAlterTableDropPartitionParams dropPartPa
> nit: this comment is stale
Removed.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1376
PS14, Line 1376:
> nit: maybe 'modification.isInProgress()' sounds better
Done


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1592
PS14, Line 1592: if (!needsToUpdateHms) {
> Can we remove this if the Iceberg operations won't update HMS? It also seem
This is still needed, because ALTER_TABLE operation via Iceberg's HiveCatalog 
can produce event as well.
What needsToUpdateHms means is whether caller of alterIcebergTable() need to 
follow up with separate HMS alter table call that is not through Iceberg 
library.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2650
PS14, Line 2650:   if (partition.getPartitionStatsCompressed() != null) 
{
> This is unchanged yet.
Changed to use InProgressTableModification.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3239
PS14, Line 3239:   try {
> This is also unchanged but it's a bit complex - we are not always triggerin
Changed all three truncate* function to return InProgressTableModification.
This truncateTable() table only need to call 
markInflightEventRegistrationComplete().


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5512
PS14, Line 5512: String.format(HMS_RPC_ERROR_FORMAT_STR, 
"alter_table"), e);
> This is also unchanged for RENAME but I think we can't do it before the HMS
Add TODO for future investigation.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7242
PS14, Line 7242:
> This is also unchanged for inserting Iceberg tables.
Changed to use InProgressTableModification.



--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 16
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 12:58:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-12 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Jason Fehr, Sai Hemanth Gantasala, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21029

to look at the new patch set (#16).

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..

IMPALA-12543: Detect self-events before finishing DDL

test_iceberg_self_events has been flaky for not having
tbls_refreshed_before equal to tbls_refreshed_after in-between query
executions.

Further investigation reveals concurrency bug due to db/table level
lock is not taken during db/table self-events check (IMPALA-12461
part1). The order of ALTER TABLE operation is as follow:

1. alter table starts in CatalogOpExecutor
2. table level lock is taken
3. HMS RPC starts (CatalogOpExecutor.applyAlterTable())
4. HMS generates the event
5. HMS RPC returns
6. table is reloaded
7. catalog version is added to inflight event list
8. table level lock is releases

Meanwhile the event processor thread fetches the new event after 4 and
before 7. Because of IMPALA-12461 (part 1), it can also finish
self-events checking before reaching 7. Before IMPALA-12461, self-events
would have needed to wait for 8. Note that this issue is only relevant
for table level events, as self-events checking for partition level
events still takes table lock.

This patch fix the issue by adding newCatalogVersion to the table's
inflight event list before updating HMS. If HMS update does not
complete (ie., an exception is thrown), the new newCatalogVersion that
was added is then removed.

This patch also fix few smaller issues, including:
- Avoid incrementing EVENTS_SKIPPED_METRIC if numFilteredEvents == 0 in
  MetastoreEventFactory.getFilteredEvents().
- Increment EVENTS_SKIPPED_METRIC in
  MetastoreTableEvent.reloadTableFromCatalog() if table is already in
  the middle of reloading (revealed through flaky
  test_skipping_older_events).
- Rephrase misleading log message in
  MetastoreEventProcessor.getNextMetastoreEvents().

Testing:
- Add TestEventProcessingWithImpala, run it with debug_action
  and sync_ddl dimensions.
- Pass exhaustive tests.

Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/custom_cluster/test_events_custom_configs.py
9 files changed, 757 insertions(+), 510 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/21029/16
--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 16
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21072 )

Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by 
concurrent modification
..

IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent 
modification

HdfsTable.toMinimalTCatalogObject() is not always invoked with holding
the table lock, e.g. in invalidating a table, we could replace an
HdfsTable instance with an IncompleteTable instance. We then invoke
HdfsTable.toMinimalTCatalogObject() to get the removed catalog object.
However, the HdfsTable instance could be modified in the meantime by a
concurrent DDL/DML that would reload it, e.g. a REFRESH statement. This
causes HdfsTable.toMinimalTCatalogObject() failed by
ConcurrentModificationException on the column/partition list.

This patch fixes the issue by try acquiring the table read lock in
HdfsTable.toMinimalTCatalogObject(). If it fails, the partition ids and
names won't be returned. Also updates the method to not collecting the
column list since it's unused.

Tests
 - Added e2e test
 - Ran CORE tests

Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Reviewed-on: http://gerrit.cloudera.org:8080/21072
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/custom_cluster/test_concurrent_ddls.py
3 files changed, 70 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Gerrit-Change-Number: 21072
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21072 )

Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by 
concurrent modification
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Gerrit-Change-Number: 21072
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 12:06:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785
PS4, Line 1785: skipFileMetadata
nit: "skipFileMetadataReload" sounds better


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792
PS4, Line 1792:   if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), 
afterTable.getSd( {
  : skipFileMetadata = true;
  :   } else {
  : skipFileMetadata = false;
  :   }
This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of what 
its previous value is.

When the table owner changed and there are only trivial SD props changed, 
shouldn't we have skipFileMetadata = true? The current code set 
skipFileMetadata to false finally.


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860
PS4, Line 1860: file metadata reload can be skipped
Shouldn't this be "file metadata should be reloaded"?


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873
PS4, Line 1873: table schema reload can be skipped
nit: "file metadata should be reloaded"


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035
PS4, Line 3035:
Why do we remove existing tests in testAlterTableNoFileMetadataReload()?



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 08:09:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21072 )

Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by 
concurrent modification
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10372/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Gerrit-Change-Number: 21072
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 07:19:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification

2024-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21072 )

Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by 
concurrent modification
..


Patch Set 4:

> Patch Set 4: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10371/

The failure is unrelated: IMPALA-11396
Rerun the job.


--
To view, visit http://gerrit.cloudera.org:8080/21072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Gerrit-Change-Number: 21072
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 12 Mar 2024 07:19:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15482/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 12 Mar 2024 06:22:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-03-12 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
..


Patch Set 3:

(4 comments)

> Uploaded patch set 3.

http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@47
PS1, Line 47: // Mapping of special characters to their URL-encoded forms
> With this change it doesn't look like we need this list or ShouldNotEscape
Done


http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@75
PS1, Line 75:   boost::replace_all(input_str, "%", "%25");
> line too long (94 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@84
PS1, Line 84: // Encode forward slashes to prevent misinterpretation in 
paths or URLs
> So this works because it only replaces specific chars - that won't appear i
Yes, that's correct. The UrlEncode function in the provided code is designed to 
work with UTF-8 encoded strings, where each character can be represented by one 
to four bytes. The key is that it replaces specific characters with their 
URL-encoded forms, and these replacements are carefully chosen to avoid 
interfering with multi-byte characters.The replacements are safe because they 
don't disrupt the structure of multi-byte characters in UTF-8 encoding. The % 
sign followed by two hexadecimal digits, used in URL encoding, ensures that the 
replacements won't be confused with the actual encoding of multi-byte 
characters.

On the other hand, using a generic isalnum check to determine whether a 
character should be replaced might not be suitable for UTF-8 encoded strings. 
It could inadvertently affect multi-byte characters, potentially leading to 
incorrect or corrupted results.


http://gerrit.cloudera.org:8080/#/c/21131/1/be/src/util/coding-util.cc@95
PS1, Line 95: void UrlEncode(const vector& in, string* out, bool 
hive_compat) {
> We could return unused capacity with https://en.cppreference.com/w/cpp/stri
Done



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 12 Mar 2024 05:56:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11499: Refactor UrlEncode function to handle special characters

2024-03-12 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
..

IMPALA-11499: Refactor UrlEncode function to handle special characters

In this commit, a comprehensive mapping of special characters, including
Unicode, to their respective URL-encoded forms has been introduced. The
UrlEncode function has been modified to dynamically iterate through
this mapping, providing a more generic and scalable approach to special
character encoding. This change also enhances code readability and
maintainability by incorporating detailed comments explaining each
section of the updated implementation.

Testing: Some basic tests are provided in unicode-column-name.test.

Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
---
M be/src/util/coding-util.cc
M testdata/workloads/functional-query/queries/QueryTest/unicode-column-name.test
2 files changed, 78 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/21131/3
--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zihao Ye