[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15526/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 15 Mar 2024 05:34:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14
PS6, Line 14: . The
> nit: remove "=true" ?
Done


http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15
PS6, Line 15:
:
> nit: this is stale now
Ack


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936
PS6, Line 1936: "newOwner: {}. So file metadata reload can be 
skipped.",
> nit: let's also add "So file metadata should be reloaded" here.
Done


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971
PS6, Line 1971:   return false;
> So we return false for other unknown cases. In the future if a new field is
In the latest patch, I have addressed this comment. Please let me know if that 
looks good.


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050
PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter, 
fileMetadataLoadBefore);
> Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is a
issue with reverting back to old code. Will do it properly in the next patch 
set.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 15 Mar 2024 05:10:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#7).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 159 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/7
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12802: Support ALTER TABLE for JDBC tables

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21088 )

Change subject: IMPALA-12802: Support ALTER TABLE for JDBC tables
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15525/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
Gerrit-Change-Number: 21088
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Fri, 15 Mar 2024 01:09:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12802: Support ALTER TABLE for JDBC tables

2024-03-14 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21088 )

Change subject: IMPALA-12802: Support ALTER TABLE for JDBC tables
..


Patch Set 4:

(4 comments)

Thanks Joe for your feedback.

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java@49
PS3, Line 49: E
> Nit: extra space at the beginning
fixed


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java@39
PS3, Line 39:   @Override
:   public void analyze(Analyzer analyzer) throws
> I think this class isn't used directly, so another option is for this to be
Changed the class as abstract class


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java@228
PS3, Line 228: change internal properties of D
> Nit: Maybe this should be "Cannot change internal properties of DataSource"
fixed the comments


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java@148
PS3, Line 148: unset internal properties of Da
> Nit: maybe say "internal properties" to emphasize that these are set by us.
fixed it.



--
To view, visit http://gerrit.cloudera.org:8080/21088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
Gerrit-Change-Number: 21088
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Fri, 15 Mar 2024 00:46:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12802: Support ALTER TABLE for JDBC tables

2024-03-14 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21088 )

Change subject: IMPALA-12802: Support ALTER TABLE for JDBC tables
..

IMPALA-12802: Support ALTER TABLE for JDBC tables

IMPALA-12793 changes the syntax for creating JDBC table. The
configurations of connection credentials - url, username, password,
jdbc driver, etc, are set as table properties.

This patch allows user to change these table properties, or edit
columns via ALTER TABLE statement.

Testing:
 - Added frontend analysis unit-tests.
 - Added end-to-end unit-test.
 - Passed Core tests

Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteRollbackStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewRenameStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableRecoverPartitionsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetCachedStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetColumnStats.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetOwnerStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSortByStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewUnSetTblProperties.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/FeDataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDataSourceTable.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
33 files changed, 452 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21088/4
--
To view, visit http://gerrit.cloudera.org:8080/21088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
Gerrit-Change-Number: 21088
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21142 )

Change subject: IMPALA-12737: List key columns in query history
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15524/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 23:57:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..


Patch Set 18:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15523/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 18
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 23:56:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Michael Smith (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21142

to look at the new patch set (#5).

Change subject: IMPALA-12737: List key columns in query history
..

IMPALA-12737: List key columns in query history

Adds "Key Columns" to the query profile, enumerating a
comma-separated list of the canonical path of columns used in the query:

  Key Columns: functional.alltypes.month,functional.alltypes.year

Also adds "key_columns" to impala_query_log and impala_query_live with
the same content.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management-fields.cc
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/util/workload_management.py
14 files changed, 98 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/5
--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#18).

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..

IMPALA-12626: Add Tables Queried to profile/history

Adds "Tables Queried" to the query profile, enumerating a
comma-separated list of tables accessed during a query:

  Tables Queried: tpch.customer,tpch.lineitem

Also adds "tables_queried" to impala_query_log and impala_query_live
with the same content.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management-fields.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/error-util.cc
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/util/workload_management.py
14 files changed, 72 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/18
--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 18
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12802: Support ALTER TABLE for JDBC tables

2024-03-14 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21088 )

Change subject: IMPALA-12802: Support ALTER TABLE for JDBC tables
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableExecuteExpireSnapshotsStmt.java@49
PS3, Line 49:
Nit: extra space at the beginning


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetStmt.java@39
PS3, Line 39:   @Override
:   public String getOperation() { return "SET"; }
I think this class isn't used directly, so another option is for this to be an 
abstract class that doesn't specify this and require subclasses to implement it.


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java@228
PS3, Line 228: unset properties of DataSource.
Nit: Maybe this should be "Cannot change internal properties of DataSource"


http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java:

http://gerrit.cloudera.org:8080/#/c/21088/3/fe/src/main/java/org/apache/impala/analysis/AlterTableUnSetTblProperties.java@148
PS3, Line 148: unset properties of DataSource.
Nit: maybe say "internal properties" to emphasize that these are set by us.



--
To view, visit http://gerrit.cloudera.org:8080/21088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
Gerrit-Change-Number: 21088
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 14 Mar 2024 23:27:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 49:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/49/tests/util/workload_management.py
File tests/util/workload_management.py:

http://gerrit.cloudera.org:8080/#/c/20770/49/tests/util/workload_management.py@277
PS49, Line 277: sql_stmt = re.search(r'\n\s+Query Options \(set by 
configuration\):\s+(.*?)\n',
nit: sql_stmt doesn't seem like the right variable name



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 49
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 23:18:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-14 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Add system.impala_query_live table
..


Patch Set 33:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/exec/system-table-scanner.h
File be/src/exec/system-table-scanner.h:

http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/exec/system-table-scanner.h@24
PS28, Line 24: struct QueryStateExpanded;
> General strategy to keep compile times fast. I don't need to know anything
Thanks for the explanation!


http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/exec/system-table-scanner.cc
File be/src/exec/system-table-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/exec/system-table-scanner.cc@147
PS28, Line 147: for (auto event : query.EventsTimeline()) {
> I may try to re-use helpers from workload-management.h, or move this logic
It would be better in QueryStateExpanded so the complete queries code could 
take advantage of it.  I will look into putting it there.


http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/service/query-state-record.h
File be/src/service/query-state-record.h:

http://gerrit.cloudera.org:8080/#/c/20762/28/be/src/service/query-state-record.h@318
PS28, Line 318:   const std::shared_ptr base_state_src = 
nullptr);
> I updated it to default initialize if it's null. Saves a little thought on
Ah, I understand now.  I did not realize the live query table does not have a 
QueryStateRecord.

My original comment should have said "QueryStateExpanded assumes base_state_src 
is not nullptr".  After looking at the code again, that comment is not 
accurate.  The code creates a new shared_ptr if 
base_state_src is null.  Thus, there is no possibility for nullptr related 
issues.


http://gerrit.cloudera.org:8080/#/c/20762/28/fe/src/main/java/org/apache/impala/catalog/SystemTable.java
File fe/src/main/java/org/apache/impala/catalog/SystemTable.java:

http://gerrit.cloudera.org:8080/#/c/20762/28/fe/src/main/java/org/apache/impala/catalog/SystemTable.java@51
PS28, Line 51:   public static final String QUERY_LIVE = "impala_query_live";
> I don't see a good reason to, since as you mention it doesn't create any ex
Done



--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 33
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 22:20:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15522/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 21:36:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 49:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15521/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 49
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 21:27:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(long eventId, 
IMetaStoreClient client,
> Shouldn't the 'reason' already include the event type?
Reason: RELOAD event, Received partition with empty values: 
Partition(values:[]...).
Ignoring reloading the partition.

Reason only has eventType, so I'll have to introduce eventId as an argument.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java
File fe/src/main/java/org/apache/impala/util/DebugUtils.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java@77
PS2, Line 77:   // debug action label for mock that metastore returns 
partitions with empty values
> Can you add a comment about the Jira to explain why this error injection is
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS2, Line 212: // This action is required for repro in the unit test 
(MetastoreEventsProcessorTest)
> Can you add a comment about the Jira to explain why this error injection is
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3757
PS2, Line 3757: PartitionValues() is a r
> Can you add a comment about being the regression test for the Jira?
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3837
PS2, Line 3837:   long txnId = 
MetastoreShim.openTransaction(client.getHiveClient());
> Is this needed here? Doesn't have an affect only when Impala processes the 
Not really. To avoid the repetition of setting this flag before 
processEvents(), I have set it here.
I'm moving this line out of this method in the next patch set.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3847
PS2, Line 3847:   }
> It doesn't seem intuitive to me that we reset this here.
I just want to set this config back to normal, I don't want other events to be 
affected while I'm testing a particular event.
The test passes without setting to the previous value.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 21:13:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#3).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 139 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/3
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 49:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h@32
PS47, Line 32:
> Need to move them to workload-management-fields.cc according to the failed
Done



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 49
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 21:03:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has uploaded a new patch set (#49) to the change originally 
created by Jason Fehr. ( http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management-fields.cc
A be/src/service/workload-management-flags.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
35 files changed, 3,660 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/49
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 49
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 48:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h@32
PS47, Line 32:
> Oops, thank you!
Need to move them to workload-management-fields.cc according to the failed 
build.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 48
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 20:43:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12904: test type conversions hive3 silently passes because of wrongly defined test dimensions

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21151 )

Change subject: IMPALA-12904: test_type_conversions_hive3 silently passes 
because of wrongly defined test dimensions
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15520/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I786a5eaae9243b4728484f3f3b1427b20a1d2d28
Gerrit-Change-Number: 21151
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 20:29:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-03-14 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..

IMPALA-12818: Intermediate Result Caching plan node framework

This patch adds a plan node framework for caching of intermediate result
tuples within a query. Actual caching of data will be implemented in
subsequent patches.

A new plan node type TupleCacheNode is introduced for brokering caching
decisions at runtime. If the result is in the cache, the TupleCacheNode will
return results from the cache and skip executing its child node. If the
result is not cached, the TupleCacheNode will execute its child node and
mirror the resulting RowBatches to the cache.

The TupleCachePlanner decides where to place the TupleCacheNodes. To
calculate eligibility and cache keys, the plan must be in a stable state
that will not change shape. TupleCachePlanner currently runs at the end
of planning after the DistributedPlanner and ParallelPlanner have run.
As a first cut, TupleCachePlanner places TupleCacheNodes at every
eligible location. Eligibility is currently restricted to immediately
above HdfsScanNodes. This implementation will need to incorporate cost
heuristics and other policies for placement.

Each TupleCacheNode has a hash key that is generated from the logical
plan below for the purpose of identifying results that have been cached
by semantically equivalent query subtrees. The initial implementation of
the subtree hash uses the plan Thrift to uniquely identify the subtree.

Tuple caching is enabled by setting the enable_tuple_cache query option
to true. As a safeguard during development, enable_tuple_cache can only
be set to true if the "allow_tuple_caching" startup option is set to
true. It defaults to false to minimize the impact for production clusters.
bin/start-impala-cluster.py sets allow_tuple_caching=true by default
to enable it in the development environment.

Testing:
 - This adds a frontend test that does basic checks for cache keys and
   eligibility
 - This verifies the presence of the caching information in the explain
   plan output.

Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Reviewed-on: http://gerrit.cloudera.org:8080/21035
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Smith 
Reviewed-by: Yida Wu 
Reviewed-by: Kurt Deschler 
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/tuple-cache-node.cc
A be/src/exec/tuple-cache-node.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M bin/start-impala-cluster.py
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Query.thrift
A fe/src/main/java/org/apache/impala/common/ThriftSerializationCtx.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
A fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
A fe/src/main/java/org/apache/impala/planner/TupleCacheNode.java
A fe/src/main/java/org/apache/impala/planner/TupleCachePlanner.java
A fe/src/test/java/org/apache/impala/planner/TupleCacheInfoTest.java
A fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
21 files changed, 1,124 insertions(+), 16 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Smith: Looks good to me, but someone else must approve
  Yida Wu: Looks good to me, but someone else must approve
  Kurt Deschler: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12904: test type conversions hive3 silently passes because of wrongly defined test dimensions

2024-03-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21151 )

Change subject: IMPALA-12904: test_type_conversions_hive3 silently passes 
because of wrongly defined test dimensions
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I786a5eaae9243b4728484f3f3b1427b20a1d2d28
Gerrit-Change-Number: 21151
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 20:20:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12904: test type conversions hive3 silently passes because of wrongly defined test dimensions

2024-03-14 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21151


Change subject: IMPALA-12904: test_type_conversions_hive3 silently passes 
because of wrongly defined test dimensions
..

IMPALA-12904: test_type_conversions_hive3 silently passes because of wrongly 
defined test dimensions

test_type_conversions_hive3 silently passes because we are not creating
the test dimenstion for query option orc_shema_resolution correctly. If
we set orc_shema_resolution correctly, i.e. to also exercise the
name-based schema resolution, the test fails. The cause of the failure
is that the ill-typed tables have dummy column names like 'c1', 'c2',
etc. These are completely fine for position-based schema resolution,
but it is not OK for name-based schema resolution.

The test just wants to check error messages related to type errors,
the column names are irrelevant, so we can just use the correct
names.

The test was copied from the old test_type_conversions_hive2 which is
not relevant anymore, so this CR also removes it.

Change-Id: I786a5eaae9243b4728484f3f3b1427b20a1d2d28
---
M 
testdata/workloads/functional-query/queries/DataErrorsTest/orc-type-checks.test
M tests/query_test/test_scanners.py
2 files changed, 42 insertions(+), 81 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/21151/1
--
To view, visit http://gerrit.cloudera.org:8080/21151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I786a5eaae9243b4728484f3f3b1427b20a1d2d28
Gerrit-Change-Number: 21151
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12903: Querying virtual column FILE POSITION for TEXT and JSON tables crashes Impala

2024-03-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21148 )

Change subject: IMPALA-12903: Querying virtual column FILE__POSITION for TEXT 
and JSON tables crashes Impala
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21148/1/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/21148/1/tests/query_test/test_scanners.py@183
PS1, Line 183: parquet
> Is it possible to set this to JSON or TEXT? It doesn't affect anything but
Or just fix the table_format dimension to text/none and remove this constraint.

cls.ImpalaTestMatrix.add_dimension(
create_uncompressed_text_dimension(cls.get_workload()))



--
To view, visit http://gerrit.cloudera.org:8080/21148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
Gerrit-Change-Number: 21148
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 19:56:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21142 )

Change subject: IMPALA-12737: List key columns in query history
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15518/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 19:28:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15517/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 16
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 19:27:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 48:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/15519/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 48
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 19:22:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#16).

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..

IMPALA-12626: Add Tables Queried to profile/history

Adds "Tables Queried" to the query profile, enumerating a
comma-separated list of tables accessed during a query. Also adds
"tables_queried" to impala_query_log and impala_query_live with the same
content.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management-fields.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/error-util.cc
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/util/workload_management.py
14 files changed, 71 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/16
--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 16
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Michael Smith (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21142

to look at the new patch set (#3).

Change subject: IMPALA-12737: List key columns in query history
..

IMPALA-12737: List key columns in query history

Adds "Key Columns" to the query profile, enumerating a
comma-separated list of columns that are important for estimating
cardinality of the query. Also adds "key_columns" to impala_query_log
and impala_query_live with the same content.

Relevant columns are those referenced in 'where', 'group by', and
'having' clauses of a 'select' query; and the 'where' clause of an
'update' or 'delete' statement.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management-fields.cc
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/util/workload_management.py
16 files changed, 83 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/3
--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 48:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h@32
PS47, Line 32:
> These declares are no longer needed in the header.
Oops, thank you!



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 48
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 19:05:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#48).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management-fields.cc
A be/src/service/workload-management-flags.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
35 files changed, 3,657 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/48
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 48
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 47:

(1 comment)

I like this layout better. Just one comment to address.

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/47/be/src/service/workload-management.h@32
PS47, Line 32: DECLARE_int32(query_log_max_sql_length);
These declares are no longer needed in the header.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 47
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:58:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-03-14 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 5: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc@69
PS3, Line 69:   // Save the number of rows in case GetNext() is called with a 
non-empty batch,
> Changed this comment to make it clearer. I was trying to say that TupleCach
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@499
PS3, Line 499:*/
> There will be multiple patches that revisit this function as we do more mas
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
File fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java@150
PS3, Line 150:   StringBuilder errorLog = new StringBuilder();
> Changed this to verify the hash trace in the same way. I also changed the c
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
File testdata/workloads/functional-query/queries/QueryTest/explain-level1.test:

http://gerrit.cloudera.org:8080/#/c/21035/3/testdata/workloads/functional-query/queries/QueryTest/explain-level1.test@63
PS3, Line 63: row_regex:.*\[TPlanNode\(.*\]
> This particular test file is checking what the explain output contains at d
Done



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:58:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12602: Unregister queries on idle timeout

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21074 )

Change subject: IMPALA-12602: Unregister queries on idle timeout
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc@2880
PS5, Line 2880:   expiration_event->query_id, 
move(preserved_status));
> Your right, it could have been after releasing the session lock. I'll move
Done


http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py@140
PS3, Line 140: ed queries always return their exception, s
> Yes, I'll update the asserts to test specific queries.
Done. Turns out I was mistaken about short_timeout_expire_handle, it was 
cancelled due to QUERY_TIMEOUT_S so it always return the exception.



--
To view, visit http://gerrit.cloudera.org:8080/21074
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacfc285ed3587892c7ec6f7df3b5f71c9e41baf0
Gerrit-Change-Number: 21074
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:44:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12602: Unregister queries on idle timeout

2024-03-14 Thread Michael Smith (Code Review)
Hello Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21074

to look at the new patch set (#7).

Change subject: IMPALA-12602: Unregister queries on idle timeout
..

IMPALA-12602: Unregister queries on idle timeout

Queries cancelled due to idle_query_timeout/QUERY_TIMEOUT_S are now also
Unregistered to free any remaining memory, as you cannot fetch results
from a cancelled query.

Adds a new structure - idle_query_statuses_ - to retain Status messages
for queries closed this way so that we can continue to return a clear
error message if the client returns and requests query status or
attempts to fetch results. This structure must be global because HS2
server can only identify a session ID from a query handle, and the query
handle no longer exists. SessionState tracks queries added to
idle_query_statuses_ so they can be cleared when the session is closed.

The beeswax get_log RPC will not return the preserved error message or
any warnings for these queries. It's also possible the summary and
profile are rotated out of query log as the query is no longer inflight.
This is an acceptable outcome as a client will likely not look for a
log/summary/profile after it times out.

Testing: updates test_query_expiration to verify number of waiting
queries is only non-zero for queries cancelled by EXEC_TIME_LIMIT_S and
not yet closed as an idle query.

Change-Id: Iacfc285ed3587892c7ec6f7df3b5f71c9e41baf0
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M docs/topics/impala_timeouts.xml
M tests/custom_cluster/test_query_expiration.py
5 files changed, 103 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/21074/7
--
To view, visit http://gerrit.cloudera.org:8080/21074
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacfc285ed3587892c7ec6f7df3b5f71c9e41baf0
Gerrit-Change-Number: 21074
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 47:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15516/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 47
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:35:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21142 )

Change subject: IMPALA-12737: List key columns in query history
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15515/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:18:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Add system.impala_query_live table
..


Patch Set 31:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15513/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 31
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:17:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11938: added definition of num nodes query option for test in QueryOptions.SetIntOptions

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21150 )

Change subject: IMPALA-11938: added definition of num_nodes query option  for 
test in QueryOptions.SetIntOptions
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15512/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfb7617cc9959f6bc6f1df7c36e47b65972d586
Gerrit-Change-Number: 21150
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:14:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..


Patch Set 15:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/15514/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 15
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:10:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-14 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5: Code-Review+1

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc
File be/src/util/cache/lirs-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@274
PS5, Line 274:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@424
PS5, Line 424:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@455
PS5, Line 455:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache.cc@990
PS5, Line 990:   CHECK(false);
Maybe add a logging for unexpected state?
CHECK(false) << "Unexpected state: " << e->state();


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc
File be/src/util/cache/rl-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc@324
PS5, Line 324: while (usage_ > capacity_ && rl_.next != _) {
 :   RLHandle* old = rl_.next;
 :   RL_Remove(old);
 :   table_.Remove(old->key(), old->hash());
 :   if (Unref(old)) {
 : old->next = to_remove_head;
 : to_remove_head = old;
 :   }
 : }
This section of code is identical to a part of RLCacheShard::Insert. 
Could we consider extracting this segment into a separate function for it to 
reduce redundancy?


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc@336
PS5, Line 336:   while (to_remove_head != nullptr) {
 : RLHandle* next = to_remove_head->next;
 : FreeEntry(to_remove_head);
 : to_remove_head = next;
 :   }
Same as above



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:03:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 47:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20770/46/be/src/service/workload-management.h
File be/src/service/workload-management.h:

http://gerrit.cloudera.org:8080/#/c/20770/46/be/src/service/workload-management.h@519
PS46, Line 519:
> These all rely on workload-management.h only being included by workload-man
The design came out of my naive understanding of how header files worked.  I 
read up header files and what the compiler does when they are included and now 
have a much better understanding.  I refactored to get a clean separation of 
declarations and definitions.


http://gerrit.cloudera.org:8080/#/c/20770/46/tests/util/workload_management.py
File tests/util/workload_management.py:

http://gerrit.cloudera.org:8080/#/c/20770/46/tests/util/workload_management.py@684
PS46, Line 684:   assert sql_results.column_labels[index] == PLAN
> Since you defined all the variables at the top, it'd make sense to use them
Done



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 47
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:01:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-14 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#47).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
M be/src/service/query-state-record.cc
A be/src/service/workload-management-fields.cc
A be/src/service/workload-management-flags.cc
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/authorization/test_provider.py
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
A tests/util/workload_management.py
35 files changed, 3,661 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/47
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 47
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12626: Add Tables Queried to profile/history

2024-03-14 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#15).

Change subject: IMPALA-12626: Add Tables Queried to profile/history
..

IMPALA-12626: Add Tables Queried to profile/history

Adds "Tables Queried" to the query profile, enumerating a
comma-separated list of tables accessed during a query. Also adds
"tables_queried" to impala_query_log and impala_query_live with the same
content.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/error-util.cc
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/util/workload_management.py
14 files changed, 72 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/15
-- 
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 15
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-14 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20762

to look at the new patch set (#31).

Change subject: IMPALA-12540: Add system.impala_query_live table
..

IMPALA-12540: Add system.impala_query_live table

Defines SystemTable which are in-memory tables that can provide access
to Impala state. Adds the 'impala_query_live' to the database 'sys',
which already exists for 'sys.impala_query_log'.

Implements the 'impala_query_live' table to view active queries across
all coordinators sharing the same statestore. SystemTables create new
SystemTableScanNodes for their scan node implementation. When computing
scan range locations, SystemTableScanNodes creates a scan range for each
in the cluster (identified via ClusterMembershipMgr). This produces a
plan that looks like:

Query: explain select * from sys.impala_query_live
++
| Explain String |
++
| Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 |
| Per-Host Resource Estimates: Memory=11MB   |
| WARNING: The following tables are missing relevant table   |
| and/or column statistics.  |
| sys.impala_query_live   |
||
| PLAN-ROOT SINK |
| |  |
| 01:EXCHANGE [UNPARTITIONED]|
| |  |
| 00:SCAN SYSTEM_TABLE [sys.impala_query_live]|
|row-size=72B cardinality=20 |
++

Execution will pull data from ImpalaServer on the backend via a
SystemTableScanner implementation based on table name.

In the query profile, SYSTEM_TABLE_SCAN_NODE includes
ActiveQueryCollectionTime and PendingQueryCollectionTime to track time
spent collecting QueryState from ImpalaServer.

Grants QueryScanner private access to ImpalaServer, identical to how
ImpalaHttpHandler access internal server state.

Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
M be/src/exec/scan-node.cc
A be/src/exec/system-table-scan-node.cc
A be/src/exec/system-table-scan-node.h
A be/src/exec/system-table-scanner.cc
A be/src/exec/system-table-scanner.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/scheduler.cc
M be/src/service/fe-support.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-state-record.h
M be/src/service/workload-management.cc
M be/src/util/sharded-query-map-util.h
M common/thrift/CMakeLists.txt
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/StatestoreService.thrift
A common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
A fe/src/main/java/org/apache/impala/catalog/SystemTable.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A tests/custom_cluster/test_query_live.py
M tests/util/workload_management.py
36 files changed, 1,290 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/20762/31
--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 31
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12737: List key columns in query history

2024-03-14 Thread Michael Smith (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21142

to look at the new patch set (#2).

Change subject: IMPALA-12737: List key columns in query history
..

IMPALA-12737: List key columns in query history

Adds "Key Columns" to the query profile, enumerating a
comma-separated list of columns that are important for estimating
cardinality of the query. Also adds "key_columns" to impala_query_log
and impala_query_live with the same content.

Relevant columns are those referenced in 'where', 'group by', and
'having' clauses of a 'select' query; and the 'where' clause of an
'update' or 'delete' statement.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/util/workload_management.py
16 files changed, 83 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21142/2
--
To view, visit http://gerrit.cloudera.org:8080/21142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7
Gerrit-Change-Number: 21142
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11938: added definition of num nodes query option for test in QueryOptions.SetIntOptions

2024-03-14 Thread Anonymous Coward (Code Review)
j_ansh...@yahoo.in has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21150


Change subject: IMPALA-11938: added definition of num_nodes query option  for 
test in QueryOptions.SetIntOptions
..

IMPALA-11938: added definition of num_nodes query option
 for test in QueryOptions.SetIntOptions

Change-Id: Iccfb7617cc9959f6bc6f1df7c36e47b65972d586
---
M be/src/service/query-options-test.cc
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/21150/2
--
To view, visit http://gerrit.cloudera.org:8080/21150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iccfb7617cc9959f6bc6f1df7c36e47b65972d586
Gerrit-Change-Number: 21150
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 


[Impala-ASF-CR] IMPALA-11938: added definition of num nodes query option for test in QueryOptions.SetIntOptions

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21150 )

Change subject: IMPALA-11938: added definition of num_nodes query option  for 
test in QueryOptions.SetIntOptions
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21150/2/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/21150/2/be/src/service/query-options-test.cc@273
PS2, Line 273:   {MAKE_OPTIONDEF(num_nodes),{0, 1}},
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/21150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccfb7617cc9959f6bc6f1df7c36e47b65972d586
Gerrit-Change-Number: 21150
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 17:51:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12903: Querying virtual column FILE POSITION for TEXT and JSON tables crashes Impala

2024-03-14 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21148 )

Change subject: IMPALA-12903: Querying virtual column FILE__POSITION for TEXT 
and JSON tables crashes Impala
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test
File 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test:

http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test@159
PS1, Line 159: select file__position, year, month, id, date_string_col
nit: could you add a comment that in this test we prune partitions that doesn't 
support virtual columns so the query can run successfully?


http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test
File 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test:

http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test@40
PS1, Line 40: Virtual column FILE__POSITION is not supported
Could you replace some of the FILE__POSITIONS to some other virtual columns 
like INPUT__FILE__NAME?



--
To view, visit http://gerrit.cloudera.org:8080/21148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
Gerrit-Change-Number: 21148
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:58:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12729: Allow creating primary keys for Iceberg tables

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21149 )

Change subject: IMPALA-12729: Allow creating primary keys for Iceberg tables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15511/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bea787acdabd8cb04661f4ddb5c3309af0364a6
Gerrit-Change-Number: 21149
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:57:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12602: Unregister queries on idle timeout

2024-03-14 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21074 )

Change subject: IMPALA-12602: Unregister queries on idle timeout
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/21074/5/be/src/service/impala-server.cc@2880
PS5, Line 2880:
> Is it possible for the session to be closed at this point? Wouldn't this me
Your right, it could have been after releasing the session lock. I'll move this 
to happen within the session lock above.

Locking order could be a little tricky, I'll have to look into that.


http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/21074/3/tests/custom_cluster/test_query_expiration.py@140
PS3, Line 140: 'Invalid or unknown query handle' in str(e)
> Could this be made stricter by checking whether the handle is in [short_tim
Yes, I'll update the asserts to test specific queries.



--
To view, visit http://gerrit.cloudera.org:8080/21074
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacfc285ed3587892c7ec6f7df3b5f71c9e41baf0
Gerrit-Change-Number: 21074
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:46:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12903: Querying virtual column FILE POSITION for TEXT and JSON tables crashes Impala

2024-03-14 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21148 )

Change subject: IMPALA-12903: Querying virtual column FILE__POSITION for TEXT 
and JSON tables crashes Impala
..


Patch Set 1:

(4 comments)

Thanks for the fix.

http://gerrit.cloudera.org:8080/#/c/21148/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21148/1//COMMIT_MSG@19
PS1, Line 19: let
Nit: lets.


http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test
File 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test:

http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test@158
PS1, Line 158:  QUERY
Are these the queries where some files in the table do not support 
FILE_POSITION but after pruning the remaining ones do? You could describe the 
situation in a comment (and possibly include "Regression test for 
IMPALA-12903").


http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test
File 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test:

http://gerrit.cloudera.org:8080/#/c/21148/1/testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test@1
PS1, Line 1: 
Is FILE_POSITION the only virtual column that could cause this bug before this 
change? If there are others, we could have them instead of FILE_POSITION in 
some of these queries (and possibly also in 
virtual-column-file-position-generic.test).


http://gerrit.cloudera.org:8080/#/c/21148/1/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/21148/1/tests/query_test/test_scanners.py@183
PS1, Line 183: parquet
Is it possible to set this to JSON or TEXT? It doesn't affect anything but when 
reading the code 'parquet' could be confusing.



--
To view, visit http://gerrit.cloudera.org:8080/21148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
Gerrit-Change-Number: 21148
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:45:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12903: Querying virtual column FILE POSITION for TEXT and JSON tables crashes Impala

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21148 )

Change subject: IMPALA-12903: Querying virtual column FILE__POSITION for TEXT 
and JSON tables crashes Impala
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15510/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
Gerrit-Change-Number: 21148
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:47:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12729: Allow creating primary keys for Iceberg tables

2024-03-14 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21149


Change subject: IMPALA-12729: Allow creating primary keys for Iceberg tables
..

IMPALA-12729: Allow creating primary keys for Iceberg tables

There are writer engines that use Iceberg's identifier-field-ids from
the Iceberg schema to identify the columns to be written into the
equality delete files (Flink, NiFi). So far Impala wasn't able to
populate this identifier-field-ids. This patch introduces the support
for not enforced primary keys for Iceberg tables, where the primary key
is going to be used for setting identifier-field-ids during Iceberg
schema creation.

Example syntax:
CREATE TABLE ice_tbl (
  i int NOT NULL,
  j int,
  s string NOT NULL
  primary key(i, s) not enforced)
PARTITIONED BY SPEC (truncate(10, s))
STORED AS ICEBERG;

There are some constraints with primary keys (PK) following the
behavior of Flink:
 - Only NOT NULL columns can be in the PK.
 - PK is not allowed in the column definition level like
   'i int NOT NULL PRIMARY KEY'.
 - If the table is partitioned then the partition columns have to be
   part of the PK.
 - Float and double columns are not allowed for the PK.
 - Not allowed to drop a column that is used as a PK.

Testing:
 - New E2E tests added for different table creation scenarios.
 - Manual test to use Nifi for writing into a table with PK.

Change-Id: I7bea787acdabd8cb04661f4ddb5c3309af0364a6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
M fe/src/main/jflex/sql-scanner.flex
M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
13 files changed, 261 insertions(+), 48 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/21149/1
--
To view, visit http://gerrit.cloudera.org:8080/21149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7bea787acdabd8cb04661f4ddb5c3309af0364a6
Gerrit-Change-Number: 21149
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 


[Impala-ASF-CR] IMPALA-12729: Allow creating primary keys for Iceberg tables

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21149 )

Change subject: IMPALA-12729: Allow creating primary keys for Iceberg tables
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21149/1/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/21149/1/fe/src/main/java/org/apache/impala/analysis/TableDef.java@581
PS1, Line 581:   throw new AnalysisException("Partition columns have to 
be part of the primary " +
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/21149/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21149/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3909
PS1, Line 3909: partitionSpec, primaryKeyColumnNames, 
newTable.getOwner(), tableProperties).location();
line too long (103 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7bea787acdabd8cb04661f4ddb5c3309af0364a6
Gerrit-Change-Number: 21149
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:32:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12903: Querying virtual column FILE POSITION for TEXT and JSON tables crashes Impala

2024-03-14 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21148


Change subject: IMPALA-12903: Querying virtual column FILE__POSITION for TEXT 
and JSON tables crashes Impala
..

IMPALA-12903: Querying virtual column FILE__POSITION for TEXT and JSON tables 
crashes Impala

Impala generates segmentation fault when it queries the virtual column
FILE__POSITION for TEXT or JSON tables. When the scanners that do not
support the FILE__POSITION virtual column detect its presence they
try to report an error and close themselves. The segfault is in the
scanners' Close() method when they try to dereference a NULL stream
object.

This patch simply adds NULL-checks in Close().

Alternatively we could detect the presence of FILE__POSITION during
planning in the HdfsScanNode, but doing it in the scanners let us
handle more queries, e.g. queries that dynamically prune partitions
and the surviving partitions all have file formats that support
FILE__POSITION.

Testing:
 * added negative tests to properly report the errors
 * added tests for mixed file format tables

Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
---
M be/src/exec/json/hdfs-json-scanner.cc
M be/src/exec/text/hdfs-text-scanner.cc
M 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-generic.test
A 
testdata/workloads/functional-query/queries/QueryTest/virtual-column-file-position-negative.test
M tests/query_test/test_scanners.py
5 files changed, 88 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/21148/1
--
To view, visit http://gerrit.cloudera.org:8080/21148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e1af8d526f9046aceddb5944da9e6f9c63768b0
Gerrit-Change-Number: 21148
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11938: If NUM NODES is set to a value other than 0 or 1 below error is raised ERROR: Invalid value for query option NUM NODES: is not in range [0, 1]

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21147 )

Change subject: IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 
below error is raisedERROR: Invalid value for query option 
NUM_NODES:  is not in range [0, 1]
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15509/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
Gerrit-Change-Number: 21147
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:26:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11938: If NUM NODES is set to a value other than 0 or 1 below error is raised ERROR: Invalid value for query option NUM NODES: is not in range [0, 1]

2024-03-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21147 )

Change subject: IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 
below error is raisedERROR: Invalid value for query option 
NUM_NODES:  is not in range [0, 1]
..


Patch Set 2:

Thank you for fixing this.
Please also add test in QueryOptions.SetIntOptions. Adding the following line 
in case_set should do it.

{MAKE_OPTIONDEF(num_nodes),{0, 1}},

https://github.com/apache/impala/blob/691604b1d1f0e5f0dc95fdb4976cf826135e08fb/be/src/service/query-options-test.cc#L247-L248


--
To view, visit http://gerrit.cloudera.org:8080/21147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
Gerrit-Change-Number: 21147
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:19:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11938: If NUM NODES is set to a value other than 0 or 1 below error is raised ERROR: Invalid value for query option NUM NODES: is not in range [0, 1]

2024-03-14 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21147

to look at the new patch set (#2).

Change subject: IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 
below error is raisedERROR: Invalid value for query option 
NUM_NODES:  is not in range [0, 1]
..

IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 below error is 
raised

ERROR: Invalid value for query option NUM_NODES:  is not in range [0, 1]

Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
---
M be/src/service/query-options.cc
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/21147/2
--
To view, visit http://gerrit.cloudera.org:8080/21147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
Gerrit-Change-Number: 21147
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11938: If NUM NODES is set to a value other than 0 or 1 below error is raised ERROR: Invalid value for query option NUM NODES: is not in range [0, 1]

2024-03-14 Thread Anonymous Coward (Code Review)
j_ansh...@yahoo.in has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21147


Change subject: IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 
below error is raised   ERROR: Invalid value for query option 
NUM_NODES:  is not in range [0, 1]
..

IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 below error is 
raised
  ERROR: Invalid value for query option NUM_NODES:  is not 
in range [0, 1]

Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
---
M be/src/service/query-options.cc
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/21147/1
--
To view, visit http://gerrit.cloudera.org:8080/21147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
Gerrit-Change-Number: 21147
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 


[Impala-ASF-CR] IMPALA-11938: If NUM NODES is set to a value other than 0 or 1 below error is raised ERROR: Invalid value for query option NUM NODES: is not in range [0, 1]

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21147 )

Change subject: IMPALA-11938: If NUM_NODES is set to a value other than 0 or 1 
below error is raised   ERROR: Invalid value for query option 
NUM_NODES:  is not in range [0, 1]
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21147/1/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/21147/1/be/src/service/query-options.cc@246
PS1, Line 246: option, value, 0, 1, _t_val));
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/21147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib283ae350875a7ac217490c216a26281f1bb4812
Gerrit-Change-Number: 21147
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 16:04:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-14 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 5:

> Patch Set 4: Code-Review+1
>
> (1 comment)

I add more notes at commit message as documentation.
Hope it answer your questions.


--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 15:06:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-14 Thread Riza Suminto (Code Review)
Hello Daniel Becker, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21118

to look at the new patch set (#5).

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..

IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality
estimation when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. It is, however, only
limited to runtime filters that evaluate at the storage layer, such as
partition filter and pushed-down Kudu filter. Row-level runtime filters
that evaluate at scan node will continue using
JoinNode.computeGenericJoinCardinality().

This distinction is because a storage layer filter is applied more
consistently than a row-level filter. For example, a partition filter
evaluate all partition_id and never disabled regardless of its
precision (see HdfsScanNodeBase::PartitionPassesFilters). On the other
hand, scan node can disable a row-level filter later on if it is deemed
ineffective / not precise enough (see
HdfsScanner::CheckFiltersEffectiveness,
LocalFilterStats::enabled_for_row, and min_filter_reject_ratio flag).
For the pushed-down Kudu filter, Impala will rely on Kudu to evaluate
the filter.

Runtime filters can arrive late as well. But for both storage layer
filter and row-level filter, the scan node can stop waiting and start
scanning after runtime_filter_wait_time_ms passed. Scan node will still
evaluate A late runtime filter later on if the scan process is still
ongoing.

Also, note that this cardinality reduction algorithm is based only on
highly selective runtime filters to increase its estimate
confidence (see RuntimeFilter.isHighlySelective()).

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction-on-kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-cardinality-reduction.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 

[Impala-ASF-CR] IMPALA-12264: Add limit on number of HS2 sessions per user.

2024-03-14 Thread Xiang Yang (Code Review)
Xiang Yang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21128 )

Change subject: IMPALA-12264: Add limit on number of HS2 sessions per user.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-hs2-server.cc@491
PS1, Line 491: if (FLAGS_max_hs2_sessions_per_user > 0
maybe we should pull this line at top of the function to avoid unnecessary lock.


http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-hs2-server.cc@492
PS1, Line 492: && load + 1 > FLAGS_max_hs2_sessions_per_user) {
nit:we can replace it to 'load >= FLAGS_max_hs2_sessions_per_user' to avoid 
doing addition.


http://gerrit.cloudera.org:8080/#/c/21128/1/be/src/service/impala-hs2-server.cc@493
PS1, Line 493:   return Status::Expected("Number of sessions for user 
exceeds coordinator limit");
should we log some message here? and should we return the threshold to the 
client?



--
To view, visit http://gerrit.cloudera.org:8080/21128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd28edc352102d89774f6ece5376e7c79ae41aa8
Gerrit-Change-Number: 21128
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Comment-Date: Thu, 14 Mar 2024 14:28:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..

IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table

If hms_event_incremental_refresh_transactional_table is false, then
for non-partitioned ACID tables Impala needs to rely on alter table
event to detect INSERTs in Hive. This patch changes the event processor
to not skip reloading files when processing the alter table event
for this specific type of table (even if the changes in the table
look trivial).

Testing:
- added a simple regression test

Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Reviewed-on: http://gerrit.cloudera.org:8080/21116
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M tests/custom_cluster/test_events_custom_configs.py
2 files changed, 38 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 14:13:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12699: Set timeout for catalog RPCs

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21146 )

Change subject: IMPALA-12699: Set timeout for catalog RPCs
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/15508/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/21146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad39a79d0c89f2b04380f610a7e60558429e9c6e
Gerrit-Change-Number: 21146
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 14 Mar 2024 11:20:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(long eventId, String 
eventType,
> Ack
Shouldn't the 'reason' already include the event type?

I think that ideally 'reason' would also contain the eventId, so no additional 
argument would have to passed just for the sake of logging, but this seems like 
a larger change.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java
File fe/src/main/java/org/apache/impala/util/DebugUtils.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java@77
PS2, Line 77:   // debug action label for mock that metastore returns 
partitions with empty values
Can you add a comment about the Jira to explain why this error injection is 
added?


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS2, Line 212: if 
(DebugUtils.hasDebugAction(BackendConfig.INSTANCE.debugActions(),
Can you add a comment about the Jira to explain why this error injection is 
added?


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3757
PS2, Line 3757: testEmptyPartitionValues
Can you add a comment about being the regression test for the Jira?


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3837
PS2, Line 3837:   
BackendConfig.INSTANCE.setDebugActions(DebugUtils.MOCK_EMPTY_PARTITION_VALUES);
Is this needed here? Doesn't have an affect only when Impala processes the 
event or reloads the table?
It also doesn't seem intuitive that this function has the side effect of 
changing the backed flags.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3847
PS2, Line 3847: BackendConfig.INSTANCE.setDebugActions(prevFlag);
It doesn't seem intuitive to me that we reset this here.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 11:10:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12699: Set timeout for catalog RPCs

2024-03-14 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21146


Change subject: IMPALA-12699: Set timeout for catalog RPCs
..

IMPALA-12699: Set timeout for catalog RPCs

We have seen trivial GetPartialCatalogObject RPCs hanging in coordinator
side, e.g. IMPALA-11409. Due to the piggyback mechanism of fetching
metadata in local-catalog mode (see comments in
CatalogdMetaProvider#loadWithCaching()), a hanging RPC on shared
metadata (e.g. db/table list) could block other queries on the same
coordinator.

Such lightweight requests don't need to acquire table lock or trigger
table loading in catalogd. The causes of the hanging are usually
network issues, e.g. TCP connection become half open due to TCP
retransmissions timed out. A retry on the RPC helps to recover from such
failures. Currently, the timeout for catalog RPC is set to 0 by default.
This prevent the retry and let the client to wait infinitely.

This patch distinguishes the lightweight catalog RPCs and uses a
dedicated catalogd client cache for them. They use a timeout of 30 mins
which is longer enough to tolerate TCP retransmission timeouts.
Also sets a timeout of 10 hours for other catalog RPCs. Operations take
longer than that are usually abnormal and hanging.

Tests
 - Add e2e test to verify the lightweight RPC client cache is used.

This is a continuation of patch by Wenzhe Zhou 

Change-Id: Iad39a79d0c89f2b04380f610a7e60558429e9c6e
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/client-cache.cc
M be/src/runtime/client-cache.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M common/thrift/metrics.json
M tests/custom_cluster/test_local_catalog.py
7 files changed, 80 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/21146/1
--
To view, visit http://gerrit.cloudera.org:8080/21146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad39a79d0c89f2b04380f610a7e60558429e9c6e
Gerrit-Change-Number: 21146
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality

2024-03-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21118 )

Change subject: IMPALA-12881: Use getFkPkJoinCardinality to reduce scan 
cardinality
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21118/3//COMMIT_MSG@19
PS3, Line 19: hed-down Kudu filter, to avoid severe
: underestimation. Runtime filters that evaluate at scan node at row
> In IMPALA-12018, I don't make distinction on whether a runtime filter still
>It looks like row-level runtime filter is either late or disabled by scan node 
>due to ineffectiveness.

about being late: does Impala always wait for filters on partitions? + same 
question for Kudu filters

disabling due to ineffectiveness seems strange, as if the filter if dropping a 
lot of rows, then it shouldn't be disabled, while if it doesn't drop many rows, 
then we have overestimated its selectiveness, which should be a problem for 
partition filters too

I don't want to block this change with these concerns, but more explanation 
could be added about the decision and the non-partitioned case probably needs 
more investigation



--
To view, visit http://gerrit.cloudera.org:8080/21118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Gerrit-Change-Number: 21118
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 14 Mar 2024 10:44:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 09:22:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10376/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 09:22:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 09:01:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 5:

There are some broken tests when 
hms_event_incremental_refresh_transactional_table=false
Created tickets IMPALA-12895 and IMPALA-12902

It doesn't seem easy to fix these issue - it may make more sense to remove 
hms_event_incremental_refresh_transactional_table as we cannot guarantee 
correct behavior when it is false.


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 08:50:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 9: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1192
PS3, Line 1192:* @param events List of NotificationEvents
I am ok with merging the current state, but I find it pretty confusing. Doing 
this in a synchronized way may not be really necessary but would make to logic 
easier to understand.

> and the recent 10 failures

I think that storing at least 1 failure is the most critical as generally there 
should be 0 failures.

A few more thoughts for storing failures:
- after IMPALA-12832 this part of the code would not see table level failures 
that were solved with invalidating the table. It would be nice to store those 
too
- besides storing the event the call stack would be also useful


http://gerrit.cloudera.org:8080/#/c/20986/9/www/events.tmpl
File www/events.tmpl:

http://gerrit.cloudera.org:8080/#/c/20986/9/www/events.tmpl@57
PS9, Line 57: synced
Maybe processed would be clearer?



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 08:09:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12362: (part-1/4) Refactor service management scripts.

2024-03-14 Thread Zihao Ye (Code Review)
Zihao Ye has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20921 )

Change subject: IMPALA-12362: (part-1/4) Refactor service management scripts.
..


Patch Set 10:

(2 comments)

Very useful feature, thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/20921/10/package/bin/impala.sh
File package/bin/impala.sh:

http://gerrit.cloudera.org:8080/#/c/20921/10/package/bin/impala.sh@67
PS10, Line 67: kill $pid
It would be better if we had a 'wait_for_stop' called here to ensure that the 
process has indeed ended before we proceed.
Something like this maybe work:
  while kill -0 $pid > /dev/null 2>&1; do
sleep 1
  done


http://gerrit.cloudera.org:8080/#/c/20921/10/package/bin/impala.sh@120
PS10, Line 120: sleep 1
The original script had a 'wait_for_ready' following this to wait for the 
completion of the service startup. It now seems to have been replaced with 
'health'. Should we also call 'health' here to ensure the service has started 
properly?



--
To view, visit http://gerrit.cloudera.org:8080/20921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f4dcad9cfa12d351d562e7ef8c0a8957d3ca147
Gerrit-Change-Number: 20921
Gerrit-PatchSet: 10
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 14 Mar 2024 07:39:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-14 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 16:

(1 comment)

Overall looks good to me. Just have a minor comment. Thanks for adding the 
tests!

http://gerrit.cloudera.org:8080/#/c/21029/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21029/16/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3429
PS16, Line 3429:   modification.registerInflightEvent();
This will introduce dangling version if !isTableBeingReplicated and 
!params.isDelete_stats(). In such case, no HMS operations will be invoked.



--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 16
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 07:08:15 +
Gerrit-HasComments: Yes