[Impala-ASF-CR] IMPALA-12906: Incorporate scan range information into the tuple cache key

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21541 )

Change subject: IMPALA-12906: Incorporate scan range information into the tuple 
cache key
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16404/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21541
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe298fff0f644ce931a2aa934ebb98f69aab9d34
Gerrit-Change-Number: 21541
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 20 Jun 2024 05:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12906: Incorporate scan range information into the tuple cache key

2024-06-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21541


Change subject: IMPALA-12906: Incorporate scan range information into the tuple 
cache key
..

IMPALA-12906: Incorporate scan range information into the tuple cache key

This change is accomplishing two things:
1. It incorporates scan range information into the tuple
   cache key.
2. It reintroduces deterministic scheduling as an option
   for mt_dop and uses it for HdfsScanNodes that feed
   into a TupleCacheNode.

The combination of these two things solves several problems:
1. When a table is modified, the list of scan ranges will
   change, and this will naturally change the cache keys.
2. When accessing a partitioned table, two queries may have
   different predicates on the partition columns. Since the
   predicates can be satisfied via partition pruning, they are
   not included at runtime. This means that two queries
   may have identical compile-time keys with only the scan
   ranges being different due to different partition pruning.
3. Each fragment instance processes different scan ranges, so
   each will have a unique cache key.

To incorporate scan range information, this introduces a new
per-fragment-instance cache key. At compile time, the planner
now keeps track of which HdfsScanNodes feed into a TupleCacheNode.
This is passed over to the runtime as a list of plan node ids
that contain scan ranges. At runtime, the fragment instance
walks through the list of plan nodes ids and hashes any scan ranges
associated with them. This hash is the per-fragment-instance
cache key. The combination of the compile-time cache key produced
by the planner and the per-fragment-instance cache key is a unique
identifier of the result.

Deterministic scheduling for mt_dop was removed via IMPALA-9655
with the introduction of the shared queue. This revives some of
the pre-IMPALA-9655 scheduling logic as an option. Since the
TupleCacheNode knows which HdfsScanNodes feed into it, the
TupleCacheNode turns on deterministic scheduling for all of those
HdfsScanNodes. Since this only applies to HdfsScanNodes that feed
into a TupleCacheNode, it means that any HdfsScanNode that doesn't
feed into a TupleCacheNode continues using the current algorithm.

Testing:
 - Added custom cluster tests for the scan range information
   including modifying a table, selecting from a partitioned
   table, and verifying that fragment instances have unique
   keys
 - Added basic frontend test to verify that deterministic scheduling
   gets set on the HdfsScanNode that feed into the TupleCacheNode.

Change-Id: Ibe298fff0f644ce931a2aa934ebb98f69aab9d34
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/tuple-cache-node.cc
M be/src/exec/tuple-cache-node.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/common/ThriftSerializationCtx.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
M fe/src/main/java/org/apache/impala/planner/TupleCacheNode.java
M fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
M tests/custom_cluster/test_tuple_cache.py
15 files changed, 514 insertions(+), 50 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/21541/1
--
To view, visit http://gerrit.cloudera.org:8080/21541
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibe298fff0f644ce931a2aa934ebb98f69aab9d34
Gerrit-Change-Number: 21541
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-13169: Specify cluster id before starting HiveServer2

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21540 )

Change subject: IMPALA-13169: Specify cluster id before starting HiveServer2
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d07ec01a04f8123b7ccca676ce744ac485f167c
Gerrit-Change-Number: 21540
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 20 Jun 2024 04:45:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 2:

(2 comments)

looks good to me.

http://gerrit.cloudera.org:8080/#/c/21539/2/docs/topics/impala_jdbc_external_table.xml
File docs/topics/impala_jdbc_external_table.xml:

http://gerrit.cloudera.org:8080/#/c/21539/2/docs/topics/impala_jdbc_external_table.xml@157
PS2, Line 157:
add a line break before "complex data types".


http://gerrit.cloudera.org:8080/#/c/21539/2/docs/topics/impala_jdbc_external_table.xml@157
PS2, Line 157: types
data types



--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 2
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jankiram Balakrishnan 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 20 Jun 2024 04:32:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/771/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 2
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jankiram Balakrishnan 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 20 Jun 2024 03:52:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Jankiram Balakrishnan (Code Review)
Jankiram Balakrishnan has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 2:

(14 comments)

Thanks for your feedback, Wenzhe. I have incorporated your comments and 
published another patch.

http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml
File docs/topics/impala_jdbc_external_table.xml:

http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@22
PS1, Line 22: Using Impala to Query External JDBC Data Sour
> Change title as "Using Impala to Query External JDBC Data Sources"
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@40
PS1, Line 40: external JDBC
> external JDBC table
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@41
PS1, Line 41: EXTERNAL T
> key word 'EXTERNAL' is mandatory for creating JDBC table, remove square bra
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@70
PS1, Line 70: 2/ change to default Postgres port: 5432
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@93
PS1, Line 93: 50/<
> change to Impala default HS2 port: 21050
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@98
PS1, Line 98:
> add a line break
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@107
PS1, Line 107:
> or IMPALA
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@109
PS1, Line 109: codeph>:
> hostname/IP address
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@129
PS1, Line 129: f the Ja
> change to URI
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@142
PS1, Line 142:
> an Impala external
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@154
PS1, Line 154: 
> Impala external
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@156
PS1, Line 156: _5bc">
 : Following column types are not 
support
> Following column data types are not supported:
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@194
PS1, Line 194:
> replace it with one space
Done


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@211
PS1, Line 211: eading extern
> external JDBC
Done



--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 2
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jankiram Balakrishnan 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 20 Jun 2024 03:47:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/771/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 2
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 20 Jun 2024 03:44:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Jankiram Balakrishnan (Code Review)
Hello Abhishek Rawat, gaurav singh, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21539

to look at the new patch set (#2).

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..

IMPALA-12754: [DOCS] External JDBC table support

Created the docs for Impala external JDBC table support

Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
---
M docs/impala.ditamap
A docs/topics/impala_jdbc_external_table.xml
2 files changed, 230 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21539/2
--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 2
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21525 )

Change subject: IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16403/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
Gerrit-Change-Number: 21525
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 20 Jun 2024 03:25:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

2024-06-19 Thread Steve Carlin (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21525

to look at the new patch set (#3).

Change subject: IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)
..

IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

The analyze method is now called after the Expr is constructed.

This code is more in line with the existing way that Impala
constructs the Expr object.

Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
---
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/AnalyzedFunctionCallExpr.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/AnalyzedNullLiteral.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexCallConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexLiteralConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/CreateExprVisitor.java
5 files changed, 38 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/21525/3
--
To view, visit http://gerrit.cloudera.org:8080/21525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
Gerrit-Change-Number: 21525
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21525 )

Change subject: IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10741/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
Gerrit-Change-Number: 21525
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 20 Jun 2024 03:01:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml
File docs/topics/impala_jdbc_external_table.xml:

http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@22
PS1, Line 22: Support for Impala external JDBC data sources
Change title as "Using Impala to Query External JDBC Data Sources"


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@40
PS1, Line 40: external table
external JDBC table


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@41
PS1, Line 41: [EXTERNAL]
key word 'EXTERNAL' is mandatory for creating JDBC table, remove square brackets


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@70
PS1, Line 70: port
change to default Postgres port: 5432


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@93
PS1, Line 93: port
change to Impala default HS2 port: 21050


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@98
PS1, Line 98:
add a line break


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@107
PS1, Line 107: IMPALA
or IMPALA


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@109
PS1, Line 109: hostname
hostname/IP address


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@129
PS1, Line 129: Location
change to URI


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@142
PS1, Line 142: an external
an Impala external


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@154
PS1, Line 154: external
Impala external


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@156
PS1, Line 156: Unsupported column data types include char, varchar, binary, 
complex
 : data types - struct, map, array, and nested 
type
Following column data types are not supported:
char, varchar, binary,
complex data types - struct, map, array, and nested type


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@194
PS1, Line 194:
replace it with one space


http://gerrit.cloudera.org:8080/#/c/21539/1/docs/topics/impala_jdbc_external_table.xml@211
PS1, Line 211: JDBC external
external JDBC



--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 1
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 20 Jun 2024 00:42:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13169: Specify cluster id before starting HiveServer2

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21540 )

Change subject: IMPALA-13169: Specify cluster id before starting HiveServer2
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16402/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d07ec01a04f8123b7ccca676ce744ac485f167c
Gerrit-Change-Number: 21540
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 20 Jun 2024 00:01:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13169: Specify cluster id before starting HiveServer2

2024-06-19 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21540


Change subject: IMPALA-13169: Specify cluster id before starting HiveServer2
..

IMPALA-13169: Specify cluster id before starting HiveServer2

After HIVE-28324, in order to start HiveServer2, it is required that
the cluster id has to be passed to HiveServer2, either via the
environment variable 'HIVE_CLUSTER_ID', or the command line Java
property 'hive.cluster.id'. This patch exports HIVE_CLUSTER_ID before
starting HiveServer2.

Testing:
 - Manually verified that a HiveServer2 including HIVE-28324 could be
   started after this patch.
 - Verified that this patch passed the core tests.

Change-Id: I9d07ec01a04f8123b7ccca676ce744ac485f167c
---
M testdata/bin/run-hive-server.sh
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/21540/1
--
To view, visit http://gerrit.cloudera.org:8080/21540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d07ec01a04f8123b7ccca676ce744ac485f167c
Gerrit-Change-Number: 21540
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 


[Impala-ASF-CR] IMPALA-13169: Specify cluster id before starting HiveServer2

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21540 )

Change subject: IMPALA-13169: Specify cluster id before starting HiveServer2
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10740/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d07ec01a04f8123b7ccca676ce744ac485f167c
Gerrit-Change-Number: 21540
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 Jun 2024 23:37:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16401/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 11
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Wed, 19 Jun 2024 20:02:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-19 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..

IMPALA-12216: Print timestamp for impala-shell errors

This change will print timestamp of an exception or warning
occurred during execution of a query via impala-shell.

Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
---
M shell/impala_client.py
M shell/impala_shell.py
2 files changed, 42 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21426/11
--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 11
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-12940: Added filtering capability for Calcite planner

2024-06-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21498 )

Change subject: IMPALA-12940: Added filtering capability for Calcite planner
..

IMPALA-12940: Added filtering capability for Calcite planner

The Filter RelNode is now handled in the Calcite planner.

The parsing and analysis is done by Calcite so there were no
changes added to that portion. The ImpalaFilterRel class was
created to handled the conversion of the Calcite LogicalFilter
to create a filter condition within the Impala plan nodes.

There is no explicit filter plan node in Impala. Instead, the
filter condition attaches itself to an existing plan node. The
filter condition gets passed into the children plan nodes through
the ParentPlanRelContext.

The ExprConjunctsConverter class is responsible for creating the
filter Expr list that is used. The list contains separate AND
conditions that are on the top level.

Change-Id: If104bf1cd801d5ee92dd7e43d398a21a18be5d97
Reviewed-on: http://gerrit.cloudera.org:8080/21498
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
Reviewed-by: Csaba Ringhofer 
---
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/FunctionResolver.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexCallConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ConvertToImpalaRelRules.java
A 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaFilterRel.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java
A 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/ExprConjunctsConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java
M testdata/workloads/functional-query/queries/QueryTest/calcite.test
10 files changed, 275 insertions(+), 13 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified
  Csaba Ringhofer: Looks good to me, but someone else must approve

--
To view, visit http://gerrit.cloudera.org:8080/21498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If104bf1cd801d5ee92dd7e43d398a21a18be5d97
Gerrit-Change-Number: 21498
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10739/


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:54:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13077: Fix selectivity estimation for SEMI JOIN

2024-06-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21516 )

Change subject: IMPALA-13077: Fix selectivity estimation for SEMI JOIN
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@747
PS4, Line 747: lhsDivisor
Doesn't having 1 here (and in other formulas) mean that the selectivity will be 
>1 most of the times? That doesn't make sense for a semi join. I think that it 
would be clearer to skip the predicate in these cases.

An idea to simplify this is to move earlier to initialize variables like 
isAntiJoin, inputNdv, filterNdv and "continue" if inputNdv is -1.


http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test:

http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test@486
PS4, Line 486: where ss_sold_date_sk=(
 :   select min(d_date_sk) + 1000 from tpcds.date_dim)
This doesn't look a like a perfect test for me as it could be also fixed by 
setting the ndv of the scalar subquery. If such a fix goes in then it will no 
longer verify the current patch.



--
To view, visit http://gerrit.cloudera.org:8080/21516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c799df535d764c3f87ededef1c48eaa103293a0
Gerrit-Change-Number: 21516
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:44:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12869: Shorten argument list to enable local catalog mode

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21443 )

Change subject: IMPALA-12869: Shorten argument list to enable local catalog mode
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16400/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21443
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab5f3613b39a8473ba1f6ab7cb2634d87b808142
Gerrit-Change-Number: 21443
Gerrit-PatchSet: 3
Gerrit-Owner: Anshula Jain 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:36:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12940: Added filtering capability for Calcite planner

2024-06-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21498 )

Change subject: IMPALA-12940: Added filtering capability for Calcite planner
..


Patch Set 2: Code-Review+1

I also went through the patch, lgtm


--
To view, visit http://gerrit.cloudera.org:8080/21498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If104bf1cd801d5ee92dd7e43d398a21a18be5d97
Gerrit-Change-Number: 21498
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:18:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12869: Shorten argument list to enable local catalog mode

2024-06-19 Thread Anshula Jain (Code Review)
Anshula Jain has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21443 )

Change subject: IMPALA-12869: Shorten argument list to enable local catalog mode
..

IMPALA-12869: Shorten argument list to enable local catalog mode

Currently, a longer command is used to enable local catalog mode
bin/start-impala-cluster.py --catalogd_args=--catalog_topic_mode=minimal 
--impalad_args=--use_local_catalog

With this JIRA we shorten the command to
bin/start-impala-cluster.py --use_local_catalog

Change-Id: Iab5f3613b39a8473ba1f6ab7cb2634d87b808142
---
M bin/start-impala-cluster.py
1 file changed, 29 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21443/3
--
To view, visit http://gerrit.cloudera.org:8080/21443
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iab5f3613b39a8473ba1f6ab7cb2634d87b808142
Gerrit-Change-Number: 21443
Gerrit-PatchSet: 3
Gerrit-Owner: Anshula Jain 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12869: Shorten argument list to enable local catalog mode

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21443 )

Change subject: IMPALA-12869: Shorten argument list to enable local catalog mode
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21443/3/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/21443/3/bin/start-impala-cluster.py@203
PS3, Line 203: #Adding this to shorten the command to enable local catalog mode 
IMPALA-12869
flake8: E265 block comment should start with '# '


http://gerrit.cloudera.org:8080/#/c/21443/3/bin/start-impala-cluster.py@204
PS3, Line 204:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21443/3/bin/start-impala-cluster.py@204
PS3, Line 204: parser.add_option("--use_local_catalog", 
dest="use_local_catalog",
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/21443
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab5f3613b39a8473ba1f6ab7cb2634d87b808142
Gerrit-Change-Number: 21443
Gerrit-PatchSet: 3
Gerrit-Owner: Anshula Jain 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:12:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:02:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/770/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 1
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 19 Jun 2024 15:22:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Jankiram Balakrishnan (Code Review)
Jankiram Balakrishnan has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21539


Change subject: IMPALA-12754: [DOCS] External JDBC table support
..

IMPALA-12754: [DOCS] External JDBC table support

Created the docs for Impala external JDBC table support

Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
---
M docs/impala.ditamap
A docs/topics/impala_jdbc_external_table.xml
2 files changed, 231 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/21539/1
--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 1
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12754: [DOCS] External JDBC table support

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21539 )

Change subject: IMPALA-12754: [DOCS] External JDBC table support
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/770/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/21539
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5360389037ae9ee675ab406d87617d55d476bf8f
Gerrit-Change-Number: 21539
Gerrit-PatchSet: 1
Gerrit-Owner: Jankiram Balakrishnan 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 19 Jun 2024 15:14:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12732: Add support for MERGE statements for Iceberg tables

2024-06-19 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21423 )

Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg 
tables
..


Patch Set 7:

(18 comments)

I managed to finish the first iteration on the code. Thanks for all this 
effort, Peter!

http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.h
File be/src/exec/iceberg-merge-node.h:

http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.h@61
PS7, Line 61:   ScalarExpr* row_present_ = nullptr;
nit: this section below would be more readable with linebreaks included.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.h@103
PS7, Line 103:   Status EvaluateCases(RowBatch* output_batch);
nit: without linebreaks this part of the class lacks any structure and it's not 
that easy to read.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.h@148
PS7, Line 148:   std::vector filter_conjuncts_;
nit: I believe that the order of the members is constants first, then functions 
then variables


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.h@177
PS7, Line 177: return TIcebergMergeSinkAction::BOTH;
DCHECK for type_ == BOTH?


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc
File be/src/exec/iceberg-merge-node.cc:

http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@43
PS7, Line 43:   auto& tmerge_cases = tnode.merge_node.cases;
No need to create a variable for this, only used once. We can use the 
tnode.merge_node.cases directly in the for loop.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@48
PS7, Line 48: RETURN_IF_ERROR(merge_case_plan->Init(tmerge_case, state, 
row_descriptor_));
Since IcebergMergeCasePlane is not an inherited class, I think we can get rid 
of the Init() fn and use the constructor for initialization. Additionally, we 
might be able to make the member variables const if populated in the 
constructor.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@67
PS7, Line 67: RowDescriptor* row_desc
row_desc seems a simple 'in' parameter. should be const ref


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@92
PS7, Line 92:   DCHECK(pnode.merge_action_tuple_id_ != -1);
nit: some linebreaks would be nice to increase readability with the structure.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@135
PS7, Line 135:   RETURN_IF_ERROR(child(0)->Open(state));
Apparnetly, in Preapre() we didn't have to call Prepare for the child, bit in 
Open we do have to delegate it to the child. Could you help me understand, why?


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@206
PS7, Line 206: { continue; }
nit: no need for the braces


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@217
PS7, Line 217:   TupleRow* dst_row = output_batch->GetRow(dst_row_idx);
You can merge this line with the one above


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-node.cc@226
PS7, Line 226:   TIcebergMergeSinkAction::type action = 
merge_case->SinkAction();
You declare this too early. Even no need for a variable since the expression is 
short enough, you can use 'merge_case->SinkAction()' at L242.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc
File be/src/exec/iceberg-merge-sink.cc:

http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@43
PS7, Line 43:   DCHECK(tsink.child_data_sinks[1].table_sink.action == 
TSinkAction::DELETE);
nit: linebreak after the last DCHECK pls


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@48
PS7, Line 48:   const TDataSink& delete_sink = tsink.child_data_sinks[1];
nit: linebreak before this line to separate the code between insert_sink and 
delete_sink


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@55
PS7, Line 55:   auto merge_action = tsink.output_exprs[0];
this could be merged into the line below. it'd still fit into one line


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@81
PS7, Line 81:   profile()->AddChild(insert_sink_->profile());
I recall that in some use-cases it's not neccessary to create both sinks. 
Should we check for nullness here for the sinks? I see other places too where 
we call functions on the sink pointers without checking for nullness.


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@87
PS7, Line 87:   merge_action_evaluator_ = output_expr_evals_[0];
we could do this in the constructor


http://gerrit.cloudera.org:8080/#/c/21423/7/be/src/exec/iceberg-merge-sink.cc@108
PS7, Line 108: auto index = delete_rows.AddRow();
 :   auto 

[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16399/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 14:09:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13168: Add README file for setting up Trino

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21538 )

Change subject: IMPALA-13168: Add README file for setting up Trino
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/16398/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/21538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fea891074223475a57c8f49f788924a0929b12
Gerrit-Change-Number: 21538
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 14:06:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16397/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 14:02:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Anonymous Coward (Code Review)
cclive1...@gmail.com has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..

IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

The description of events-skipped metric is wrong. Some cases in Add partition
event ,the metric will also be increased, besides for some other cases like 
alter
partition the event is skipped and the log is printed but the events-skipped 
metric
is not increased.

Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
4 files changed, 177 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/21045/11
--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21045/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21045/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2858
PS11, Line 2858:   FileMetadataLoadOpts.FORCE_LOAD, 
getEventType().toString() + " event", true);
line too long (95 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 13:46:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13168: Add README file for setting up Trino

2024-06-19 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21538


Change subject: IMPALA-13168: Add README file for setting up Trino
..

IMPALA-13168: Add README file for setting up Trino

The Impala repository contains scripts that make it easy to set up Trino
in the development environment. This commit adds the TRINO-README.md
file that describes how they can be used.

Change-Id: Ic9fea891074223475a57c8f49f788924a0929b12
---
A testdata/bin/TRINO-README.md
1 file changed, 25 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/21538/2
--
To view, visit http://gerrit.cloudera.org:8080/21538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic9fea891074223475a57c8f49f788924a0929b12
Gerrit-Change-Number: 21538
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Anonymous Coward (Code Review)
cclive1...@gmail.com has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..

IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

The description of events-skipped metric is wrong. Some cases in Add partition
event ,the metric will also be increased, besides for some other cases like 
alter
partition the event is skipped and the log is printed but the events-skipped 
metric
is not increased.

Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
4 files changed, 177 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/21045/10
--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21045/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21045/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2862
PS10, Line 2862:   FileMetadataLoadOpts.FORCE_LOAD, 
getEventType().toString() + " event", true);
line too long (95 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 13:39:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10739/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 11:51:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16396/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 11:43:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-19 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 9: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 11:25:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-19 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21435/8/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/8/be/src/exec/iceberg-delete-builder.cc@319
PS8, Line 319: )
> Now that this function doesn't return void, I think it helps readability to
Done


http://gerrit.cloudera.org:8080/#/c/21435/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test:

http://gerrit.cloudera.org:8080/#/c/21435/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test@5
PS8, Line 5: UNION node which behaves
   : # differently
> Does it mean that the result will be different? Or only that we can't asser
The results can be different because how rand() is being used in the fragment 
instance.

I haven't insvestigated all the details, but we get different results when both 
data files are scanned in the same fragment VS the data files are scheduled on 
different nodes.



--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 11:18:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-19 Thread Zoltan Borok-Nagy (Code Review)
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21435

to look at the new patch set (#9).

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..

IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

When there are lots of delete records the IcebergDeleteBuilder can
become a bottleneck. Since the left side of the JOIN is blocked on
the build side any improvement we make here significantly improves
Iceberg V2 table scanning.

Improvements of this patch:

* Use a vector of vectors to collect the position delete records.
  This way we can avoid large re-allocations and copyings.
* Insert large ranges from the build batches into the collected
  delete records instead of doing it one-by-one.

Measurements

Local measurement with 824 Million position delete records:
JOIN BUILD: ~32s -> ~14s (6s is the final sorting)

40-node cluster with 68.5 Billion position delete records:
JOIN BUILD: 4m15s -> 1m45s (1m7s is the final sorting)

Parallelization of the final sort will be added in a follow-up CR.

Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
A testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test
M tests/query_test/test_scanners.py
5 files changed, 164 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21435/9
--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10738/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 10:57:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-19 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..

IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

Before this commit, only read support was implemented
(convert_kudu_utc_timestamps=true). This change adds write support:
if write_kudu_utc_timestamps=true, then timestamps are converted
from local time to UTC during DMLs to Kudu. In case of
ambiguous conversions (DST changes) the earlier possible UTC
timestamp is written.

All DMLs supported with Kudu tables are affected:
INSERT, UPSERT, UPDATE, DELETE

To be able to read back Kudu tables written by Impala correctly
convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
have the same value. Having the same value in the two query option
is also critical for UPDATE/DELETE if the primary key contains a
timestamp column - these operations do a scan first (affected by
convert_kudu_utc_timestamps) and then use the keys from the scan to
select updated/deleted rows (affected by write_kudu_utc_timestamps).

The conversion is implemented by adding to_utc_timestamp() to inserted
timestamp expressions during planning. This allows doing the same
conversion during the pre-insert sorting and partitioning.
Read support is implemented differently - in that case the plan is not
changed and the scanner does the conversion.

Other changes:
- Before this change, verification of tests with TIMESTAMP results
  were skipped when the file format is Kudu. This shouldn't be
  necessary so the skipping was removed.

Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Reviewed-on: http://gerrit.cloudera.org:8080/21492
Tested-by: Impala Public Jenkins 
Reviewed-by: Peter Rozsa 
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
17 files changed, 391 insertions(+), 42 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Peter Rozsa: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-19 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 9: Code-Review+2

Thank you, Csaba!


--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Wed, 19 Jun 2024 10:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-19 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 9: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 09:32:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-19 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 8: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21435/8/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/8/be/src/exec/iceberg-delete-builder.cc@319
PS8, Line 319: )
Now that this function doesn't return void, I think it helps readability to add 
annotation for the return value "-> Status".


http://gerrit.cloudera.org:8080/#/c/21435/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test:

http://gerrit.cloudera.org:8080/#/c/21435/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-query-v2.test@5
PS8, Line 5: UNION node which behaves
   : # differently
Does it mean that the result will be different? Or only that we can't assert 
specific values in the profile?



--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 Jun 2024 09:32:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-06-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..


Patch Set 20:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16395/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 19 Jun 2024 09:18:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-06-19 Thread Anonymous Coward (Code Review)
Hello Quanlong Huang, cclive1...@gmail.com, Sai Hemanth Gantasala, Csaba 
Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21031

to look at the new patch set (#20).

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..

IMPALA-12709: Add support for hierarchical metastore event processing

At present, metastore event processor is single threaded. Notification
events are processed sequentially with a maximum limit of 1000 events
fetched and processed in a single batch. Multiple locks are used to
address the concurrency issues that may arise when catalog DDL
operation processing and metastore event processing tries to
access/update the catalog objects concurrently. Waiting for a lock or
file metadata loading of a table can slow the event processing and can
affect the processing of other events following it. Those events may
not be dependent on the previous event. Altogether it takes a very
long time to synchronize all the HMS events.

Existing metastore event processing is turned into multi-level
event processing with enable_hierarchical_event_processing flag. It
is not enabled by default. Idea is to segregate the events based on
their dependency, maintain the order of events as they occur within
the dependency and process them independently as much as possible:
1. All the events of a table are processed in the same order they
   have occurred.
2. Events of different tables are processed in parallel.
3. When a database is altered, all the table events related to
   the database that occurred after the alter db event are processed
   only after the alter database event is processed.

Added a new hms_event_polling_interval_ms flag to support millisecond
precision event polling interval since the current flag in seconds.

Testing:
 - Executed existing end to end tests.
 - Added end-to-end test with enable_hierarchical_event_processing.
 - Added event processing performance tests. They are marked to skip.

Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M be/src/util/event-metrics.cc
M be/src/util/event-metrics.h
M common/thrift/BackendGflags.thrift
M common/thrift/JniCatalog.thrift
M common/thrift/metrics.json
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
A fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java
A fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_event_processing_perf.py
M tests/custom_cluster/test_events_custom_configs.py
M tests/util/event_processor_utils.py
30 files changed, 1,822 insertions(+), 97 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/21031/20
--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala