[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
statestore restarting by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Reviewed-on: http://gerrit.cloudera.org:8080/21520
Reviewed-by: Michael Smith 
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 160 insertions(+), 31 deletions(-)

Approvals:
  Michael Smith: Looks good to me, but someone else must approve
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 Jun 2024 03:27:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10727/


--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 18 Jun 2024 02:47:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21525 )

Change subject: IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16378/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
Gerrit-Change-Number: 21525
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 18 Jun 2024 00:46:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

2024-06-17 Thread Steve Carlin (Code Review)
Steve Carlin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21525


Change subject: IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)
..

IMPALA-13136: Refactor AnalyzedFunctionCallExpr (for Calcite)

The analyze method is now called after the Expr is constructed.

This code is more in line with the existing way that Impala
constructs the Expr object.

Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
---
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/AnalyzedFunctionCallExpr.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/AnalyzedNullLiteral.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexCallConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexLiteralConverter.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/CreateExprVisitor.java
5 files changed, 17 insertions(+), 45 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/21525/1
--
To view, visit http://gerrit.cloudera.org:8080/21525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ideb662d9c7536659cb558bf62baec29c82217aa2
Gerrit-Change-Number: 21525
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10728/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 22:23:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 22:10:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10726/


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 21:45:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10727/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 21:33:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test@75
PS6, Line 75:
> Have you look into this extra whitespace?
Forgot to answer this one: yes, I looked into it and the test failed without it



--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 21:33:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16377/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 21:18:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@680
PS10, Line 680:   self.wait_for_state(handle, QueryState.RUNNING, 120, 
client)
> It's timeout for query to reach RUNNING state. Normally query should reach
Ack


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@710
PS10, Line 710:   assert (not 
statestore_service_0.get_metric_value("statestore.active-status")), \
> Changed to wait_for_metric_value
Ack


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@717
PS10, Line 717:   self.wait_for_state(handle, QueryState.RUNNING, 120, 
client)
> changed to 120
Ack


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@736
PS10, Line 736:   self.wait_for_state(handle, QueryState.EXCEPTION, 
timeout_s, client)
> changed to use wait_for_state(EXCEPTION)
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 20:57:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 12: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 20:56:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
statestore restarting by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 160 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/12
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21501 )

Change subject: IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Gerrit-Change-Number: 21501
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Mon, 17 Jun 2024 19:38:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21501 )

Change subject: IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
..

IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

In StringVal::CopyFrom(), we take the 'len' parameter as a size_t, which
is usually a 64-bit unsigned integer. We pass it to the constructor of
StringVal, which takes it as an int, which is usually a 32-bit signed
integer. The constructor then allocates memory for the length using the
int value, but afterwards in CopyFrom(), we copy the buffer with the
size_t length. If size_t is indeed 64 bits and int is 32 bits, and the
value is truncated, we may copy more bytes that what we have allocated
for the destination.

Note that in the constructor of StringVal it is checked whether the
length is greater than 1GB, but if the value is truncated because of the
type conversion, the check doesn't necessarily catch it as the truncated
value may be small.

This change fixes the problem by doing the length check with 64 bit
integers in StringVal::CopyFrom().

Testing:
 - added unit tests for StringVal::CopyFrom() in udf-test.cc.

Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Reviewed-on: http://gerrit.cloudera.org:8080/21501
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/udf/udf-test.cc
M be/src/udf/udf.cc
M be/src/udf/udf.h
3 files changed, 89 insertions(+), 17 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Gerrit-Change-Number: 21501
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@680
PS10, Line 680:   self.wait_for_state(handle, QueryState.RUNNING, 1000)
This waits for 1000 seconds? That seems excessive. I'd try 10s to handle slower 
test environments.


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@710
PS10, Line 710:   assert (not 
statestore_service_0.get_metric_value("statestore.active-status")), \
Maybe wait_for_metric_value would be appropriate here instead of a sleep.


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@717
PS10, Line 717:   self.wait_for_state(handle, QueryState.RUNNING, 1000)
nit: Again waiting 1000 seconds.


http://gerrit.cloudera.org:8080/#/c/21520/10/tests/custom_cluster/test_statestored_ha.py@736
PS10, Line 736:   while (time.time() - start_time < timeout_s):
nit: I think you could also use wait_for_state(EXCEPTION), then rely on your 
assert that time.time()-start_time > timeout_s.



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:56:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16376/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 9
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:36:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/9/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/21520/9/be/src/statestore/statestore-subscriber.cc@1098
PS9, Line 1098:   bool in_failover_grace_period = 
MilliSecondsSinceLastFailover()
> At startup last_failover_time_ is zero, so for the first statestore_subscri
At startup, last_failover_time_ is zero. If last_failover_time_ is not updated, 
MilliSecondsSinceLastFailover() return a very big number and it must be greater 
than FLAGS_statestore_subscriber_recovery_grace_period_ms so we don't need to 
check if last_failover_time_ is not zero.



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:32:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16375/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 8
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:30:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12093: impala-shell should preserver all cookies by default

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19827 )

Change subject: WIP IMPALA-12093: impala-shell should preserver all cookies by 
default
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16374/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic81f790288460b086ab218e6701e8115a996dfa7
Gerrit-Change-Number: 19827
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:23:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 9:

(1 comment)

Quick question

http://gerrit.cloudera.org:8080/#/c/21520/9/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/21520/9/be/src/statestore/statestore-subscriber.cc@1098
PS9, Line 1098:   bool in_failover_grace_period = 
MilliSecondsSinceLastFailover()
At startup last_failover_time_ is zero, so for the first 
statestore_subscriber_recovery_grace_period_ms we are considered to be in the 
failover grace period? Could that be a problem?



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:23:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..

IMPALA-12216: Print timestamp for impala-shell errors

This change will print timestamp of an exception or warning
occurred during execution of a query via impala-shell.

Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
---
M shell/impala_client.py
M shell/impala_shell.py
2 files changed, 30 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21426/9
--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 9
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16373/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:13:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..

IMPALA-12216: Print timestamp for impala-shell errors

This change will print timestamp of an exception or warning
occurred during execution of a query via impala-shell.

Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
---
M shell/impala_client.py
M shell/impala_shell.py
2 files changed, 30 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21426/8
--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 8
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21426/8/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/21426/8/shell/impala_client.py@1615
PS8, Line 1615: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/21426/8/shell/impala_client.py@1616
PS8, Line 1616:
flake8: E203 whitespace before ','


http://gerrit.cloudera.org:8080/#/c/21426/8/shell/impala_client.py@1621
PS8, Line 1621: ,
flake8: E231 missing whitespace after ','



--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 8
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 18:06:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-12093: impala-shell should preserver all cookies by default

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19827


Change subject: WIP IMPALA-12093: impala-shell should preserver all cookies by 
default
..

WIP IMPALA-12093: impala-shell should preserver all cookies by default

Change-Id: Ic81f790288460b086ab218e6701e8115a996dfa7
---
M shell/ImpalaHttpClient.py
M shell/cookie_util.py
M shell/option_parser.py
3 files changed, 29 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/19827/2
--
To view, visit http://gerrit.cloudera.org:8080/19827
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic81f790288460b086ab218e6701e8115a996dfa7
Gerrit-Change-Number: 19827
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 10: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:55:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py@704
PS9, Line 704:
> nit: why sleep(1) when wait_until_ready=True above?
add a buffer time for metrics to be updated on statestored.


http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py@705
PS9, Line 705:   # Restart original active statestored. Verify that the 
statestored does not resume
 :   # its active role.
> nit: These assertion and others about active-status are not really straight
added error messages for assertions.



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:49:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
statestore restarting by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 165 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/10
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16372/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:14:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 9: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py@704
PS9, Line 704: sleep(1)
nit: why sleep(1) when wait_until_ready=True above?


http://gerrit.cloudera.org:8080/#/c/21520/9/tests/custom_cluster/test_statestored_ha.py@705
PS9, Line 705:   assert(not 
statestore_service_0.get_metric_value("statestore.active-status"))
 :   
assert(statestore_service_1.get_metric_value("statestore.active-status"))
nit: These assertion and others about active-status are not really 
straightforward to understand. Adding human-readable error message will be 
helpful.

  assert(not 
statestore_service_0.get_metric_value("statestore.active-status")), 'First 
statestored must not be active'
  
assert(statestore_service_1.get_metric_value("statestore.active-status")), 
'Second statestored must be active'



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:14:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12871: Added filtering capability for Calcite planner

2024-06-17 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21498 )

Change subject: IMPALA-12871: Added filtering capability for Calcite planner
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21498/1/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaFilterRel.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaFilterRel.java:

http://gerrit.cloudera.org:8080/#/c/21498/1/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaFilterRel.java@79
PS1, Line 79: // need to set the inputRefs.  The HdfsScan RelNode needs to 
know which columns are
: // needed from the table in order to implement the filter 
condition. The input ref
: // used here may nor may not be projected out. So a union 
needs to be done with
: // potential existing projected input refs from a parent 
RelNode.
: // Note that if the parent RelNode hasn't set any input refs, 
it is assumed that all
: // input refs are needed (the default case when inputRefs_ is 
null).
> So, I'm assuming this would be something like:
Correct


http://gerrit.cloudera.org:8080/#/c/21498/1/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaFilterRel.java@98
PS1, Line 98: List conditions = 
ImmutableList.of(previousCondition, newCondition);
: return 
RexUtil.composeConjunction(getCluster().getRexBuilder(), conditions);
> For my own understanding: I assume this is about having multiple layers of
Unfortunately, this will not work right now. This works in a future commit.

The reason it won't work is because the "and" and "or" clause cannot go through 
the FunctionCallExpr object.  They need to use the special CompoundExpr which 
will be pushed in a later commit.


http://gerrit.cloudera.org:8080/#/c/21498/1/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/ExprConjunctsConverter.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/ExprConjunctsConverter.java:

http://gerrit.cloudera.org:8080/#/c/21498/1/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/ExprConjunctsConverter.java@87
PS1, Line 87: // If it's not a RexCall, there's no AND operator and we can
: // just return the conjunct.
: if (!(conjunct instanceof RexCall)) {
:   return ImmutableList.of(conjunct);
: }
> For my own curiosity, when would we hit this case? Would it be something li
I'm not sure if it's necessary for this commit, but this would hold true for 
literals.



--
To view, visit http://gerrit.cloudera.org:8080/21498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If104bf1cd801d5ee92dd7e43d398a21a18be5d97
Gerrit-Change-Number: 21498
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:09:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16371/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 17:02:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 8: Code-Review+1

(11 comments)

http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift@883
PS6, Line 883: conversion
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java@104
PS6, Line 104: Preconditions.checkArgument(timestampExpr.isConstant());
> Those are added here: https://github.com/apache/impala/blob/5d1bd80623324f8
Ack


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@753
PS6, Line 753: kudu-dml-with-utc-conversion
> The only case where CONVERT_KUDU_UTC_TIMESTAMPS affects the plan are bloom
Ack


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # The table is used to test DST changes in timestamps, the 
timestamps in the table near
 : # DST changes in the 'America/Los_Angeles' time zone.
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test@75
PS6, Line 75:
> Is this whitespace a defect from getExplainString() ?
Have you look into this extra whitespace?


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@66
PS6, Line 66: select * from utc_kudu where id > 10;
> Changed the results in other tests to be always ordered by id. I can also a
Just ordering in test file is fine. Than you.


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@79
PS6, Line 79: select * from utc_kudu where id > 10;
> Add "order by id" for readability?
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@129
PS6, Line 129: select * from utc_kudu where ts_pk_col = ts_c
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@138
PS6, Line 138: 2011-11-06 01:20:06,2011-11-06 01:20:
> Add "order by id" for readability?
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test@149
PS6, Line 149: TIMESTAMP,TIMESTAMP,INT
> Add "order by id" for readability?
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/21492/6/tests/query_test/test_kudu.py@108
PS6, Line 108: 
cls.ImpalaTestMatrix.add_mandatory_exec_option('write_kudu_utc_timestamps', 
'true')
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:56:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/8/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/8/tests/custom_cluster/test_statestored_ha.py@727
PS8, Line 727: P
> flake8: W504 line break after binary operator
Done



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:52:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
statestore restarting by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 157 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/9
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16370/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:50:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 3:

(14 comments)

Thanks for the feedback!

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21492/3//COMMIT_MSG@16
PS3, Line 16: To be able to read back Kudu tables written by Impala correctly
: convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
: have the same value.
> Ack.
I would prefer to think through the query options later and assume for now the 
user knows exactly what they are doing if using these options.


http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/21492/6/common/thrift/ImpalaService.thrift@883
PS6, Line 883: conversiyon
> nit: conversion
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@853
PS6, Line 853: boolean convert_to_utc =
> nit: could be camel case
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@906
PS6, Line 906: Expr expr = tmpPartitionKeyExprs.get(j);
> I think this conversion should take place in the planner instead: during th
As we have discussed on chat, I couldn't find any convenient place to do this 
(though I agree that the current solution is not great).


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java@98
PS6, Line 98: for (Pair valueAssignment : 
modifyStmt_.assignments_) {
> nit: could be camel case
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/util/ExprUtil.java@104
PS6, Line 104: Preconditions.checkArgument(timestampExpr.isConstant());
> Is this Precondition and another one in L86 correct? I see some examples wh
Those are added here: 
https://github.com/apache/impala/blob/5d1bd80623324f829aca604b25d97ace21f51417/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L493


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@753
PS6, Line 753: evel(TExplainLevel.EXTENDED)
> Do you mind adding 1 SELECT test where CONVERT_KUDU_UTC_TIMESTAMPS affect p
The only case where CONVERT_KUDU_UTC_TIMESTAMPS affects the plan are bloom 
filters in runtime filters. There are already tests for that that check this in 
the profile:
https://gerrit.cloudera.org/#/c/20681/22/testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # DST changes in the 'America/Los_Angeles' time zone.
 : table_name:timestamp_at_dst_changes, constraint:restr
> Please add in comment:
Done


http://gerrit.cloudera.org:8080/#/c/21492/6/testdata/datasets/functional/schema_constraints.csv@425
PS6, Line 425: # DST changes in the 'America/Los_Angeles' time zone.
 : table_name:timestamp_at_dst_changes, constraint:restr
> It is also good note to mention that during DST, PDT = UTC-7. While outside
Done


http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test:

http://gerrit.cloudera.org:8080/#/c/21492/5/testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test@20
PS5, Line 20: INT
> question: This was BIGINT before and it does not break test when timestamp_
These tests were not verified before the change. This is related to the changes 
in https://gerrit.cloudera.org/#/c/21492/5/tests/common/test_result_verifier.py
The last paragraph of the commit message also mentions this.



[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10726/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:39:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 6:

(6 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266
PS5, Line 266: int probes_waiting = NumProbesWaiting();
 :
 :   // Let's be conservative and only use 2/3 of available theads, 
because the scanner
 :   // threads are not as CPU heavy as the sort threads.
 :   r
> Do you think it would be worth extracting this to a member function in Join
Yeah, I think it's nice to extract it in a function. I didn't make 
'probes_waiting_on_builder_' private though, as currently it is in a nice /// 
BEGIN /// END section of members that I did not want to mess up.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275
PS5, Line 275:   for (auto& [path, intermediate_vec] : 
intermediate_delete_rows_) {
> Nit: we usually don't match the indentation of arguments but use 4 spaces f
For formulas I prefer readability over the 4 spaces rule.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@279
PS5, Line 279:
> Using the actual type would be more readable. Applies also to L322 and L324
Actual type is a pair, I switched to structured binding.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@296
PS5, Line 296: TURN_IF_ERROR(pool.Init());
 :   for (auto& [path, intermediate_vec] : 
intermediate_delete_rows_) {
 : BuildWorkItem work_item;
 : work_item.path = 
> This is the same as the loop body of FinalizeBuildImpl() with these excepti
Thanks for the suggestions.

About 1:
I'd have to do something like:

  if (SHOULD_LOCK) separate_build_lock_.lock()

  deleted_rows_[*path] = std::move(final_delete_vec);

  if (SHOULD_LOCK) separate_build_lock_.unlock();

I don't really like such constructs as we lose RAII.

To keep RAII, we could do:

 std::unique_lock guard(separate_build_lock_, std::defer_lock_t{});
 if (SHOULD_LOCK) {
   guard.lock();
 }
 deleted_rows_[*path] = std::move(final_delete_vec);

But it feels a bit too complex, also we are creating an unnecessary guard 
object when SHOULD_LOCK is false (though the compiler might be smart enough to 
eliminate).

So for now I'd leave it in its current form. I'd also keep BuildWorkItem as it 
makes the code a bit more explicit / readable.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@304
PS5, Line 304: medi
> Using the actual type would be more readable.
Switched to structured binding.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@319
PS5, Line 319: 
> Because 'intermediate_vec' is a const ref, this will not actually move the
Nice catch!



--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:39:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21452

to look at the new patch set (#6).

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..

IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

With this patch IcebergDeleteBuilder checks how many probe threads
are actually blocked on the builder. Let's assume the following plan:

 UNION ALL
/ \
   /   \
  / \
 SCAN allANTI JOIN
 datafiles  / \
 without   /   \
 deletes  SCAN SCAN
  datafilesdeletes
  with deletes

In that case UNION ALL, and the two "SCAN datafiles" operators are in
the same fragment, while the builder of the ANTI JOIN is in a different
fragment. This means that "SCAN datafiles without deletes" can run in
parallel with the builder. But once that SCAN is exhausted, the UNION
ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN
operator, but that operator depends on the join builder output.

This means in some cases the SCAN fragments are busy, while in other
cases the SCAN fragments are blocked. It depends on how much work
they need to do, and how much work the build-side needs to do. So to
handle all cases, we dynamically check how many probe fragments are
blocked on the builder, then spin up as many threads to parellelize
the final sort.

This also works well when we have the following plan:

ANTI JOIN
   / \
  /   \
 SCAN SCAN
 datafilesdeletes
 with deletes

The above plan is created when all data files have corresponding
deletes, or when we are running a simple count(*) query. In that
case all "SCAN datafiles" fragments are blocked on the builder,
so we can use that many threads to sort the build results.

A new field "ThreadCountInFinalBuild" was added, so we can check the
query profile about how many threads were used for the final
sorting in the builders.

Measurements:
In a table with 1 Trillion data records and 68.5 Billion delete records
it reduced "IcebergDeletePositionSortTimer" from ~1 minute to
8-10 seconds, in an environment with 40 executors and MT_DOP=12.

Testing:
 * e2e tests that check counter "ThreadCountInFinalBuild"

Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
(cherry picked from commit 74a942d2a38555c39bc51d61a05b1c90712c)
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test
M tests/stress/test_update_stress.py
6 files changed, 160 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/6
--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16369/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:33:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Csaba Ringhofer (Code Review)
Hello Alexey Serbin, Riza Suminto, Ashwani Raina, Zihao Ye, Peter Rozsa, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21492

to look at the new patch set (#8).

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..

IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

Before this commit, only read support was implemented
(convert_kudu_utc_timestamps=true). This change adds write support:
if write_kudu_utc_timestamps=true, then timestamps are converted
from local time to UTC during DMLs to Kudu. In case of
ambigious conversions (DST changes) the earlier possible UTC
timestamp is written.

All DMLs supported with Kudu tables are affected affected:
INSERT, UPSERT, UPDATE, DELETE

To be able to read back Kudu tables written by Impala correctly
convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
have the same value. Having the same value in the two query option
is also critical for UPDATE/DELETE if the primary key contains a
timestamp column - these operations do a scan first (affected by
convert_kudu_utc_timestamps) and then use the keys from the scan to
select updated/deleted rows (affected by write_kudu_utc_timestamps).

The conversion is implemented by adding to_utc_timestamp() to inserted
timestamp expressions during planning. This allows doing the same
conversion during the pre-insert sorting and partitioning.
Read support is implemented differently - in that case the plan is not
changed and the scanner does the conversion.

Other changes:
- Before this change, verification of tests with TIMESTAMP results
  were skipped when the file format is Kudu. This shouldn't be
  necessary so the skipping was removed.

Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
17 files changed, 394 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/21492/8
--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 


[Impala-ASF-CR] IMPALA-13137: Add additional client fetch metrics columns to the queries page

2024-06-17 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21482 )

Change subject: IMPALA-13137: Add additional client fetch metrics columns to 
the queries page
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21482/6/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/21482/6/be/src/service/impala-http-handler.cc@463
PS6, Line 463:   document->AddMember("tips_fetch_duration", "The time taken for 
the client to fetch all"
Total time spent returning rows to the client and other client-side processing.


http://gerrit.cloudera.org:8080/#/c/21482/6/be/src/service/query-state-record.h
File be/src/service/query-state-record.h:

http://gerrit.cloudera.org:8080/#/c/21482/6/be/src/service/query-state-record.h@83
PS6, Line 83:   /// The time taken for the client to fetch all rows.
Total time spent returning rows to the client and other client-side processing.


http://gerrit.cloudera.org:8080/#/c/21482/6/www/queries.tmpl
File www/queries.tmpl:

http://gerrit.cloudera.org:8080/#/c/21482/6/www/queries.tmpl@119
PS6, Line 119:   FetchDuration
Client Fetch Duration



--
To view, visit http://gerrit.cloudera.org:8080/21482
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74a9393a7b38750de0c3f6230b6e5e048048c4b5
Gerrit-Change-Number: 21482
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 16:13:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Csaba Ringhofer (Code Review)
Hello Alexey Serbin, Riza Suminto, Ashwani Raina, Zihao Ye, Peter Rozsa, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21492

to look at the new patch set (#7).

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..

IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

Before this commit, only read support was implemented
(convert_kudu_utc_timestamps=true). This change adds write support:
if write_kudu_utc_timestamps=true, then timestamps are converted
from local time to UTC during DMLs to Kudu. In case of
ambigious conversions (DST changes) the earlier possible UTC
timestamp is written.

All DMLs supported with Kudu tables are affected affected:
INSERT, UPSERT, UPDATE, DELETE

To be able to read back Kudu tables written by Impala correctly
convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
have the same value. Having the same value in the two query option
is also critical for UPDATE/DELETE if the primary key contains a
timestamp column - these operations do a scan first (affected by
convert_kudu_utc_timestamps) and then use the keys from the scan to
select updated/deleted rows (affected by write_kudu_utc_timestamps).

The conversion is implemented by adding to_utc_timestamp() to inserted
timestamp expressions during planning. This allows doing the same
conversion during the pre-insert sorting and partitioning.
Read support is implemented differently - in that case the plan is not
changed and the scanner does the conversion.

Other changes:
- Before this change, verification of tests with TIMESTAMP results
  were skipped when the file format is Kudu. This shouldn't be
  necessary so the skipping was removed.

Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-dml-with-utc-conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_predicate_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_runtime_filter_with_timestamp_conversion.test
M 
testdata/workloads/functional-query/queries/QueryTest/kudu_timestamp_conversion.test
M tests/common/test_result_verifier.py
M tests/query_test/test_kudu.py
17 files changed, 385 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/21492/7
--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 


[Impala-ASF-CR] IMPALA-12732: Add support for MERGE statements for Iceberg tables

2024-06-17 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21423 )

Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg 
tables
..


Patch Set 7:

(16 comments)

Another batch. I'm almost done with the Analyzer part. Will take another look 
at MergeInsert and then will move to the Planner.

http://gerrit.cloudera.org:8080/#/c/21423/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21423/6//COMMIT_MSG@24
PS6, Line 24:
> Done
This doesn't seem done


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java
File fe/src/main/java/org/apache/impala/analysis/MergeCase.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@28
PS7, Line 28: parent_
> nit: I think naming this mergeStmt_ would be more descriptive than 'parent_
Anyway, I checked the usages for parent_ and apparently we query table names 
and columns from MergeStmt. This for me violates encapsulation for MergeCase 
classes and I'd think it would be a better approach to not pass MergeStmt into 
MergeCase and pass the necessary attributes only.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@33
PS7, Line 33:   protected MergeCase() { filterExprs_ = Collections.emptyList(); 
}
With this implementation we rely on the inherited classes to initialize 
resultExprs_. This could be unsafe if someone in the future adds code to this 
base class and assume that similarly to filterExprs_, resultExprs_ is also 
non-null.
I'd rather initialize as an empty list here. Maybe wa can do precondition check 
before the usages to verify this is not null.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java
File fe/src/main/java/org/apache/impala/analysis/MergeInsert.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@33
PS7, Line 33: public class MergeInsert extends MergeCase {
comments pls, with examples for better understanding


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@35
PS7, Line 35:   /**
nit: I think regular, single line comment format is fine for member variables.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@49
PS7, Line 49:
nit: extra line


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@65
PS7, Line 65: for (Expr expr : resultExprs_) {
nit: fits into single line


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@82
PS7, Line 82: return new MergeInsert(parent_, resultExprs_, 
getFilterExprs());
This clone doesn't actually clone the internal state of this class just for the 
superclass. I find this misleading because even though we do a clone we 
actually lose internal information. At this point of time you know where the 
clone() fn is actually invoked and you know that some of the members won't be 
needed from that point on, but it is not guaranteed that in the future cloning 
won't be called elsewhere. I think this isn't a futureproof approach.

I'd prefer actual clone where we populate everything.

Also, see other clone() implementations in MergeCase subclasses.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@102
PS7, Line 102: Set duplicateFilteredColumnPermutation = new 
LinkedHashSet<>(
I think duplicate check could be achieved easier than this. You just create an 
empty HashSet, start iterating through columnPermutation_, keep adding the col 
names to the HashSet and once you find a duplicate you throw an exception. I 
think it's enough to return an error with the first duplicate found, no need to 
gather all of them.
Not to mention that here you do many iterations on these columns by 
pre-deduplicating, then copying, then removing. This could be achieved in one 
iteration even if you want to collect all the dups.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@127
PS7, Line 127: for (int i = 0; i < columns.size(); ++i) {
Is the index 'i' used anywhere? Can this be a foreach on 'columns'?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@225
PS7, Line 225:   public static class ColumnMatch {
This inner class is for internal use, right? Can be private/protected then. 
Same for ColumnMatches.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeInsert.java@227
PS7, Line 227: private final Column targetTableColumn;
nit: '_' suffix



[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 6: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 15:32:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16368/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 15:12:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 6:

(3 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@305
PS5, Line 305: vect
> I think using the actual type (vector&) is more readable.
Done


http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@327
PS5, Line 327: join
> "this IcebergDeleteBuilder"?
It was intentional from me to write 'join', as the associated JOIN fragment 
processes the data files, and in the builder's context it should be clear which 
'join' we are referring to.


http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@328
PS5, Line 328: process
> Nit: "processes".
Done



--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:47:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21435

to look at the new patch set (#6).

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..

IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

When there are lots of delete records the IcebergDeleteBuilder can
become a bottleneck. Since the left side of the JOIN is blocked on
the build side any improvement we make here significantly improves
Iceberg V2 table scanning.

Improvements of this patch:

* Use a vector of vectors to collect the position delete records.
  This way we can avoid large re-allocations and copyings.
* Insert large ranges from the build batches into the collected
  delete records instead of doing it one-by-one.

Measurements

Local measurement with 824 Million position delete records:
JOIN BUILD: ~32s -> ~14s (6s is the final sorting)

40-node cluster with 68.5 Billion position delete records:
JOIN BUILD: 4m15s -> 1m45s (1m7s is the final sorting)

Parallelization of the final sort will be added in a follow-up CR.

Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
2 files changed, 113 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21435/6
--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16367/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:49:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 5:

(4 comments)

Thanks Zoltán!

http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@242
PS3, Line 242: auto
> The actual type is a pair.
Yes, it's also a pity that structured bindings don't allow type annotation.


http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@305
PS5, Line 305: auto
I think using the actual type (vector&) is more readable.


http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@327
PS5, Line 327: join
"this IcebergDeleteBuilder"?


http://gerrit.cloudera.org:8080/#/c/21435/5/be/src/exec/iceberg-delete-builder.cc@328
PS5, Line 328: process
Nit: "processes".



--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:41:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21501 )

Change subject: IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10725/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Gerrit-Change-Number: 21501
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:28:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21501 )

Change subject: IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Gerrit-Change-Number: 21501
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:28:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-17 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.h
File be/src/exec/iceberg-delete-builder.h:

http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.h@148
PS5, Line 148: Impl
To unambiguously differentiate it from FinalizeBuildImplParallel(), this could 
be called FinalizeBuildImplSequential().


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266
PS5, Line 266: int probes_waiting;
 :   {
 : unique_lock l(separate_build_lock_);
 : probes_waiting = probes_waiting_on_builder_;
 :   }
Do you think it would be worth extracting this to a member function in 
JoinBuilder? Then 'probes_waiting_on_builder_' could also be private.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275
PS5, Line 275:  min(data_file_count, probes_waiting) * 2 / 3);
Nit: we usually don't match the indentation of arguments but use 4 spaces for 
continuation indentation.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@279
PS5, Line 279: auto
Using the actual type would be more readable. Applies also to L322 and L324.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@296
PS5, Line 296: DeleteRowVector final_delete_vec = 
GetFinalDeleteRowVector(intermediate_vec);
 : intermediate_vec.clear();
 : unique_lock l(separate_build_lock_);
 : deleted_rows_[*path] = std::move(final_delete_vec);
This is the same as the loop body of FinalizeBuildImpl() with these exceptions:
1. there is locking here
2. this function uses BuildWorkItem instead of std::pair.

These could be unified in the same function so there would be no need for this 
lambda:
1. a boolean template parameter would decide  whether we need to lock and the 
function could use "constexpr if"
2. the ThreadPool could also use std::pair 
as its template parameter, so BuildWorkItem would no longer be needed.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@304
PS5, Line 304: auto
Using the actual type would be more readable.


http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@319
PS5, Line 319: move
Because 'intermediate_vec' is a const ref, this will not actually move the 
vector but copy it, see
https://stackoverflow.com/questions/28595117/why-can-we-use-stdmove-on-a-const-object

So I think 'intermediate_vec' should not be const and then 
'intermediate_vec.clear()' could be brought here from L297 and L282.



--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:26:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/8/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/8/tests/custom_cluster/test_statestored_ha.py@727
PS8, Line 727: +
flake8: W504 line break after binary operator



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:26:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
statestore restarting by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 157 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/8
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16366/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 14:11:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 5:

(9 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.h
File be/src/exec/iceberg-delete-builder.h:

http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.h@70
PS3, Line 70: s
> Nit: this should be "Processes".
Done


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.h@121
PS3, Line 121: ctor of vectors means we can just allocate another vector to hold
 :   // subsequent eleme
> Could you clarify this? Is it about 'intermediate_delete_rows_' in addition
Rephrased a bit, I hope it's clearer now.


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.h@149
PS3, Line 149: s
> We usually take non-const (output) arguments by pointer.
Done, also switched the parameters, as we prefer to have output parameters at 
the end.


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@184
PS3, Line 184: Inte
> I think using the actual type (IntermediateDeleteRowVector&) is more readab
Done


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@242
PS3, Line 242: auto
> I think using the actual type (IntermediateDeleteRowVector&) is more readab
The actual type is a pair.

I switched to structured binding as I think with proper variable names it is 
more readable.


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@250
PS3, Line 250: e_ve
> I think using the actual type (vector) is more readable. Also L252
Done


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@296
PS3, Line 296: cons
> I think using the actual type (vector) is more readable. Also L306
Done


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@319
PS3, Line 319:
> Can we take it by const ref?
Done


http://gerrit.cloudera.org:8080/#/c/21435/3/be/src/exec/iceberg-delete-builder.cc@327
PS3, Line 327:
> Can it happen that (*path) is not a key in 'intermediate_delete_rows_' but
Done



--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Jun 2024 13:47:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-06-17 Thread Zoltan Borok-Nagy (Code Review)
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21435

to look at the new patch set (#5).

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..

IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

When there are lots of delete records the IcebergDeleteBuilder can
become a bottleneck. Since the left side of the JOIN is blocked on
the build side any improvement we make here significantly improves
Iceberg V2 table scanning.

Improvements of this patch:

* Use a vector of vectors to collect the position delete records.
  This way we can avoid large re-allocations and copyings.
* Insert large ranges from the build batches into the collected
  delete records instead of doing it one-by-one.

Measurements

Local measurement with 824 Million position delete records:
JOIN BUILD: ~32s -> ~14s (6s is the final sorting)

40-node cluster with 68.5 Billion position delete records:
JOIN BUILD: 4m15s -> 1m45s (1m7s is the final sorting)

Parallelization of the final sort will be added in a follow-up CR.

Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
2 files changed, 113 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21435/5
--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()

2024-06-17 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21501 )

Change subject: IMPALA-13150: Possible buffer overflow in StringVal::CopyFrom()
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21501
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a1d03d65ec4339a0f33e69ff29abdd8cc3e3067
Gerrit-Change-Number: 21501
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Mon, 17 Jun 2024 13:30:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21426/6/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/21426/6/shell/impala_client.py@1623
PS6, Line 1623:   print(msg + " at: %s" % (time.strftime("%Y-%m-%d %H:%M:%S", 
time.localtime())),
Can we put the timestamp at the beginning? So the format is more like a log and 
we can wrap the code of

log_timestamp("Warning")
print("Warning: close session RPC failed: {0}, {1}".format(str(e), 
type(e)))

into

  log_timestamp("Warning: close session RPC failed: {0}, {1}".format(str(e), 
type(e)))



--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 6
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 11:25:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16365/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 6
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 08:07:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16364/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 07:50:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..

IMPALA-12216: Print timestamp for impala-shell errors

This change will print timestamp of an exception or warning
occurred during execution of a query via impala-shell.

Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
---
M shell/impala_client.py
M shell/impala_shell.py
2 files changed, 27 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21426/6
--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 6
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Saurabh Katiyal (Code Review)
Saurabh Katiyal has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..

IMPALA-12216: Print timestamp for impala-shell errors

This change will print timestamp of an exception or warning
occurred during execution of a query via impala-shell.

Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
---
M shell/impala_client.py
M shell/impala_shell.py
2 files changed, 25 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/21426/4
--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 


[Impala-ASF-CR] IMPALA-12216: Print timestamp for impala-shell errors

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21426 )

Change subject: IMPALA-12216: Print timestamp for impala-shell errors
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_client.py@1615
PS4, Line 1615: def log_exception_with_timestamp(e,msg="Exception"):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_client.py@1615
PS4, Line 1615: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_client.py@1620
PS4, Line 1620: def log_timestamp(msg="Exception"):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_client.py@1622
PS4, Line 1622:
flake8: W292 no newline at end of file


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_shell.py@1461
PS4, Line 1461: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_shell.py@1473
PS4, Line 1473: ,
flake8: E231 missing whitespace after ','


http://gerrit.cloudera.org:8080/#/c/21426/4/shell/impala_shell.py@1505
PS4, Line 1505: ,
flake8: E231 missing whitespace after ','



--
To view, visit http://gerrit.cloudera.org:8080/21426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4abbd02aa9f61210b0333495bf191e72c22a5944
Gerrit-Change-Number: 21426
Gerrit-PatchSet: 4
Gerrit-Owner: Saurabh Katiyal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Saurabh Katiyal 
Gerrit-Comment-Date: Mon, 17 Jun 2024 07:26:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu

2024-06-17 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21492 )

Change subject: IMPALA-12370: Allow converting timestamps to UTC when writing 
to Kudu
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@853
PS6, Line 853: boolean convert_to_utc =
nit: could be camel case


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@906
PS6, Line 906: Expr expr = tmpPartitionKeyExprs.get(j);
I think this conversion should take place in the planner instead: during the 
creation of KuduSinkNode, the output substitution map could be modified to 
contain this replacement code.


http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/21492/6/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java@98
PS6, Line 98: boolean convert_to_utc = 
analyzer.getQueryOptions().isWrite_kudu_utc_timestamps();
nit: could be camel case



--
To view, visit http://gerrit.cloudera.org:8080/21492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Gerrit-Change-Number: 21492
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Mon, 17 Jun 2024 07:04:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10724/


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 17 Jun 2024 06:59:59 +
Gerrit-HasComments: No