[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has removed a vote on this change.

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10723/


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 03:48:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10723/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 03:48:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16361/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 03:20:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16360/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 03:10:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/5/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/5/tests/custom_cluster/test_statestored_ha.py@727
PS5, Line 727:
> flake8: E502 the backslash is redundant between brackets
Done



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 02:58:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
a statestore restart by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.
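
As a rough illustration of the grace-period idea described above (not the
actual C++ change in cluster-membership-mgr.cc or statestore-subscriber.cc),
the behaviour could be sketched as follows. The class name, method names, and
the injectable clock are all hypothetical and exist only for this sketch.

```python
import time


class MembershipTracker:
    """Hypothetical stand-in for a subscriber-side membership manager."""

    def __init__(self, grace_period_s, clock=time.monotonic):
        self._grace_period_s = grace_period_s
        self._clock = clock          # injectable so the sketch is testable
        self._recovered_at = None    # time of the last restart/failover
        self.current = frozenset()   # membership used for cancellation checks

    def on_statestore_recovered(self):
        """Record that a statestore restart or failover was just observed."""
        self._recovered_at = self._clock()

    def in_grace_period(self):
        if self._recovered_at is None:
            return False
        return self._clock() - self._recovered_at < self._grace_period_s

    def apply_update(self, members):
        """Apply a membership update unless the grace period is active.

        Returns True if the update was applied, False if it was skipped.
        """
        if self.in_grace_period():
            # Possibly inconsistent view: keep the old membership so it is
            # not used to cancel running queries.
            return False
        self.current = frozenset(members)
        return True
```

With a fake clock, an update received right after a failover leaves
`current` untouched, and the same update is accepted once the grace period
has elapsed.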

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 156 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/6
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/6/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/6/tests/custom_cluster/test_statestored_ha.py@727
PS6, Line 727: +
flake8: W504 line break after binary operator
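
For context, the two flake8 codes raised on this line across patch sets
(E502 and W504) concern how a long expression is wrapped inside brackets.
The snippet below is only an illustration: the variable names and the
expression are made up and are not the actual line 727 of
test_statestored_ha.py.

```python
# Hypothetical values, used only to make the wrapped expression runnable.
timeout_s = 10
elapsed_s = 3
padding_s = 2

# Would be flagged as E502: inside parentheses the trailing backslash
# is redundant, because the brackets already continue the line.
#   ok = (elapsed_s + padding_s < \
#         timeout_s)

# Would be flagged as W504 when that check is enabled: the line break
# comes after the binary operator.
#   ok = (elapsed_s + padding_s <
#         timeout_s)

# One way to satisfy both checks: no backslash, and break before the
# operator instead of after it.
ok = (elapsed_s + padding_s
      < timeout_s)
```

Note that breaking before the operator is reported as W503 by
configurations that enable that check instead, so which form passes
depends on the project's flake8 settings.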



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 02:59:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
a statestore restart by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 156 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/5
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/5/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/5/tests/custom_cluster/test_statestored_ha.py@727
PS5, Line 727: \
flake8: E502 the backslash is redundant between brackets



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 02:47:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16359/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 01:43:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..

IMPALA-13159: Fix query cancellation caused by statestore failover

A momentary inconsistent cluster membership state after statestore
failover results in query cancellation.
We already have code to handle inconsistent cluster membership after
a statestore restart by defining a post-recovery grace period. During
the grace period, don't update the current cluster membership so that
the inconsistent membership will not be used to cancel queries on
coordinators and executors.
This patch handles inconsistent cluster membership state after
statestore failover in the same way.

Testing:
 - Added a new test case to verify that inconsistent cluster
   membership after statestore failover will not result in query
   cancellation.
 - Fixed closing client issue for Catalogd HA test case
   test_catalogd_failover_with_sync_ddl when the test fails.
 - Passed core test.

Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
---
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M tests/custom_cluster/test_catalogd_ha.py
M tests/custom_cluster/test_statestored_ha.py
5 files changed, 156 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21520/4
--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/4/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/21520/4/tests/custom_cluster/test_statestored_ha.py@727
PS4, Line 727: <
flake8: W504 line break after binary operator



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 16 Jun 2024 01:20:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21504 )

Change subject: IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21504
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Gerrit-Change-Number: 21504
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 23:36:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21504 )

Change subject: IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
..

IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

TOP-N cost will turn into NaN if inputCardinality is equal to 0 due to
Math.log(inputCardinality). This patch fixes the issue by avoiding
Math.log(0) and using 0 instead.

After this patch, instantiating BaseProcessingCost with NaN, infinite,
or negative totalCost will throw IllegalArgumentException. In
BaseProcessingCost.getDetails(), "total-cost" is renamed to "raw-cost"
to avoid confusion with "cost-total" in ProcessingCost.getDetails().
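
To illustrate why a zero cardinality produces NaN, here is a small sketch in
Python rather than the patch's Java. The cost formula, function names, and
the per-row weight are assumptions made for this example and are not taken
from the planner code. Java's Math.log(0) returns -Infinity, whereas Python's
math.log(0) raises, so a helper mimics the Java behaviour; ValueError stands
in for IllegalArgumentException.

```python
import math


def java_style_log(x):
    """Mimic Java's Math.log for x >= 0: log(0) is -Infinity, not an error."""
    return -math.inf if x == 0 else math.log(x)


def naive_topn_cost(input_cardinality, per_row_cost):
    # Hypothetical n * log(n) style cost: 0 * -Infinity evaluates to NaN.
    return input_cardinality * java_style_log(input_cardinality) * per_row_cost


def guarded_topn_cost(input_cardinality, per_row_cost):
    # Avoid log(0) by substituting 0 for the log term.
    log_term = 0.0 if input_cardinality <= 0 else math.log(input_cardinality)
    return input_cardinality * log_term * per_row_cost


def check_total_cost(total_cost):
    """Reject NaN, infinite, or negative costs, as the commit describes."""
    if math.isnan(total_cost) or math.isinf(total_cost) or total_cost < 0:
        raise ValueError("invalid total cost: %r" % (total_cost,))
    return total_cost
```

Passing the naive cost for an empty input to `check_total_cost` fails
fast, while the guarded version yields a plain zero cost.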

Testing:
- Add a test case that runs a TOP-N query over an empty table.
- Compute ProcessingCost in most FE and EE tests even when
  COMPUTE_PROCESSING_COST option is not enabled by checking if
  RuntimeEnv.INSTANCE.isTestEnv() is True or TEST_REPLAN option is
  enabled.
- Pass core test.

Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Reviewed-on: http://gerrit.cloudera.org:8080/21504
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/ScaledProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SumProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q43-verbose.test
30 files changed, 218 insertions(+), 76 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21504
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Gerrit-Change-Number: 21504
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10721/


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sat, 15 Jun 2024 21:08:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21504 )

Change subject: IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21504
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Gerrit-Change-Number: 21504
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 18:32:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21504 )

Change subject: IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10722/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21504
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Gerrit-Change-Number: 21504
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 18:32:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13159: Fix query cancellation caused by statestore failover

2024-06-15 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21520 )

Change subject: IMPALA-13159: Fix query cancellation caused by statestore 
failover
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21520/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21520/3//COMMIT_MSG@12
PS3, Line 12: During
: the grace period, don't update the current cluster membership so that
: the inconsistent membership will not be used to cancel queries on
: coordinators and executors.
Does this also mean that during the grace period, newly added executors won't 
be part of the cluster?



--
To view, visit http://gerrit.cloudera.org:8080/21520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I720bec5199df46475b954558abb0637ca7e6298b
Gerrit-Change-Number: 21520
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 18:06:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost

2024-06-15 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21504 )

Change subject: IMPALA-13152: Avoid NaN, infinite, and negative ProcessingCost
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21504
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib49c7ae397dadcb2cb69fde1850d442d33cdf177
Gerrit-Change-Number: 21504
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 17:44:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13137: Add additional client fetch metrics columns to the queries page

2024-06-15 Thread Surya Hebbar (Code Review)
Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21482 )

Change subject: IMPALA-13137: Add additional client fetch metrics columns to 
the queries page
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21482/4/www/queries.tmpl
File www/queries.tmpl:

http://gerrit.cloudera.org:8080/#/c/21482/4/www/queries.tmpl@117
PS4, Line 117: First row fetched
 :   
 :   Client fetch 
wait time
> Ok then, "Fetch Duration" and "Queued Duration" is fine.
Okay. I will correct it to use "Duration". Could you please confirm this?



--
To view, visit http://gerrit.cloudera.org:8080/21482
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74a9393a7b38750de0c3f6230b6e5e048048c4b5
Gerrit-Change-Number: 21482
Gerrit-PatchSet: 4
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 15 Jun 2024 17:30:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10721/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sat, 15 Jun 2024 16:00:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10720/


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 15 Jun 2024 13:54:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12921, IMPALA-12985: Support running Impala with locally built Ranger

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21160 )

Change subject: IMPALA-12921, IMPALA-12985: Support running Impala with locally 
built Ranger
..

IMPALA-12921, IMPALA-12985: Support running Impala with locally built Ranger

The goals and non-goals of this patch could be summarized as follows.
Goals:
 - Add changes to the minicluster configuration that allow a non-default
   version of Ranger (possibly built locally) to run in the context of
   the minicluster, and to be used as the authorization server by
   Impala.
 - Switch to the new constructor when instantiating
   RangerAccessRequestImpl. This resolves IMPALA-12985 and also makes
   Impala compatible with Apache Ranger if RangerAccessRequestImpl from
   Apache Ranger is consumed.
 - Prepare Ranger and Impala patches as supplemental material to verify
   which authorization-related tests pass when Apache Ranger is the
   authorization provider. Merging IMPALA-12921_addendum.diff into the
   Impala repository is out of scope for this patch because the diff
   file changes the behavior of Impala, so more discussion is required
   if we'd like to merge it in the future.

Non-goals:
 - Set up any automation for building Ranger from source.
 - Pass all Impala authorization-related tests with a non-default
   version of Ranger.

Instructions on running Impala with locally built Ranger:

Suppose the Ranger project is under the folder $RANGER_SRC_DIR. For easy
reference, the following command builds Apache Ranger. By default, the
compressed tarball is produced under $RANGER_SRC_DIR/target.

mvn clean compile -B -nsu -DskipCheck=true -Dcheckstyle.skip=true \
package install -DskipITs -DskipTests -Dmaven.javadoc.skip=true

After building Ranger, we need to build Impala's Java code so that it
can consume the locally produced Ranger classes. Export the following
environment variables before building Impala. This prevents
bootstrap_toolchain.py from trying to download the compressed Ranger
tarball.

1. export RANGER_VERSION_OVERRIDE=\
   $(mvn -f $RANGER_SRC_DIR/pom.xml -q help:evaluate \
   -Dexpression=project.version -DforceStdout)

2. export RANGER_HOME_OVERRIDE=$RANGER_SRC_DIR/target/\
   ranger-${RANGER_VERSION_OVERRIDE}-admin
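
As a rough illustration of how the two overrides above relate, the
following Python sketch derives RANGER_HOME_OVERRIDE from the source
directory and the project version. The function name ranger_overrides
and the example values are hypothetical and not part of the patch; the
version string would normally come from the mvn help:evaluate call
shown above.

```python
def ranger_overrides(ranger_src_dir, ranger_version):
    """Return the two override variables as a dict (hypothetical helper).

    Mirrors the shell exports above: the home directory is the unpacked
    admin tarball under $RANGER_SRC_DIR/target.
    """
    home = f"{ranger_src_dir}/target/ranger-{ranger_version}-admin"
    return {
        "RANGER_VERSION_OVERRIDE": ranger_version,
        "RANGER_HOME_OVERRIDE": home,
    }


# Example values are placeholders, not taken from the patch.
overrides = ranger_overrides("/src/ranger", "X.Y.Z-SNAPSHOT")
```

With these placeholder inputs, RANGER_HOME_OVERRIDE points at the
ranger-X.Y.Z-SNAPSHOT-admin directory under /src/ranger/target.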

It then suffices to execute the following to point
Impala to the locally built Ranger server before starting Impala.

1. source $IMPALA_HOME/bin/impala-config.sh

2. tar zxv -f $RANGER_SRC_DIR/target/\
   ranger-${IMPALA_RANGER_VERSION}-admin.tar.gz \
   -C $RANGER_SRC_DIR/target/

3. $IMPALA_HOME/bin/create-test-configuration.sh

4. $IMPALA_HOME/bin/create-test-configuration.sh \
   -create_ranger_policy_db

5. $IMPALA_HOME/testdata/bin/run-ranger.sh
   (run-all.sh has to be executed instead if other underlying services
   have not been started)

6. $IMPALA_HOME/testdata/bin/setup-ranger.sh

Testing:
 - Manually verified that we could point Impala to a locally built
   Apache Ranger on the master branch (with tip being
   https://github.com/apache/ranger/commit/4abb993).
 - Manually verified that with RANGER-4771.diff and
   IMPALA-12921_addendum.diff, only 3 authorization-related tests
   failed. They failed because the resource type of 'storage-type' is
   not supported in Apache Ranger yet and thus the test cases added in
   IMPALA-10436 could fail.
 - Manually verified that the log files of Apache and CDP Ranger's Admin
   server could be created under ${RANGER_LOG_DIR} after we start the
   Ranger service.
 - Verified that this patch passed the core tests when CDP Ranger is
   used.

Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Reviewed-on: http://gerrit.cloudera.org:8080/21160
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M README-build.md
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
M bin/rat_exclude_files.txt
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java
M testdata/bin/setup-ranger.sh
A testdata/cluster/ranger/IMPALA-12921_addendum.diff
A testdata/cluster/ranger/RANGER-4771.diff
A testdata/cluster/ranger/README
R testdata/cluster/ranger/setup/all_database_policy_revised.json.template
A testdata/cluster/ranger/setup/impala_group_non_owner_2.json
M testdata/cluster/ranger/setup/impala_user_non_owner.json.template
A testdata/cluster/ranger/setup/impala_user_non_owner_2.json.template
M testdata/cluster/ranger/setup/impala_user_owner.json.template
16 files changed, 385 insertions(+), 16 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21160
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF

[Impala-ASF-CR] IMPALA-12921, IMPALA-12985: Support running Impala with locally built Ranger

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21160 )

Change subject: IMPALA-12921, IMPALA-12985: Support running Impala with locally 
built Ranger
..


Patch Set 17: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21160
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Gerrit-Change-Number: 21160
Gerrit-PatchSet: 17
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 15 Jun 2024 10:25:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13120: Load failed table without need for manual invalidate

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21478 )

Change subject: IMPALA-13120: Load failed table without need for manual 
invalidate
..


Patch Set 3:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10718/


--
To view, visit http://gerrit.cloudera.org:8080/21478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia882fdd865ef716351be7f1eaf203a9fb04c1c15
Gerrit-Change-Number: 21478
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sat, 15 Jun 2024 10:20:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16358/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 15 Jun 2024 09:17:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-15 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21452

to look at the new patch set (#5).

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..

IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

With this patch IcebergDeleteBuilder checks how many probe threads
are actually blocked on the builder. Let's assume the following plan:

              UNION ALL
             /         \
            /           \
           /             \
   SCAN all           ANTI JOIN
   datafiles           /      \
   without            /        \
   deletes         SCAN        SCAN
                   datafiles   deletes
                   with deletes

In that case the UNION ALL and the two "SCAN datafiles" operators are in
the same fragment, while the builder of the ANTI JOIN is in a different
fragment. This means that "SCAN datafiles without deletes" can run in
parallel with the builder. Once that SCAN is exhausted, the UNION
ALL drains rows from "SCAN datafiles with deletes" via the ANTI JOIN
operator, which depends on the join builder output.

This means in some cases the SCAN fragments are busy, while in other
cases the SCAN fragments are blocked. It depends on how much work
they need to do, and how much work the build-side needs to do. So to
handle all cases, we dynamically check how many probe fragments are
blocked on the builder, then spin up that many threads to parallelize
the final sort.

This also works well when we have the following plan:

        ANTI JOIN
        /       \
       /         \
   SCAN          SCAN
   datafiles     deletes
   with deletes

The above plan is created when all data files have corresponding
deletes, or when we are running a simple count(*) query. In that
case all "SCAN datafiles" fragments are blocked on the builder,
so we can use that many threads to sort the build results.

A new field "ThreadCountInFinalBuild" was added, so the query profile
shows how many threads were used for the final sort in the builders.
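
The idea of sizing the sort to the number of blocked probe threads can
be sketched in Python as below. This is only an illustration under
stated assumptions, not the C++ code in iceberg-delete-builder.cc: the
function name, the dict of per-file position lists, and the cap at the
number of lists are all hypothetical. The returned value plays the role
of the "ThreadCountInFinalBuild" counter.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_delete_positions(deletes_by_file, blocked_probe_threads):
    """Sort each file's delete positions in place; return the thread count.

    deletes_by_file: dict mapping a data file path to a list of deleted
        row positions (assumed layout, for illustration only).
    blocked_probe_threads: number of probe threads currently waiting on
        the builder, which would otherwise sit idle.
    """
    # Use at least one thread, and no more threads than lists to sort
    # (the cap is a choice made for this sketch).
    thread_count = max(1, min(blocked_probe_threads, len(deletes_by_file)))
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # Each list is sorted by exactly one task, so no locking is
        # needed. list() consumes the iterator so errors are raised here.
        list(pool.map(list.sort, deletes_by_file.values()))
    return thread_count
```

For example, with three files and eight blocked probe threads the
sketch uses three threads; with no blocked probe threads it falls back
to a single thread.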

Measurements:
On a table with 1 trillion data records and 68.5 billion delete records,
the patch reduced "IcebergDeletePositionSortTimer" from ~1 minute to
8-10 seconds, in an environment with 40 executors and MT_DOP=12.

Testing:
 * e2e tests that check counter "ThreadCountInFinalBuild"

Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test
M tests/stress/test_update_stress.py
6 files changed, 160 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/5
--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-06-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10720/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 15 Jun 2024 08:53:20 +
Gerrit-HasComments: No