congxuebin opened a new issue, #1111:
URL: https://github.com/apache/cloudberry/issues/1111
### Apache Cloudberry version
Apache Cloudberry 1.0.0+1cbab9b3 build 107428 commit:1cbab9b3
### What happened
```Finalize Aggregate (cost=0.00..47054.79 rows=1 width=8) (actual
time=209034.987..209034.987 rows=1 loops=1)
-> Gather Motion 96:1 (slice1; segments: 96) (cost=0.00..47054.79
rows=1 width=8) (actual time=201932.920..209034.987 rows=96 loops=1)
-> Partial Aggregate (cost=0.00..47054.79 rows=1 width=8) (actual
time=206733.965..206733.965 rows=1 loops=1)
-> Hash Join (cost=0.00..47054.73 rows=62500 width=8)
(actual time=183447.805..202826.102 rows=5646 loops=1)
Hash Cond: (lineitem.l_partkey = part.p_partkey)
Join Filter: (lineitem.l_quantity < ((0.2 *
avg(lineitem_1.l_quantity))))
Rows Removed by Join Filter: 56616
Extra Text: (seg30) Hash chain length 1.1 avg, 5 max,
using 181372 of 1048576 buckets.
-> Seq Scan on lineitem (cost=0.00..5759.12
rows=62499894 width=17) (actual time=349.003..13061.124 rows=62547413 loops=1)
-> Hash (cost=16685.97..16685.97 rows=549597 width=12)
(actual time=183267.742..183267.742 rows=199303 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 17732kB
-> Broadcast Motion 96:96 (slice2; segments: 96)
(cost=0.00..16685.97 rows=549597 width=12) (actual time=136233.295..183209.742
rows=199303 loops=1)
-> Hash Right Join (cost=0.00..16673.58
rows=5725 width=12) (actual time=135929.695..181874.606 rows=2176 loops=1)
Hash Cond: (lineitem_1.l_partkey =
part.p_partkey)
Extra Text: (seg77) Hash chain length
1.0 avg, 2 max, using 2174 of 1048576 buckets.
-> Finalize HashAggregate
(cost=0.00..15623.04 rows=1519490 width=12) (actual time=135686.690..180901.587
rows=2087470 loops=1)
Group Key: lineitem_1.l_partkey
-> Redistribute Motion 96:96
(slice3; segments: 96) (cost=0.00..15428.43 rows=1519490 width=12) (actual
time=315.006..60572.201 rows=62510072 loops=1)
Hash Key:
lineitem_1.l_partkey
-> Streaming Partial
HashAggregate (cost=0.00..15371.35 rows=1519490 width=12) (actual
time=607.006..100379.954 rows=62436899 loops=1)
Group Key:
lineitem_1.l_partkey
-> Seq Scan on
lineitem lineitem_1 (cost=0.00..5759.12 rows=62499894 width=9) (actual
time=7.000..11556.110 rows=62547413 loops=1)
-> Hash (cost=756.18..756.18
rows=3688 width=4) (actual time=243.005..243.005 rows=2176 loops=1)
Buckets: 1048576 Batches: 1
Memory Usage: 8269kB
-> Seq Scan on part
(cost=0.00..756.18 rows=3688 width=4) (actual time=4.000..243.005 rows=2176
loops=1)
Filter: ((p_brand =
'Brand#23'::bpchar) AND (p_container = 'MED BOX'::bpchar))```
### What you think should happen instead
```Finalize Aggregate (cost=0.00..47795.39 rows=1 width=8) (actual
time=93851.887..93851.887 rows=1 loops=1)
-> Gather Motion 96:1 (slice1; segments: 96) (cost=0.00..47795.39
rows=1 width=8) (actual time=91491.864..93851.887 rows=96 loops=1)
-> Partial Aggregate (cost=0.00..47795.39 rows=1 width=8) (actual
time=92084.870..92084.870 rows=1 loops=1)
-> Hash Join (cost=0.00..47795.34 rows=62500 width=8)
(actual time=73693.124..92574.412 rows=5646 loops=1)
Hash Cond: (lineitem.l_partkey = part.p_partkey)
Join Filter: (lineitem.l_quantity < ((0.2 *
avg(lineitem_1.l_quantity))))
Rows Removed by Join Filter: 56616
Extra Text: (seg30) Hash chain length 1.1 avg, 5 max,
using 181372 of 1048576 buckets.
-> Seq Scan on lineitem (cost=0.00..5759.12
rows=62499894 width=17) (actual time=6.000..8541.081 rows=62547413 loops=1)
-> Hash (cost=17372.13..17372.13 rows=664851 width=12)
(actual time=73676.696..73676.696 rows=199303 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 17732kB
-> Broadcast Motion 96:96 (slice2; segments: 96)
(cost=0.00..17372.13 rows=664851 width=12) (actual time=43455.411..73616.696
rows=199303 loops=1)
-> Hash Right Join (cost=0.00..17357.15
rows=6926 width=12) (actual time=43151.855..71598.419 rows=2176 loops=1)
Hash Cond: (lineitem_1.l_partkey =
part.p_partkey)
Extra Text: (seg77) Hash chain length
1.0 avg, 2 max, using 2174 of 1048576 buckets.
-> HashAggregate
(cost=0.00..16165.13 rows=2251934 width=12) (actual time=42906.851..70530.398
rows=2087470 loops=1)
Group Key: lineitem_1.l_partkey
-> Redistribute Motion 96:96
(slice3; segments: 96) (cost=0.00..8565.99 rows=62499894 width=9) (actual
time=0.000..12986.257 rows=62620193 loops=1)
Hash Key:
lineitem_1.l_partkey
-> Seq Scan on lineitem
lineitem_1 (cost=0.00..5759.12 rows=62499894 width=9) (actual
time=8.000..16519.156 rows=62547413 loops=1)
-> Hash (cost=756.17..756.17
rows=3581 width=4) (actual time=241.005..241.005 rows=2176 loops=1)
Buckets: 1048576 Batches: 1
Memory Usage: 8269kB
-> Seq Scan on part
(cost=0.00..756.17 rows=3581 width=4) (actual time=4.000..241.005 rows=2176
loops=1)
Filter: ((p_brand =
'Brand#23'::bpchar) AND (p_container = 'MED BOX'::bpchar))
```
### How to reproduce
Run TPC-H with 1000G with following GUCs
20250518:11:08:30:1249017 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c optimizer -v on'
20250518:11:08:31:1250098 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c optimizer_analyze_root_partition -v on
--masteronly'
20250518:11:08:31:1250194 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_autostats_mode -v none --masteronly'
20250518:11:08:34:1250290 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c default_statistics_target -v 100'
20250518:11:08:36:1251354 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_vmem_protect_limit -v 15000'
20250518:11:08:38:1252418 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c max_statement_mem -v 14GB'
20250518:11:08:40:1253482 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c statement_mem -v 1GB'
20250518:11:08:42:1254547 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_fts_probe_timeout -v 600'
20250518:11:08:44:1255611 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_fts_probe_interval -v 600'
20250518:11:08:46:1256723 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_segment_connect_timeout -v 1800 -m 1800'
20250518:11:08:48:1257788 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_interconnect_queue_depth -v 10'
20250518:11:08:50:1258852 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_interconnect_snd_queue_depth -v 8'
20250518:11:08:52:1259917 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c gp_interconnect_min_retries_before_timeout -v
100'
20250518:11:08:54:1260980 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c autovacuum -v off'
20250518:11:08:57:1262044 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c max_connections -v 500 -m 100'
20250518:11:08:59:1263111 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c max_prepared_transactions -v 100'
20250518:11:09:01:1264174 gpconfig:systest106:gpadmin-[INFO]:-completed
successfully with parameters '-c work_mem -v 512MB'
### Operating System
Oracle Linux 9
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes, I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]